Article

Design and Implementation of a Decentralized Node-Level Battery Management System Chip Based on Deep Neural Network Algorithms

1
Department of Electrical Engineering, National Central University, Taoyuan 32001, Taiwan
2
Department of Electronics Engineering, Chang Gung University, Taoyuan 333323, Taiwan
*
Author to whom correspondence should be addressed.
Electronics 2026, 15(2), 296; https://doi.org/10.3390/electronics15020296
Submission received: 30 November 2025 / Revised: 29 December 2025 / Accepted: 6 January 2026 / Published: 9 January 2026
(This article belongs to the Special Issue New Insights in Power Electronics: Prospects and Challenges)

Abstract

As Battery Management Systems (BMSs) continue to expand in both scale and capacity, conventional state-of-charge (SOC) estimation methods—such as Coulomb counting and model-based observers—face increasing challenges in meeting the requirements for cell-level precision, scalability, and adaptability under aging and operating variability. To address these limitations, this study integrates a Deep Neural Network (DNN)–based estimation framework into a node-level BMS architecture, enabling edge-side computation at each individual battery cell. The proposed architecture adopts a decentralized node-level structure with distributed parameter synchronization, in which each BMS node independently performs SOC estimation using shared model parameters. Global battery characteristics are learned through offline training and subsequently synchronized to all nodes, ensuring estimation consistency across large battery arrays while avoiding centralized online computation. This design enhances system scalability and deployment flexibility, particularly in high-voltage battery strings with isolated measurement requirements. The proposed DNN framework consists of two functional modules that share an identical network architecture: an offline training module and a real-time estimation module. The training module operates on high-performance computing platforms—such as in-vehicle microcontrollers during idle periods or charging-station servers—using historical charge–discharge data to extract and update battery characteristic parameters. These parameters are then transferred to the real-time estimation chip for adaptive SOC inference. The decentralized BMS node chip integrates preprocessing circuits, a momentum-based optimizer, a first-derivative sigmoid unit, and a weight update module. The design is implemented using the TSMC 40 nm CMOS process and verified on a Xilinx Virtex-5 FPGA.
Experimental results using real BMW i3 battery data demonstrate a Root Mean Square Error (RMSE) of 1.853%, with an estimation error range of [4.324%, −4.346%].

1. Introduction

With the global transition toward sustainable energy and the rapid development of the Electric Vehicle (EV) industry, the importance of electrical energy management technologies has become increasingly prominent. As the share of renewable energy continues to rise each year, how to effectively store and dispatch electrical energy has emerged as a central challenge in modern energy systems. Among various technologies, the BMS plays a pivotal role in ensuring system safety, performance, and longevity. The primary functions of a BMS are to monitor, protect, and optimize the operating conditions of battery packs, ensuring stable and efficient charge–discharge processes, preventing overcharging, over-discharging, and overheating, while simultaneously extending battery lifetime and enhancing energy utilization. Lithium-ion batteries—currently the most widely adopted energy-storage technology—offer several advantages, including high energy density, long cycle life, low self-discharge rate, and the absence of memory effects. These characteristics make them well suited for energy storage systems and electric vehicles [1,2]. To meet diverse voltage and current requirements, individual battery cells are typically configured in series–parallel arrangements to form modularized battery packs. This approach increases output power while providing scalability and facilitating maintenance. As system sizes expand and the number of battery cells grows to several hundred or even thousands, the BMS must go beyond basic measurements of voltage, current, and temperature. It must also perform complex state estimation tasks, such as State of Charge (SOC), State of Health (SOH), and State of Power (SOP) calculations [3,4]. This trend significantly increases the communication, computational, and data-management workload of the system, while also imposing more stringent requirements on the scalability, fault tolerance, and accuracy of BMS architectures.
Current BMS architectures can be categorized into three primary topologies based on control hierarchy and system configuration: (a) centralized, (b) distributed, and (c) decentralized architectures, as illustrated in Figure 1 [5]. In a centralized BMS, a single central controller is responsible for monitoring and managing all battery modules. This architecture offers advantages such as low cost and a simple communication structure. However, in high-voltage and large-capacity systems, it is susceptible to transmission delays and single-point-of-failure (SPOF) issues, resulting in limited reliability. A distributed BMS deploys slave controllers at the module level, allowing each module to independently perform measurement and data transmission, while a master controller aggregates and processes the information. This architecture provides improved scalability and flexibility, and alleviates communication bandwidth requirements. Nevertheless, due to constraints in component voltage ratings and microcontroller (MCU) capabilities, achieving high-precision monitoring at the individual cell level remains challenging.
In recent years, the decentralized BMS architecture has emerged as a promising direction for next-generation battery management systems. This architecture integrates the advantages of both centralized and distributed approaches and adopts a one-to-one node design, in which each individual battery cell is equipped with an independent monitoring and control node. Each node is capable of autonomously performing voltage, current, and temperature measurements, while also incorporating built-in state-estimation functions. Consequently, real-time decision-making can be executed without relying on a central controller. This design significantly enhances system performance in terms of communication latency and reliability. Even if certain nodes fail, the remaining nodes can continue operating normally, demonstrating exceptional fault-tolerance and system robustness. Moreover, the modularity and autonomy of decentralized architectures make them highly suitable for future large-scale energy-storage systems and electric-vehicle platforms, offering excellent scalability and ease of maintenance [6,7,8,9,10,11,12].
As energy-related applications continue to diversify, traditional centralized and distributed BMS architectures are becoming increasingly inadequate for handling the requirements of large-scale battery arrays, particularly in real-time monitoring, data synchronization, and high-reliability computation. Owing to its distributed intelligent nodes, real-time responsiveness, and strong fault-tolerant characteristics, the decentralized BMS architecture has become a key direction for future system development. Therefore, this work adopts the decentralized architecture as its foundation and proposes a battery-node chip that integrates independent sensing and intelligent state-estimation functions. The objective is to enhance system accuracy, scalability, and reliability while simultaneously reducing power consumption and hardware complexity, thereby providing a viable technical solution for next-generation smart energy-storage systems and EV energy management platforms.
For front-end signal acquisition, the proposed node chip employs a low-power continuous-time delta–sigma modulator analog-to-digital converter (CTDSM ADC) to measure battery voltage and current with high precision. The converter inherently provides anti-aliasing and low-noise performance. In the data-processing core, this research adopts a “node-level computing” approach by incorporating a deep-learning architecture with on-chip weight-update capability. Historical charge/discharge data are first processed through an offline training procedure for feature extraction and state identification. Only the updated parameters are transmitted back to the deep-learning hardware, enabling the chip to maintain the latest battery state features and perform real-time estimation accordingly. This adaptive mechanism supports long-term, high-accuracy SOC estimation while significantly reducing communication bandwidth requirements and easing the computational burden on the central controller. It also mitigates estimation errors caused by communication delays or interference in large-scale battery systems. Furthermore, low-speed serial buses or wireless protocols may be used for inter-node synchronization and cooperative control, allowing the system to maintain operational integrity even in the event of partial node failures. The proposed architecture effectively adapts to battery aging, temperature drift, and varying usage conditions, thereby enhancing system lifetime and overall energy-management efficiency.
In this paper, a deep neural network (DNN) model is adopted as the core data-processing engine at the battery node chip to predict the state of charge (SOC) of lithium-ion batteries in electric vehicles. In contrast to most existing studies on deep-learning-based battery state estimation, which primarily focus on inference performance at the software level or in offline environments and demonstrate high estimation accuracy under ideal computational resource conditions [13,14,15], relatively limited attention has been paid to hardware circuit implementations and system-level considerations under the practical constraints of node-level computation, including limited processing capability, memory capacity, and power consumption. To address this research gap, this work takes chip-level circuit architecture and hardware constraints as fundamental design premises, and accordingly plans and implements a DNN algorithm architecture suitable for deployment at battery nodes. The feasibility and robustness of real-time SOC inference under constrained hardware resources are systematically evaluated and statistically analyzed. Furthermore, by incorporating a feature-learning mechanism based on historical operational data, the proposed method is able to effectively capture the temporal evolution of battery states, thereby improving the stability of long-term SOC estimation.
The proposed physical inference chip architecture includes data preprocessing circuits, a momentum optimizer, a first-order derivative unit for the sigmoid function, and a weight-update module. The design is implemented using a TSMC 40 nm CMOS process and verified on a Xilinx Virtex-5 XC5VLX330 FPGA platform, with synthesis and physical implementation carried out using design tools such as Design Compiler and IC Compiler. Finally, SOC estimation is evaluated using real charge–discharge data from a BMW i3 electric vehicle, achieving a root-mean-square error (RMSE) of 1.853%, with an error range of [4.324%, −4.346%].

2. Decentralized BMS Node-Chip Architecture

2.1. Node Chip Structure

The decentralized node-level BMS architecture provides high system flexibility and enhanced fault tolerance by assigning a dedicated BMS node to each individual battery cell. This one-to-one mapping enables fine-grained cell-level monitoring and supports scalable deployment in large battery strings. However, fully peer-to-peer decentralized systems typically introduce significant challenges related to system complexity, power consumption, communication reliability, and cost, particularly when strict consistency across cells is required.
In practical BMS operation, maintaining consistency in voltage, state of charge, and lifetime across all cells is essential. Excessive reliance on real-time inter-node cooperation or negotiation may increase vulnerability to communication latency, data loss, or electromagnetic interference, which can lead to desynchronization and degraded estimation accuracy in large-scale systems. To mitigate these issues, this work adopts a decentralized node-level BMS architecture with distributed parameter synchronization, where each node performs autonomous battery state estimation using shared model parameters, rather than relying on continuous inter-node collaboration.
Based on this design philosophy, a node chip with autonomous computation capability is proposed. The overall architecture of the proposed node chip is illustrated in Figure 2. The node integrates a low-power CTDSM ADC, a signal-isolation interface, and a battery state estimator implemented using a DNN. This architecture enables independent cell-level sensing and inference while preserving system-wide estimation consistency and scalability.

2.2. System-Level Flow

The node-level computing architecture proposed in this work consists of two main components: an offline parameter-training module (Training Part) and an on-chip real-time estimation module (Real-Time Estimate Part). The parameter-training module learns and updates feature parameters based on historical charge–discharge data of the battery. The resulting weight parameters are then transmitted back to the real-time estimation module, ensuring that its input weights and computational parameters remain synchronized with the latest battery conditions. This enables the real-time estimation module to accurately reflect the current state of the battery pack at all times [16,17,18]. The overall training flow is illustrated in Figure 3.
Figure 4 illustrates a concrete use-case scenario and the corresponding operation timeline of the proposed decentralized BMS. The system alternates between a hardware inference phase during vehicle operation and a software training phase during charging or idle periods. By maintaining identical neural network architectures in both environments, trained model parameters can be directly deployed to node chips without architectural mismatch.

2.3. Advantages

A deep-learning-based SOC estimation architecture can implicitly model the nonlinear behavior of batteries under varying operating conditions and aging states by learning from historical battery operation data. As a result, estimation errors caused by model mismatch and parameter drift can be reduced within specific operating ranges, while the accumulation of errors over time is effectively suppressed. Compared with conventional approaches based on equivalent circuit models or observer-based methods, such data-driven techniques offer superior model scalability. When future advancements in battery technologies (e.g., solid-state batteries) lead to changes in charge–discharge characteristics, the model can be adapted through retraining or incremental learning mechanisms without the need for complete model reconstruction.
Nevertheless, deep-learning-based SOC estimation involves trade-offs in terms of computational complexity, data dependency, and model interpretability. To mitigate the impact on node-level hardware resources and power consumption, this study employs the output of a CT ΔΣ ADC (CTDSM ADC) as the input to the neural network and adopts a structurally simplified model to support real-time inference. In addition, continuous model updates based on long-term accumulated operational data further enhance the stability of SOC estimation under battery aging conditions.

3. Low-Power Continuous-Time Delta-Sigma Modulator Analog-to-Digital Converter (CTDSM ADC)

To ensure accurate voltage and current measurement within the decentralized BMS nodes while maintaining low power consumption and high scalability, this work adopts a previously published low-power CTDSM ADC as the front-end sensing circuit of the node chip. The converter features high resolution, low noise, and low power consumption, enabling precise conversion of battery analog signals into digital inputs suitable for the neural-network-based estimator (NN-E), while achieving both energy efficiency and hardware integration. Furthermore, the input–output specifications of the ADC are planned according to the estimation accuracy requirements of the proposed NN-E. The related research results have been published in [5].

3.1. Role Within the Decentralized Node Architecture

In the decentralized BMS architecture proposed in this work, each battery node is equipped with independent measurement, estimation, and communication modules. The CTDSM ADC serves as the front-end sensing subsystem of the node, converting the battery voltage and current signals into high-resolution digital data. Its architecture is shown in Figure 5, and the digitized outputs are subsequently delivered to the NN-E for SOC prediction.
Because each node operates independently, its measurement accuracy and power consumption directly influence the overall estimation performance and scalability of the decentralized system. The CT-ΔΣ architecture leverages continuous-time integrators and noise-shaping techniques, providing inherent anti-aliasing capability and low-noise characteristics. This not only simplifies the front-end filtering requirements and reduces overall circuit complexity but also enables stable performance under low-voltage operation.

3.2. Overview of the CTDSM Architecture

The ADC adopted in this study employs a second-order biquad architecture, which consists of a continuous-time integrator, a 1-bit quantizer, and a non-return-to-zero (NRZ) digital feedback DAC, as illustrated in Figure 6. This structure achieves an optimal balance between area and power consumption while maintaining excellent dynamic range and signal-to-noise ratio (SNR) for low-frequency measurement applications (10 kHz).
To achieve high energy efficiency, the design adopts a current-reuse operational amplifier (OPA), which utilizes both PMOS and NMOS conduction currents to attain twice the transconductance efficiency of conventional architectures. This structure enables the amplifier to operate under a low supply voltage of 1.2 V while effectively suppressing 1/f noise and thermal noise. In addition, the ADC employs separated analog and digital ground configurations, and an on-chip bandgap reference (BGR) is used to provide a stable bias voltage, ensuring robust performance under temperature and process variations. These specifications are sufficient to support the resolution requirements (>12 bits) of the downstream NN-E model for accurate SOC estimation, while maintaining low power consumption and high reliability.

3.3. Integration with the NN-E Module

The output of the CTDSM ADC serves as the primary input data to the NN-E module in the decentralized battery node. The input to the neural network is taken from the ADC output after decimation filtering and down-sampling, rather than from the raw modulator bitstream. Following digitization, the data are processed through moving-average and normalization circuits for preprocessing, and are subsequently fed into the neural network estimator for SOC computation. As a result, the ADC provides stable, high-resolution outputs with sufficient precision to enable the NN-E model to effectively extract time-series features while mitigating the impact of quantization errors on estimation accuracy. This low-power, high-accuracy front-end sensing module can be replicated across multiple nodes in parallel without incurring significant increases in power consumption or chip area, thereby constituting an indispensable foundation of the overall decentralized BMS node chip architecture.

4. Deep-Learning-Based Neural Network Estimator (NN-E)

In this work, a deep neural network (DNN) is employed as the core of the node-level computational engine for battery state estimation. The DNN offers strong nonlinear function approximation capability, automated feature-extraction ability, and effective handling of high-dimensional and heterogeneous data, enabling the extraction of key battery features and the construction of accurate battery models. Using real charge–discharge data from lithium-ion batteries, the DNN performs feature extraction and subsequently estimates the battery state. Integrated into the decentralized battery management system, the proposed DNN-based estimator serves as the core computational unit for node-level battery state estimation.

4.1. Neural Network Architecture Planning and Design

A deep neural network (DNN) is composed of a multi-layer structure that typically includes an input layer, one or more hidden layers, and an output layer. Each layer consists of multiple neurons interconnected through adjustable weights, which propagate input signals sequentially to subsequent layers. During this process, the data undergo weighted summation followed by nonlinear transformation through an activation function [19], enabling the extraction of feature parameters and the construction of the battery model. The DNN architecture used in this work is illustrated in Figure 7.
Input Layer: The input layer receives the raw feature data and converts them into vector representations suitable for subsequent network computations. Each feature dimension corresponds to a single neuron in the input layer.
Hidden Layers: The hidden layers serve as the core structure for high-level feature extraction and abstraction. Neurons in each hidden layer are connected to those in the preceding layer through trainable weights, forming nonlinear mappings under the operation of activation functions. When two or more hidden layers are present, the network is considered a deep model, offering enhanced representational capacity and superior function-approximation ability.
Output Layer: The output layer generates the final model output, with the number of neurons determined by the task type. For instance, image-generation tasks require a large number of neurons to reconstruct pixel information, whereas regression tasks require only a small number of neurons to represent the output value.
In a DNN, every neuron connection is associated with a trainable weight, and each neuron also includes a bias term to adjust the baseline of the input signal. Activation functions introduce nonlinearity to the model, enabling it to learn complex data distributions and decision boundaries.

4.2. Training Process

Training a DNN model is an iterative multi-step process aimed at learning optimal weights from large amounts of training data so that the model can accurately predict unseen inputs [18]. The complete training workflow, illustrated in Figure 8, is described as follows:
Selection of Training Dataset: Relevant data containing input features and corresponding ground-truth values are collected and organized. The dataset is divided into training, validation, and testing subsets to ensure that the DNN can generalize to unseen data and that its performance can be properly evaluated.
Data Preprocessing and Input Shuffling: Based on battery characteristics, the input data are first processed using a moving-average filter, followed by normalization to scale all features into the [0, 1] range. The normalized data are then shuffled before being fed into the neural network.
Network Architecture Design: The neural network structure is determined by selecting the number of hidden layers, the number of neurons per layer, and other hyperparameters, including the optimizer. All network weights are initialized before training begins.
Hyperparameter Tuning: Model performance is evaluated using the validation set, and hyperparameters are adjusted accordingly to improve accuracy and convergence behavior.
Model Training: Through multiple iterations, the optimizer updates the model weights to minimize the loss function, gradually improving predictive performance.
Model Validation: After each training epoch, the model’s performance is evaluated on the validation set. Early stopping is applied when the validation performance reaches a satisfactory level. If accuracy remains insufficient after all iterations, the network structure must be redesigned.
Model Testing: Once the model is fully trained and validated, the test dataset is used to simulate real-world application scenarios and evaluate the final performance of the DNN model.
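The dataset-selection step above can be sketched as a simple split routine. The 80/10/10 ratios and the shuffling seed here are illustrative assumptions for the sketch, not values prescribed by the paper (which uses an 80/20 train/validation split for the FUDS data):

```python
import numpy as np

def split_dataset(x, y, train=0.8, val=0.1, seed=0):
    """Shuffle once, then slice into training/validation/test subsets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_tr = int(train * len(x))
    n_va = int(val * len(x))
    tr, va, te = idx[:n_tr], idx[n_tr:n_tr + n_va], idx[n_tr + n_va:]
    return (x[tr], y[tr]), (x[va], y[va]), (x[te], y[te])
```

Splitting after a single shuffle ensures the three subsets are disjoint and drawn from the same distribution.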

Construction of the DNN Model Architecture

In this study, the DNN model is constructed with the objective of improving battery state estimation accuracy while ensuring feasibility for hardware implementation. The offline parameter-training module is developed in Python 3.11, using the FUDS and DST driving-cycle datasets from CALCE to perform initial model training. Through iterative training and convergence, the DNN progressively refines its estimation capability, allowing the predicted battery states to closely match the actual measured states. The hyperparameters of the DNN architecture are determined based on the resulting RMSE performance. The overall training workflow is illustrated in Figure 9.
The offline parameter-training module is implemented in Python using the TensorFlow r2.12 framework. For the FUDS dataset, 80% of the data is used as the training set, and the remaining 20% is used as the validation set. The model inputs consist of voltage and current measurements. The number of hidden layers, the number of neurons per layer, and the optimizer are treated as variable hyperparameters. Early stopping is applied to prevent overfitting and ensure optimal model generalization [20,21,22]. Based on the training results, the final hyperparameter specifications used in the proposed DNN model are selected, as summarized in Table 1.
A relatively large learning rate and a small batch size of 1 are used to accelerate model convergence, while the training epochs are set sufficiently high to allow the model to fully learn the underlying patterns. The ReLU function is selected as the activation function for the hidden layers. Since the SOC value ranges from 0 to 1, the sigmoid activation function is used in the output layer to match this value range. Other hyperparameters—including the number of neurons, the number of hidden layers, and the choice of optimizer—are treated as variable parameters. Through multiple iterations and repeated simulations, the optimal configuration is determined based on the lowest RMSE.
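As a rough illustration of the resulting topology (a 22-dimensional feature input, two ReLU hidden layers of 8 neurons, and a single sigmoid output bounded to [0, 1] like SOC), a NumPy forward pass might look as follows. The random weights here merely stand in for the trained parameters that would be uploaded to the node chip:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soc_forward(x, p):
    """22-dim feature vector -> two ReLU hidden layers (8 neurons each)
    -> one sigmoid output, so the estimate is bounded to [0, 1] like SOC."""
    h = relu(x @ p["W1"] + p["b1"])
    h = relu(h @ p["W2"] + p["b2"])
    return sigmoid(h @ p["W3"] + p["b3"])

# Random weights stand in for the trained parameters uploaded to the chip.
rng = np.random.default_rng(0)
p = {"W1": rng.normal(size=(22, 8)), "b1": np.zeros(8),
     "W2": rng.normal(size=(8, 8)),  "b2": np.zeros(8),
     "W3": rng.normal(size=(8, 1)),  "b3": np.zeros(1)}
soc = soc_forward(rng.normal(size=(1, 22)), p)
```

The sigmoid output layer guarantees the estimate never leaves the physically meaningful SOC range, regardless of the hidden-layer activations.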

4.3. Architecture Determination and Model Performance

Through an iterative training process, the hyperparameters and input dimensions at each level of the overall neural-network-based estimator are defined. The resulting configurations are summarized in Table 2, Table 3, Table 4 and Table 5.
Table 2, Table 3, Table 4 and Table 5 present the root-mean-square error (RMSE) and error range of SOC estimation obtained using three different optimizers under varying numbers of hidden layers and neurons. Based on these results, the momentum optimizer is selected as the optimal optimization scheme. The corresponding deep neural network architecture is configured with 8 neurons per layer and 2 hidden layers. Under this configuration, the software-based simulation achieves the best performance, with an RMSE of 1.37% and an error range of [3.67%, −6.46%].
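The on-chip weight-update path built around this choice (a momentum optimizer plus a first-derivative sigmoid unit) can be sketched in a few lines. The learning rate and momentum coefficient below are placeholder values for illustration, not the chip's actual settings:

```python
import numpy as np

def momentum_step(w, grad, v, lr=0.1, beta=0.9):
    """Classical momentum update: v <- beta*v - lr*grad, then w <- w + v."""
    v = beta * v - lr * grad
    return w + v, v

def d_sigmoid(y):
    """Sigmoid derivative expressed via its own output y = sigma(x): y*(1-y).
    This form needs only a multiply and a subtract, which is why it maps
    cheaply onto a dedicated hardware unit."""
    return y * (1.0 - y)
```

Expressing the derivative through the already-computed activation output avoids re-evaluating the exponential during backpropagation, a common hardware-friendly simplification.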
Furthermore, simulation results indicate that by expanding the input dimensionality to 22 features and adopting a moving-average window size of MA = 1024, the RMSE can be further reduced to 0.44%, with the corresponding error range narrowed to [1.880%, −1.499%], as summarized in Table 6. The software-based inference results under this configuration are illustrated in Figure 10.
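For reference, the RMSE and signed error range reported throughout this paper can be computed from a predicted SOC trace as follows (a straightforward sketch, not the authors' evaluation script):

```python
import numpy as np

def soc_metrics(soc_true, soc_pred):
    """Return RMSE and the signed error range (max, min), in percent of SOC."""
    err = 100.0 * (np.asarray(soc_pred, float) - np.asarray(soc_true, float))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    return rmse, (float(err.max()), float(err.min()))
```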

5. Battery Data and Feature Processing

To ensure that the simulation and experimental results closely reflect real vehicle operating conditions, this study incorporates open-source battery datasets from the Center for Advanced Life Cycle Engineering (CALCE) at the University of Maryland [23], as well as real-world charge–discharge data collected from electric vehicles by the U.S. Department of Energy’s Argonne National Laboratory [24]. These datasets are used to enhance feature training for the neural-network-based estimator and to improve the robustness of the model.

5.1. Data Sources

The battery data adopted from CALCE in this study were obtained by subjecting lithium-ion batteries to various driving-cycle tests. Driving cycles consist of sequences of data points that represent vehicle speed profiles and are designed to simulate real-world driving scenarios. Different driving cycles emulate different driving conditions, with commonly used examples including the Federal Urban Driving Schedule (FUDS), as shown in Figure 11, the Dynamic Stress Test (DST), the Urban Dynamometer Driving Schedule (UDDS), and the US06 driving cycle. In addition, this study utilizes real-world battery measurements collected from a BMW i3 2014 electric vehicle (EV) by Argonne National Laboratory. Figure 12 shows the battery voltage, current, and SOC data used for the offline parameter-training module. Figure 13 presents the voltage, current, and SOC data used for validating the updated feature parameters within the on-chip real-time estimation module, enabling evaluation of the chip’s estimation accuracy.

5.2. Data Preprocessing Workflow

Data preprocessing is an essential step to ensure proper operation of the decentralized battery management system. Its primary purpose is to determine the valid charge–discharge operating intervals and to verify the sampling windows of the battery data. Based on the accuracy requirements of the neural-network estimator and the characteristics of the front-end CTDSM ADC output, the battery measurements undergo moving-average filtering and normalization. In addition, during the parameter-training phase, data shuffling is applied to prevent the model from overfitting to specific ranges of the dataset and to ensure that feature extraction remains robust and unbiased.

5.2.1. Data Preprocessing Workflow and Formatting

The moving average method is a technique used to smooth time-series data, helping reduce abrupt fluctuations and reveal the underlying trend. Applying a moving average filter to battery current and voltage data effectively suppresses sudden variations and measurement noise. This makes it easier to observe overall behavioral trends, such as gradual changes in current or voltage profiles. Given the large variability typically present in battery datasets, this study adopts the moving average method to smooth rapidly changing measurements. By applying the moving average filter, the input voltage and current signals become more stable, reducing excessive fluctuations while preserving essential signal characteristics associated with battery behavior. The moving average formula is expressed in (1):
X_mov,t = (X_t + X_(t−1) + … + X_(t−n+1)) / n
where:
  • X_mov,t is the moving-average value at time t;
  • X_t is the input sample at time t;
  • n is the window size (the number of averaged samples).
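A minimal implementation of Equation (1) might look like this. At start-up, before n samples have accumulated, this sketch simply averages the samples available so far (one of several reasonable edge-handling choices, not one specified by the paper):

```python
import numpy as np

def moving_average(x, n):
    """Causal moving average per Eq. (1): mean of the most recent n samples.
    For t < n-1, only the samples observed so far are averaged."""
    out = np.empty(len(x))
    for t in range(len(x)):
        lo = max(0, t - n + 1)          # window start, clipped at the origin
        out[t] = np.mean(x[lo:t + 1])   # average over the last <= n samples
    return out
```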

5.2.2. Normalization

Normalization is applied to the input data features to ensure a standardized distribution, which helps prevent certain features from having excessively large or small values and thereby improves the effectiveness of gradient-based optimization methods. In this study, a min–max normalization scheme is employed. However, considering the correspondence between the range of the training data and the physical characteristics of the battery during inference, discrepancies may arise due to variations in driving environments or user driving behaviors, which can lead to different voltage and current operating ranges. Such mismatches between the normalization ranges of the training and inference data may adversely affect inference accuracy. Therefore, the normalization bounds are defined based on the physical charge–discharge characteristics of the battery: the voltage range is determined by the charge and discharge cutoff voltages, and the current range is defined by the maximum charge and discharge currents. Min–max normalization maps the original data X to a fixed range of [0, 1] while preserving the relative ordering of the original values. The corresponding formulation is given in (2):
X_{normalize} = (X − X_{min}) / (X_{max} − X_{min})
where,
  • X_{max} is the battery charging cutoff voltage (for the voltage channel) or the maximum charging current (for the current channel).
  • X_{min} is the battery discharge cutoff voltage (for the voltage channel) or the maximum discharge current (for the current channel).
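Eq. (2) with physically fixed bounds can be sketched as follows (the cutoff values 2.7 V and 4.2 V are hypothetical, for illustration only):

```python
def normalize(x, x_min, x_max):
    """Min-max normalization per Eq. (2), with bounds fixed by the
    battery's physical limits (cutoff voltages / maximum currents)
    rather than by the statistics of any one dataset."""
    return (x - x_min) / (x_max - x_min)

# Hypothetical bounds: a cell with a 2.7 V discharge cutoff and a
# 4.2 V charge cutoff maps any in-range voltage onto [0, 1].
v_norm = normalize(3.7, 2.7, 4.2)
```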

5.2.3. Data Shuffling

Data shuffling refers to the process of randomly permuting the entire training dataset prior to model training. The primary purpose of this step is to prevent the model from relying too heavily on data from a specific range, thereby reducing the risk of overfitting. By ensuring that the input data are evenly distributed across their domain, shuffling allows the model to learn uniformly from different segments of the dataset throughout the training process. This prevents the network weights from being biased toward particular data regions and ultimately enhances the robustness and generalization performance of the model.
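The key practical detail is that the inputs and their SOC labels must be permuted together. A minimal NumPy sketch (names and the fixed seed are ours):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

def shuffle_pairs(X, y):
    """Shuffle features and labels with the same permutation so each
    (input, SOC) pair stays aligned after reordering."""
    perm = rng.permutation(len(X))
    return X[perm], y[perm]
```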

6. Implementation and Verification of the Neural Network Estimation Chip

In this study, the real-time estimation component of the deep-learning architecture is implemented as a chip using the TSMC 40 nm CMOS process. The deep neural network—comprising two hidden layers with eight neurons each—is fully realized in hardware, and the trained parameters are subsequently uploaded to the chip to enable on-chip inference. Together with the front-end data preprocessing circuitry, the chip expands the input voltage and current signals into a 22-dimensional feature vector, forming the node-level computational core of the decentralized battery management system. The battery state estimation circuitry is illustrated in Figure 14, and the corresponding input and output signal definitions are summarized in Table 6.

6.1. Design Flow

Considering the combined requirements of estimation accuracy, performance, and chip cost, the hardware architecture is first verified through Python-based simulation. The circuit functions and behaviors are then described in Verilog HDL, followed by simulation-based functional verification to confirm design feasibility. Finally, a cell-based ASIC design flow is carried out to implement the chip according to the specified requirements. The overall chip design flow is illustrated in Figure 15.

6.2. Circuit Architecture

The overall circuit architecture consists of the DNN computation core and the data preprocessing module. The internal components include the DNN structural blocks, ReLU activation circuitry, and a sigmoid function approximation circuit. In addition, the preprocessing module incorporates coefficient storage for the moving-average filter, a normalization circuit, and other auxiliary processing elements required for feature preparation.

6.2.1. DNN Circuit Architecture

After the overall architecture and computational scale of the DNN-based estimator core are planned in Python, the circuit implementation of each layer in the DNN is organized as shown in Figure 14.
Figure 16a shows the neuron circuit of the first hidden layer in the DNN test model. This layer receives a 22-dimensional input; therefore, 22 multipliers are used to multiply each input feature with its corresponding weight. The weighted results are then summed together and added with a bias term. As a result, the summation block contains 23 inputs and is implemented using a carry-save adder (CSA). The output of the summation is passed through a ReLU activation circuit. The first hidden layer consists of eight such neurons, whose outputs are forwarded to the input of the second hidden layer.
Figure 16b illustrates the neuron circuit of the second hidden layer. This layer receives an 8-dimensional input, requiring 8 multipliers followed by summation and a bias addition, resulting in a summation block with 9 inputs implemented using a CSA. The output is again processed by a ReLU activation circuit. The second hidden layer also consists of eight such neurons, and its outputs are fed into the input of the output layer.
Figure 16c presents the neuron circuit of the output layer. Similar to the second hidden layer, this layer receives an 8-dimensional input, processed through 8 multipliers and a summation block with 9 inputs implemented using a CSA, followed by bias addition. The output of this summation is then passed through a sigmoid activation circuit. The output layer contains a single neuron, and its output y represents the final SOC estimation produced by the DNN test model.
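The three layer circuits above together realize a 22→8→8→1 multiply–accumulate pipeline. A behavioral NumPy sketch of the same forward pass (function and parameter names are ours; the weights below are placeholders, not the trained parameters):

```python
import numpy as np

def relu(v):
    return np.maximum(0.0, v)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def dnn_forward(x, W1, b1, W2, b2, W3, b3):
    """22 -> 8 -> 8 -> 1 forward pass mirroring Figure 16:
    multiply-accumulate plus bias per neuron, ReLU in the hidden
    layers, sigmoid at the single output neuron."""
    h1 = relu(W1 @ x + b1)        # 8 neurons, 22 weighted inputs each
    h2 = relu(W2 @ h1 + b2)       # 8 neurons, 8 weighted inputs each
    return sigmoid(W3 @ h2 + b3)  # 1 neuron -> SOC estimate in [0, 1]
```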
ReLU Circuit
Figure 17 shows the ReLU circuit. The most significant bit (MSB) of the input is used as the selection signal for a 2-to-1 multiplexer (MUX). When the input is negative (in < 0), the MSB is 1, and the selection signal causes the MUX to output 0. Conversely, when the input is non-negative (in ≥ 0), the MSB is 0, and the MUX outputs the original input value. This implements the ReLU function max(0, in).
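The MSB-as-select trick can be mimicked in software on two's-complement words (a bit-level illustration under an assumed 16-bit word width, not the chip's netlist):

```python
def relu_twos_complement(word, bits=16):
    """ReLU on a two's-complement word: the MSB acts as the 2-to-1
    mux select, so a set sign bit forces the output to zero."""
    msb = (word >> (bits - 1)) & 1
    return 0 if msb else word
```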
Sigmoid Function Circuit Architecture
In this study, the AS-SIM algorithm described in [25] is adopted to obtain the parameters and corresponding segmentation ranges of a piecewise linear (PWL) approximation of the sigmoid function. When the input x ≥ 8, the output of the sigmoid function approaches its maximum value of 1. Moreover, the sigmoid function is point-symmetric about (0, 0.5), satisfying f(−x) = 1 − f(x), which implies that the negative-domain output can be derived directly from the positive-domain function. As a result, the required PWL approximation range can be reduced to [0, 8]. The AS-SIM algorithm utilizes an adaptive step size and iteratively refines the approximation until the error falls below a specified epsilon threshold.
In this work, the AS-SIM algorithm is executed with a maximum iteration count of iteration_max = 200,000, a sampling count of n = 1000, and an error threshold of ϵ = 0.05%. The resulting model contains 12 PWL segments, each achieving an approximation error below 0.05%. Figure 18a shows the original sigmoid curve and its PWL approximation over the interval [0, 8], where the two curves nearly overlap. Figure 18b illustrates the approximation error across the same interval, demonstrating that the error remains below 0.05%.
After determining the PWL segmentation ranges and corresponding parameters, the hardware can be implemented using a single comparator array and two 16-to-1 MUXs. The comparator array identifies the interval in which the input falls, and the two 16-to-1 MUXs select the appropriate PWL parameters a and b for that interval, sharing the same selection signal.
The MSB of the input in is used to determine whether the input value is positive or negative. Before generating the final output out, a 2-to-1 MUX selects between f(x) and its symmetric counterpart 1 − f(x). Since the output represents the battery SOC and lies within the range [0, 1], the value 1 − f(x) also falls within the same range; thus, the hardware implementing 1 − f(x) can be simplified to an inverter in this design. The complete sigmoid function circuit is shown in Figure 19.

6.2.2. Data Preprocessing Circuit

Moving-Average Filter Circuit
Figure 20 shows the moving-average filter circuit implemented in this study. The design utilizes an adder with a bit-width capable of accumulating 1023 samples, and its equivalent computation is expressed in (3). When t < 1024, the term X_{t-n+1} is set to 0, in which case (3) becomes equivalent to (1).
X_{mov,t} = X_{mov,t-1} + (X_t − X_{t-n+1}) / n
To minimize the overall chip area, the ARM IP memory provided by TSRI is utilized to replace the large register arrays that would otherwise be required for storing the voltage and current input data. The memory consolidates the storage needed for both signals, including 1024 words × 10 bits for current data and 1024 words × 14 bits for voltage data. The conceptual diagram of the combined memory structure is shown in Figure 21.
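The recursive update of (3) can be sketched in Python, with a software buffer standing in for the shared on-chip memory (the class name and buffer handling are ours; like the circuit, it always divides by n, so outputs ramp up during the first n samples):

```python
from collections import deque

class RecursiveMovingAverage:
    """Streaming form of Eq. (3): keep a running sum, add the new
    sample, and subtract the sample that leaves the n-wide window.
    The deque stands in for the chip's shared 1024-word memory."""
    def __init__(self, n=1024):
        self.n = n
        self.buf = deque()  # the last n samples
        self.acc = 0.0      # running window sum

    def update(self, x):
        self.acc += x
        self.buf.append(x)
        if len(self.buf) > self.n:
            self.acc -= self.buf.popleft()  # oldest sample leaves
        return self.acc / self.n            # divide by n, as in (3)
```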
Normalization Circuit
Figure 22 illustrates the normalization circuit, which scales the input voltage and current signals to the range of 0–1. The values X_{max} and X_{min} used during both data preprocessing and model training correspond to the battery's safety-defined charge/discharge cutoff voltage and maximum charge/discharge current. These values are converted into hardware parameters P1 and P2, respectively. In hardware, the normalization operation can then be realized using a single multiplier followed by an adder.
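Folding Eq. (2) into a single multiply-plus-add amounts to precomputing two constants. The factoring below (P1 as the reciprocal range, P2 as the offset) is one plausible reading of the circuit, shown with hypothetical 2.7 V / 4.2 V cutoffs:

```python
def make_norm_params(x_min, x_max):
    """Fold Eq. (2) into one multiply and one add, matching the
    multiplier + adder structure of Figure 22:
    x_norm = P1 * x + P2."""
    p1 = 1.0 / (x_max - x_min)   # assumed form of parameter P1
    p2 = -x_min * p1             # assumed form of parameter P2
    return p1, p2

# Hypothetical 2.7 V / 4.2 V cutoffs for illustration.
p1, p2 = make_norm_params(2.7, 4.2)
x_norm = p1 * 3.45 + p2  # equals (3.45 - 2.7) / (4.2 - 2.7)
```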

6.3. Chip Specifications

Chip Layout Diagram

The circuit is implemented using the TSMC 40 nm CMOS process, with design considerations focused primarily on power, performance, and area optimization. The workflow begins with Register-Transfer Level (RTL) design, followed by synthesis using Design Compiler, where Design for Testing (DFT) circuitry is also incorporated. The layout is then generated using IC Compiler to complete the physical design of the chip. The final chip layout and utilization results are shown in Figure 23, and the corresponding specifications are summarized in Table 7.

6.4. Comparison with Other Existing Technologies

The DNN chip developed in this study for battery state estimation is compared with other neural-network-based chip implementations, as summarized in Table 8.

7. Experimental Results

In the final stage of battery state estimation, the feature parameters obtained from offline training are uploaded to the real-time estimation chip. For practical SOC estimation, the battery pack voltage and current measured by the BMS are fed into the data preprocessing circuit. During preprocessing, the moving-average filter, normalization circuit, and multiple delay units transform the voltage and current signals into a 22-dimensional feature vector, which is then provided as input to the DNN test model.
The DNN test model consists of an input layer, hidden layers, and an output layer. Through the computation of these three layers, the SOC estimation value is produced. The estimated SOC is then compared with the ground-truth SOC to calculate the final RMSE and error range, which are used to evaluate the performance of the proposed model. Both the data preprocessing circuit and the DNN test model are implemented on the FPGA platform.
In this study, BMW i3 2014 EV Data 1 (SOC range from 90% to 10%) is used to train the battery feature parameters, while BMW i3 2014 EV Data 2 (SOC range from 56.6% to 9.9%) is used to validate the real-time estimation results. The relationship between the training and validation datasets is shown in Figure 24.
Finally, the DNN inference architecture is implemented on an FPGA platform. The experimental results, as shown in Figure 25, indicate that the RMSE converges to 1.853%, with an error range of [4.324%, −4.346%].
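The reported RMSE and signed error range can be computed from the estimated and ground-truth SOC traces as follows (a minimal sketch; the function name is ours):

```python
import numpy as np

def soc_metrics(soc_est, soc_true):
    """RMSE and signed error range [max, min] used to score the
    estimator against the ground-truth SOC."""
    err = np.asarray(soc_est, dtype=float) - np.asarray(soc_true, dtype=float)
    rmse = float(np.sqrt(np.mean(err ** 2)))
    return rmse, (float(err.max()), float(err.min()))
```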

8. Conclusions

In addressing battery state estimation, this study targets lithium-ion batteries used in electric vehicles and introduces a deep-learning-based framework for SOC estimation. By leveraging measurements of dynamic battery physical quantities, a data-driven learning approach is adopted to guide the configuration of the DNN architecture, aiming to achieve the lowest possible RMSE and minimal error range with a compact network structure. Based on this optimized architecture, an edge-computing core is further designed and integrated with a CT ΔΣ ADC and an isolation coupling circuit to complete the overall decentralized BMS node circuit architecture. Through offline training, parameters capturing the characteristics of the entire battery string are updated and deployed onto the real-time inference chip, enabling the DNN to enhance both the accuracy and stability of SOC estimation.
Owing to the incorporation of a deep-learning-based network architecture, the proposed battery state estimation framework exhibits high flexibility. As battery technologies continue to evolve rapidly, the proposed system can readily adapt to different battery chemistries by extracting battery characteristics through offline training and deploying the updated models to decentralized BMS nodes.
In this work, the DNN architecture is implemented using a TSMC 40 nm CMOS process, employing only two hidden layers with eight neurons per layer and a total of 265 feature parameters. The resulting SOC estimation error is maintained below 4.4%, while the overall DNN computation core consumes as little as 64 mW at 2.5 V. Such low power consumption is highly suitable for decentralized BMS architectures that require large-scale deployment.

Author Contributions

Conceptualization, methodology, study design, M.-T.S., Y.-C.O. and C.-F.W.; software, figures, literature search, Y.-C.O., Y.-F.W. and B.-J.L.; validation, formal analysis, investigation, M.-T.S., Y.-C.O., C.-F.W., Y.-F.W. and B.-J.L.; data interpretation, writing—original draft preparation, M.-T.S. and Y.-C.O.; writing—review and editing, M.-T.S. and Y.-C.O.; supervision, project administration, funding acquisition, M.-T.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Ministry of Science and Technology, Taiwan, under Grants MOST 110-2221-E-182-064-, 110-2221-E-008-100-, 110-2622-8-008-004-TA, 109-2221-E-008-073, and 109-2622-8-008-003-TA.

Data Availability Statement

The data presented in this study are available from publicly accessible repositories. These data were derived from the following resources available in the public domain: the Open Battery Data repository provided by the Center for Advanced Life Cycle Engineering (CALCE), University of Maryland, available at https://calce.umd.edu/battery-data, reference number [23]; and the Electric Vehicle Battery Testing Data provided by Argonne National Laboratory (ANL), U.S. Department of Energy, available at https://www.anl.gov/, reference number [24].

Acknowledgments

The authors would like to thank the Taiwan Semiconductor Research Institute (TSRI) and the National Applied Research Laboratories (NARLabs) for their support with the EDA tools.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kurkin, A.; Chivenkov, A.; Aleshin, D.; Trofimov, I.; Shalukho, A.; Vilkov, D. Battery Management System for Electric Vehicles: Comprehensive Review of Circuitry Configuration and Algorithms. World Electr. Veh. J. 2025, 16, 451.
  2. Challoob, A.F.; Bin Rahmat, N.A.; Ramachandaramurthy, V.K.A.; Humaidi, A.J. Energy and battery management systems for electrical vehicles: A comprehensive review & recommendations. Energy Explor. Exploit. 2024, 42, 341–372.
  3. Wang, S.; Verbrugge, M.; Wang, J.S.; Liu, P. Multi-parameter battery state estimator based on the adaptive and direct solution of the governing differential equations. J. Power Sources 2011, 196, 8735–8741.
  4. Wang, S.; Verbrugge, M.; Wang, J.S.; Liu, P. Power prediction from a battery state estimator that incorporates diffusion resistance. J. Power Sources 2012, 214, 399–406.
  5. Shiue, M.-T.; Ou, Y.-C.; Li, G.-S. A Low-Power Continuous-Time Delta-Sigma Analogue-to-Digital Converter for the Neural Network Architecture of Battery State Estimation. Electronics 2024, 13, 3459.
  6. Zhang, F.; Rehman, M.M.U.; Zane, R.; Maksimović, D. Hybrid balancing in a modular battery management system for electric-drive vehicles. In Proceedings of the 2017 IEEE Energy Conversion Congress and Exposition (ECCE), Cincinnati, OH, USA, 1–5 October 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 578–583.
  7. Farjah, A.; Ghanbari, T. Early ageing detection of battery cells in battery management system. Electron. Lett. 2020, 56, 616–619.
  8. Yuanwen, L.; Da, Q.; Yifeng, D.; Jun, X.; Junyan, R. A 0.9-V 60-µW 1-Bit Fourth-Order Delta-Sigma Modulator with 83-dB Dynamic Range. IEEE J. Solid-State Circuits 2008, 43, 361–370.
  9. Barreras, J.V. Practical Methods in Li-Ion Batteries: For SIMPLIFIED Modeling, Battery Electric Vehicle Design, Battery Management System Testing and Balancing System Control. Ph.D. Thesis, Aalborg University, Aalborg, Denmark, 2017.
  10. Zhao, X.; Xuan, D.; Zhao, K.; Li, Z. Elman neural network using ant colony optimization algorithm for estimating of state of charge of lithium-ion battery. J. Energy Storage 2020, 32, 101789.
  11. Zhang, Z.; Shen, Y.; Zhang, G.; Song, Y.; Zhu, Y. Short-term prediction for opening price of stock market based on self-adapting variant PSO-Elman neural network. In Proceedings of the 2017 8th IEEE International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 24–26 November 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 225–228.
  12. Shi, Q.; Zhang, C.; Cui, N.; Zhang, X. Battery state-of-charge estimation in electric vehicle using elman neural network method. In Proceedings of the 29th Chinese Control Conference, Beijing, China, 29–31 July 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 5999–6003.
  13. Yi, Y.; Xia, C.; Feng, C.; Zhang, W.; Fu, C.; Qian, L.; Chen, S. Digital twin-long short-term memory (LSTM) neural network based real-time temperature prediction and degradation model analysis for lithium-ion battery. J. Energy Storage 2023, 64, 107203.
  14. Yi, Y.; Xia, C.; Shi, L.; Meng, L.; Chi, Q.; Qian, L.; Ma, T.; Chen, S. Lithium-ion battery expansion mechanism and Gaussian process regression-based state of charge estimation with expansion characteristics. Energy 2024, 292, 130541.
  15. Zhang, Z.; Liu, C.; Li, T.; Wang, T.; Cui, Y.; Zhao, P. CNN-LSTM optimized with SWATS for accurate state-of-charge estimation in lithium-ion batteries considering internal resistance. Sci. Rep. 2025, 15, 29572.
  16. Shiue, M.T.; Ou, Y.C.; Liu, B.J.; Wang, Y.F.; Liu, P.H. The Battery Measurement Approach of Lithium Battery in Real Driving Data based on Deep Learning Algorithms. In Proceedings of the 11th Asian Conference on Electrochemical Power Sources (ACEPS11), Singapore, 11–14 December 2022.
  17. Shiue, M.-T.; Wang, Y.-F.; Ou, Y.-C.; Liu, B.-J.; Liu, P.-H. A Deep Neural Network Structure for Li-Ion Battery State of Charge Estimation and Hardware Implementation in Electric Vehicles. Int. J. Ind. Electron. Electr. Eng. (IJIEEE) 2023, 11, 1–5.
  18. Ou, Y.C.; Shiue, M.T.; Liu, B.J.; Wang, Y.F.; Kuo, C.S.; Wu, C.F. State of Charge Estimation and Circuit Implementation for Lithium Battery Based on the Elman Neural Network Algorithm. In Proceedings of the 2024 6th Global Power, Energy and Communication Conference (GPECOM), Budapest, Hungary, 4–7 June 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 98–102.
  19. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
  20. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
  21. Qiu, G.Q.; Zhao, W.M.; Xiong, G.Y. Estimation of power battery SOC based on PSO-Elman neural network. In Proceedings of the 2018 Chinese Automation Congress (CAC), Xi’an, China, 30 November–2 December 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 91–96.
  22. How, D.N.; Hannan, M.A.; Lipu, M.S.H.; Sahari, K.S.; Ker, P.J.; Muttaqi, K.M. State-of-charge estimation of li-ion battery in electric vehicles: A deep neural network approach. IEEE Trans. Ind. Appl. 2020, 56, 5565–5574.
  23. Center for Advanced Life Cycle Engineering (CALCE). Open Battery Data; University of Maryland: College Park, MD, USA, 2025. Available online: https://calce.umd.edu/battery-data (accessed on 21 March 2022).
  24. Argonne National Laboratory (ANL). Electric Vehicle Battery Testing Data; US Department of Energy: Washington, DC, USA, 2025. Available online: https://www.anl.gov/ (accessed on 26 July 2022).
  25. Chiluveru, S.R.; Chunarkar, S.; Tripathy, M.; Kaushik, B.K. Efficient hardware implementation of DNN-based speech enhancement algorithm with precise sigmoid activation function. IEEE Trans. Circuits Syst. II Express Briefs 2021, 68, 3461–3465.
  26. Moons, B.; Uytterhoeven, R.; Dehaene, W.; Verhelst, M. 14.5 ENVISION: A 0.26-to-10 TOPS/W subword-parallel dynamic-voltage-accuracy-frequency-scalable convolutional neural network processor in 28 nm FDSOI. In Proceedings of the 2017 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 5–9 February 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 246–247.
  27. Teng, C.F.; Wu, A.Y. A 7.8–13.6 pJ/b ultra-low latency and reconfigurable neural network-assisted polar decoder with multi-code length support. IEEE Trans. Circuits Syst. I Regul. Pap. 2021, 68, 1956–1965.
Figure 1. BMS Architecture Topology (a) Centralized, (b) Distributed, and (c) Decentralized.
Figure 2. Decentralized BMS Node-Chip Structure.
Figure 3. Training Flow and Weight-Update Process of the Node-Level Computing Architecture.
Figure 4. Timeline and workflow of the decentralized BMS with software–hardware co-design.
Figure 5. Basic Delta-Sigma Modulator.
Figure 6. CT Delta-Sigma Modulator Circuit.
Figure 7. DNN Architecture.
Figure 8. DNN Training and Validation Flowchart.
Figure 9. DNN Model Construction Flowchart.
Figure 10. Software-based inference results.
Figure 11. FUDS driving cycle data: (a) voltage curve, (b) current curve, and (c) SOC curve.
Figure 12. BMW i3 2014 EV driving data (Dataset 1) used for training: (a) voltage curve, (b) current curve, and (c) SOC curve.
Figure 13. BMW i3 2014 EV driving data (Dataset 2) used for estimation validation: (a) voltage curve, (b) current curve, and (c) SOC curve.
Figure 14. Top-level block diagram of the deep-learning–based estimation circuit.
Figure 15. Chip design flow.
Figure 16. DNN circuit architecture: (a) input layer, (b) hidden layer, and (c) output layer.
Figure 17. ReLU circuit.
Figure 18. (a) Sigmoid function curve and (b) approximation error curve.
Figure 19. Sigmoid function circuit.
Figure 20. Moving-average filter circuit.
Figure 21. Combined memory architecture.
Figure 22. Normalization circuit.
Figure 23. Chip layout.
Figure 24. Relationship between training data and validation data.
Figure 25. Real-time SOC estimation results.
Table 1. Hyperparameter Simulation Table.

Hyperparameter | Configuration
Number of neurons | Variable, range = [4, 14]
Number of hidden layers | Variable, range = [1, 4]
Optimizer | Momentum, AdaGrad, Adam
Learning rate | 0.01
Batch size | 1
Activation function of hidden layer | ReLU
Activation function of output layer | Sigmoid
Epochs | 3000
Table 2. RMSE and Error Range of the Momentum Optimizer.

Neurons \ Layers | 1 | 2 | 3 | 4
4 | 2.34% [7.32%, −6.52%] | 1.51% [4.85%, −6.57%] | 1.61% [4.93%, −5.78%] | 1.48% [3.65%, −6.84%]
5 | 2.11% [6.56%, −7.04%] | 1.67% [5.25%, −5.67%] | 1.58% [4.83%, −6.51%] | 2.04% [5.95%, −8.32%]
6 | 1.71% [3.85%, −7.82%] | 1.54% [4.47%, −6.85%] | 1.49% [5.46%, −4.87%] | 1.51% [4.58%, −6.85%]
7 | 2.87% [6.80%, −8.10%] | 1.44% [5.17%, −5.16%] | 1.56% [5.91%, −4.56%] | 1.47% [4.01%, −8.91%]
8 | 1.99% [3.47%, −8.67%] | 1.37% [3.67%, −6.46%] | 1.64% [4.37%, −6.12%] | 1.49% [5.13%, −5.64%]
9 | 2.26% [4.04%, −8.83%] | 1.52% [3.25%, −7.03%] | 1.53% [2.42%, −9.51%] | 1.78% [3.22%, −8.57%]
10 | 2.18% [7.85%, −6.98%] | 1.72% [5.87%, −5.21%] | 1.52% [3.32%, −7.78%] | 1.67% [2.52%, −7.46%]
11 | 1.57% [3.60%, −7.73%] | 1.71% [6.35%, −7.92%] | 1.45% [4.24%, −6.49%] | 1.50% [3.07%, −7.06%]
12 | 2.19% [3.29%, −8.80%] | 1.51% [3.92%, −7.14%] | 1.65% [5.13%, −7.05%] | 1.49% [4.23%, −7.25%]
13 | 1.78% [9.85%, −7.30%] | 1.56% [6.72%, −7.26%] | 1.62% [3.21%, −7.40%] | 1.51% [4.36%, −7.39%]
14 | 2.22% [6.01%, −5.32%] | 1.87% [2.78%, −8.11%] | 1.63% [3.63%, −7.44%] | 1.52% [5.28%, −7.65%]
Table 3. RMSE and Error Range of the Adagrad Optimizer.

Neurons \ Layers | 1 | 2 | 3 | 4
4 | 5.02% [9.10%, −10.51%] | 1.56% [4.34%, −6.35%] | 1.58% [4.18%, −6.67%] | 4.90% [9.27%, −9.83%]
5 | 5.04% [9.13%, −10.25%] | 1.68% [5.69%, −7.23%] | 1.72% [6.45%, −7.69%] | 4.90% [9.36%, −10.53%]
6 | 5.03% [9.21%, −8.56%] | 1.54% [4.19%, −6.13%] | 1.55% [4.70%, −6.22%] | 4.97% [9.04%, −10.01%]
7 | 5.02% [9.31%, −9.83%] | 1.52% [4.35%, −5.69%] | 1.46% [4.22%, −6.23%] | 1.51% [5.13%, −5.63%]
8 | 5.04% [9.44%, −9.67%] | 1.51% [3.62%, −6.53%] | 1.68% [5.90%, −5.72%] | 1.48% [3.75%, −6.72%]
9 | 5.03% [9.38%, −9.61%] | 1.53% [4.64%, −6.23%] | 1.54% [3.80%, −6.51%] | 1.47% [4.74%, −6.10%]
10 | 5.01% [9.20%, −10.01%] | 1.51% [3.88%, −6.94%] | 1.47% [4.79%, −5.72%] | 1.62% [5.19%, −5.13%]
11 | 5.04% [9.85%, −10.49%] | 1.42% [3.56%, −6.98%] | 1.44% [4.31%, −6.55%] | 1.42% [5.06%, −6.87%]
12 | 5.02% [9.41%, −9.68%] | 1.45% [3.57%, −6.23%] | 1.43% [4.49%, −6.44%] | 1.44% [3.67%, −7.15%]
13 | 5.01% [9.05%, −10.37%] | 1.42% [4.56%, −6.11%] | 1.48% [3.49%, −7.32%] | 1.48% [4.06%, −6.77%]
14 | 5.03% [9.42%, −9.61%] | 1.52% [3.79%, −6.82%] | 1.78% [4.01%, −7.08%] | 1.56% [4.64%, −7.06%]
Table 4. RMSE and Error Range of the Adam Optimizer.

Neurons \ Layers | 1 | 2 | 3 | 4
4 | 5.17% [10.15%, −10.48%] | 1.64% [5.03%, −8.53%] | 1.65% [3.83%, −6.88%] | 1.64% [3.92%, −7.22%]
5 | 4.95% [9.11%, −9.75%] | 1.62% [3.53%, −7.42%] | 3.13% [10.61%, −4.15%] | 1.63% [3.20%, −7.54%]
6 | 5.09% [8.51%, −11.96%] | 2.23% [2.76%, −9.45%] | 2.52% [7.87%, −3.81%] | 1.64% [5.73%, −5.22%]
7 | 5.17% [9.49%, −15.33%] | 2.16% [5.62%, −6.49%] | 1.56% [3.37%, −7.35%] | 1.54% [6.61%, −6.93%]
8 | 4.89% [7.74%, −7.93%] | 2.26% [4.40%, −10.09%] | 2.53% [7.58%, −3.79%] | 1.87% [4.60%, −6.99%]
9 | 5.14% [9.94%, −8.72%] | 2.20% [7.01%, −3.41%] | 1.49% [3.03%, −7.34%] | 1.79% [10.06%, −7.68%]
10 | 6.26% [7.58%, −17.71%] | 1.56% [6.23%, −6.38%] | 1.52% [3.59%, −6.98%] | 1.82% [7.12%, −4.63%]
11 | 5.02% [9.39%, −12.45%] | 2.34% [5.92%, −4.34%] | 1.54% [6.91%, −3.12%] | 1.66% [4.38%, −5.98%]
12 | 5.02% [9.50%, −11.03%] | 2.01% [3.57%, −9.46%] | 1.51% [4.11%, −7.28%] | 1.75% [6.80%, −8.00%]
13 | 4.73% [5.39%, −11.74%] | 1.49% [4.89%, −6.93%] | 1.65% [4.34%, −6.89%] | 1.88% [4.84%, −6.82%]
14 | 5.28% [10.41%, −8.88%] | 1.53% [6.16%, −6.72%] | 1.98% [2.58%, −8.66%] | 1.58% [4.95%, −5.32%]
Table 5. Simulation results with expanded input dimensionality.

Input Feature | Input Dimension | RMSE | Error Range
Current(t), Voltage(t) | 2 | 1.370% | [3.671%, −6.462%]
Current_mov (step = 100), Voltage_mov (step = 100), Voltage(t) | 3 | 0.703% | [1.902%, −3.781%]
Current_mov (step = 100), Voltage_mov (step = 100), Voltage(t)~Voltage(t−9), Current(t)~Current(t−9) | 22 | 0.580% | [2.179%, −2.081%]
Current_mov (step = 1024), Voltage_mov (step = 1024), Voltage(t)~Voltage(t−9), Current(t)~Current(t−9) | 22 | 0.440% | [1.880%, −1.499%]
Current_mov (step = 2048), Voltage_mov (step = 2048), Voltage(t)~Voltage(t−9), Current(t)~Current(t−9) | 22 | 0.443% | [1.561%, −2.149%]
Table 6. Input and Output Signal Definitions.

Signal Name | I/O | Width | Description
clk | I | 1 | Clock Signal
rst | I | 1 | Reset Signal
index | I | 9 | Weight Address
 | | | Weight Write Enable
weight | I | 13 | Weight Data
voltage | I | 14 | Voltage Data
current | I | 10 | Current Data
SOC | O | 12 | Battery SOC
Table 7. Summary of the chip.

Parameter | Specification
System | DNN testing model
Technology | TSMC 40 nm
Gate count (Core) | 377,433
Area [μm²] | 256,806
Core size [μm²] | 466,994
Pad core size [μm²] | 1,004,745
Chip size [μm²] | 1,910,946
Power@voltage [mW] | 40@0.9 V (Core) / 64@2.5 V (Chip)
Max frequency [MHz] | 448
Table 8. Comparison with other deep-learning chips.

 | This Work | ENVISION ISSCC 2017 [26] | TCAS-I IEEE 2021 [27]
Technology (nm) | 40 | 28 | 40
Algorithm | DNN | CNN | RNN-BP
Application | SOC estimation | Facial recognition | Communication
Frequency (MHz) | 448 | 200 | 225
Chip Power (mW) | 64@2.5 V (Chip) | 300@1.1 V (Chip) | 12.8@0.9 V (Chip)
Core Area (mm²) | 0.257 | 1.90 | 0.18