1. Introduction
The low-voltage distribution network represents the final stage of electricity delivery to end users, and its failure can directly affect those users. Among its critical components, distribution cabinets play a vital role in allocating load currents among outgoing circuits, where high currents typically flow [1]. Electrical joints in low-voltage distribution cabinets are prone to overheating under complex operating conditions, caused by loose or oxidized connections, load imbalance among phases, and excessive harmonics.
As the current remains steady, an increase in local contact resistance results in concentrated Joule heating. The continuous rise in resistance (R) accelerates thermal energy accumulation, rapidly increasing the joint temperature. In severe cases, the temperature can reach a very high level within a short period [2].
Harmonics can also contribute to such heating: on one hand, zero-sequence harmonics (e.g., 3rd, 9th, 15th) superimpose on the neutral line, drastically increasing its current and thereby enhancing Joule heating at neutral-line joints; on the other hand, the skin and proximity effects induced by high-frequency harmonics raise the AC resistance of electrical joints. Additionally, three-phase imbalance may lead to a dramatic increase in the current of specific lines, causing abnormal Joule heating of the corresponding joints [3,4].
The heat dissipation in a distribution cabinet is generally limited, forming a positive feedback loop that exacerbates the overheating of electrical joints. This may lead to insulation degradation and even sustained smoldering or combustion once the surrounding materials are thermally stressed.
Compared with post-event remediation, short-term temperature-rise prediction based on historical and real-time data provides an actionable early-warning window before risks escalate into failures. This capability has become a key driver for transforming distribution-network maintenance from periodic inspection to condition-based and predictive maintenance [5].
The widespread deployment of online joint temperature monitoring systems and the collection of operational data (e.g., current, ambient temperature, and humidity) in distribution cabinets have enabled the accumulation of long-term multivariate time-series data over multiple time scales, allowing joint temperatures to be predicted from historical information [6].
However, such sequences often exhibit pronounced non-stationarity and multi-scale fluctuations. They are influenced by load switching, ambient temperature and humidity, and equipment heterogeneity. As a result, distributional shifts and abrupt transitions frequently occur, posing significant challenges to conventional statistical models that assume linearity and stationarity [7,8].
In the field of temperature prediction, a variety of time-series modeling approaches have been extensively explored, with the autoregressive integrated moving average (ARIMA) model, the gated recurrent unit (GRU) network, and the long short-term memory (LSTM) network representing the most widely studied methods [9].
In scenarios involving short-term stationary data or sequences with distinct periodic fluctuations, ARIMA offers the advantages of computational simplicity and strong interpretability. However, its predictive capability is substantially limited when confronted with nonlinear fluctuations, abrupt shifts, or multi-scale patterns, as it struggles to capture complex dynamic relationships [10].
The GRU model, as a streamlined variant of the recurrent neural network (RNN), incorporates update and reset gates that reduce parameter complexity while mitigating the vanishing gradient problem. This makes the GRU particularly effective for rapid modeling when dataset sizes are small or computational resources are constrained [11]. Nevertheless, the GRU remains inadequate for capturing long-term dependencies. When applied to highly non-stationary sequences with multiple perturbations, its predictions often lag or underestimate extreme values [12].
In contrast, the LSTM model introduces explicit memory cells and gating mechanisms (input, forget, and output gates), enabling the preservation of historical information over extended time horizons [13]. This structural innovation makes LSTM particularly well-suited for temperature sequences characterized by long-range dependencies, weak periodicity, and superimposed random disturbances—such as the joint temperature rise in distribution cabinets, transformer oil temperature, or the thermal behavior of equipment under varying loads [14]. Empirical evidence consistently shows that the LSTM achieves markedly higher accuracy than both ARIMA and the GRU in forecasting non-stationary and complex time series. For example, studies in transformer oil temperature prediction demonstrate that the LSTM reduces the root mean square error (RMSE) by approximately 60% compared to ARIMA [15], while also outperforming the GRU in capturing rapid transitions and nonlinear upward trends [16].
Since LSTM's time-series computations involve extensive floating-point operations, the dynamic execution mode of the original TensorFlow framework is inefficient on edge devices without dedicated GPU support. The computational, memory, and power constraints of edge devices at distribution substations make it difficult for full-precision, large-parameter deep models to meet the requirements of real-time processing and multi-point concurrent tasks [17]. In a realistic scenario, the vast number of monitored joints in the low-voltage grid produces high-volume data. Centralizing all data processing and prediction in the cloud would impose a heavy burden on communication bandwidth and lead to prohibitive costs [18]. Consequently, it is preferable to perform joint temperature prediction locally at the edge, near the devices.
To address these challenges, this work developed a sensing system based on Bluetooth Mesh IoT to collect multi-source input features (joint temperature, load current, ambient temperature, and humidity) for the prediction models [19]. A Raspberry Pi 5 served as the core of the edge computation device. The LSTM model was trained with data collected from real-world field tests and laboratory experiments. The performance of the LSTM model was evaluated on the electrical joint temperature in a real low-voltage cabinet and was compared with the ARIMA and GRU models. A mixed-precision quantization scheme was applied to optimize the LSTM for lightweight deployment while keeping the network topology unchanged [20]. By comparing the lightweight and original models, the study highlights the differences in model size, inference latency, and prediction accuracy.
This study proposes an edge-enabled lightweight LSTM model for short-term joint temperature prediction in low-voltage distribution cabinets. We demonstrated that this approach achieves superior prediction accuracy compared to traditional methods (ARIMA) and common deep learning models (GRU), while successfully leveraging mixed-precision quantization to ensure millisecond-level inference latency on a Raspberry Pi 5 edge device. The findings validate a practical and cost-effective technical pathway for implementing real-time predictive maintenance in large-scale low-voltage distribution networks.
Although LSTM-based models have been extensively investigated in various domains such as energy forecasting, machinery fault detection, and climate prediction, no previous study has reported the application of an edge-enabled LSTM for predicting the temperature of electrical joints in low-voltage distribution cabinets. In practical power distribution systems, the very large number of joints makes centralized temperature monitoring and analysis impractical. Therefore, developing an edge-based AI prediction model represents a necessary and novel approach, enabling localized data processing and predictive analytics directly at the cabinet level. This study thus provides a new perspective for implementing intelligent, distributed, and real-time monitoring in power systems.
The remainder of this paper is structured as follows: Section 2 outlines the materials and methods, including the sensing system and data acquisition, the LSTM model methodology, and the lightweight optimization scheme; Section 3 presents the experimental results and comparative analysis; and Section 4 and Section 5 provide the discussion and conclusions.
2. Materials and Methods
2.1. Mathematical Formulation of the Time-Series Prediction Problem
Mathematically, joint temperature forecasting can be abstracted as a multi-step time-series forecasting task. Let the historical observation sequence be

X = {x_1, x_2, …, x_T}, x_i ∈ R^d,

where x_i represents the d-dimensional input feature vector at time step i, including electrical quantities (voltage, current, power), environmental quantities (temperature, humidity), and historical joint temperature. The prediction target is to obtain the joint temperature for the next H steps,

Ŷ = {ŷ_{T+1}, ŷ_{T+2}, …, ŷ_{T+H}},

where ŷ_{T+k} denotes the predicted joint temperature at the k-th future time step.
To quantify the discrepancy between the predictions and the ground truth, commonly used metrics include the root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE):

RMSE = sqrt( (1/H) Σ_{k=1}^{H} (ŷ_{T+k} − y_{T+k})² ),
MAE = (1/H) Σ_{k=1}^{H} |ŷ_{T+k} − y_{T+k}|,
MAPE = (100%/H) Σ_{k=1}^{H} |(ŷ_{T+k} − y_{T+k}) / y_{T+k}|,

where y_{T+k} represents the true observed value. Through these indicators, the performance of the prediction model in terms of accuracy and stability can be comprehensively evaluated.
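As a concrete reference, the three metrics can be computed directly from prediction and ground-truth arrays. The sketch below (the helper name `metrics` is ours, not from the paper) assumes the true values are non-zero so that MAPE is well-defined:

```python
import numpy as np

def metrics(y_true, y_pred):
    """Compute RMSE, MAE, and MAPE (in %) between predictions and ground truth."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    mape = float(np.mean(np.abs(err / y_true)) * 100.0)  # assumes y_true != 0
    return rmse, mae, mape
```

After inverse-transforming the normalized predictions (see Section 2.3), these values retain the physical unit of the joint temperature.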
2.2. Prediction System Architecture
An LSTM network comprises three key components—the input gate, forget gate, and output gate—which, respectively, regulate the inflow, retention (memory), and outflow of information, as illustrated in Figure 1.
The node outputs of the LSTM network are computed as follows:

f_t = σ(W_f · [h_{t−1}, x_t] + b_f),
i_t = σ(W_i · [h_{t−1}, x_t] + b_i),
C̃_t = tanh(W_C · [h_{t−1}, x_t] + b_C),
C_t = f_t ⊙ C_{t−1} + i_t ⊙ C̃_t,
o_t = σ(W_o · [h_{t−1}, x_t] + b_o),
h_t = o_t ⊙ tanh(C_t).

In the above equations, x_t denotes the input feature vector at time step t, while h_{t−1} and h_t represent the hidden states at the previous and current time steps, respectively. The cell state C_t serves as the long-term memory that carries information across time steps, with C_{t−1} denoting its value from the previous step. The forget gate f_t determines the extent to which information from C_{t−1} is retained, the input gate i_t controls the incorporation of new information from the candidate cell state C̃_t, and the output gate o_t regulates the portion of the cell state that contributes to the hidden state output. The candidate cell state C̃_t represents the new information proposed to be added to the memory cell. The matrices W_f, W_i, W_C, and W_o and bias terms b_f, b_i, b_C, and b_o correspond to the learnable parameters associated with each gate. The nonlinear activation functions σ(·) and tanh(·) ensure that the gating operations and state updates remain bounded within appropriate ranges.
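For readers implementing the cell from scratch, a single LSTM time step can be sketched as below. The weight layout (one matrix per gate applied to the concatenated [h_{t−1}, x_t]) is one common convention, not the only one:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step. W['f'], W['i'], W['c'], W['o'] each map the
    concatenated [h_prev; x_t] (length hidden + input) to gate pre-activations."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W['f'] @ z + b['f'])      # forget gate
    i_t = sigmoid(W['i'] @ z + b['i'])      # input gate
    c_hat = np.tanh(W['c'] @ z + b['c'])    # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat        # cell-state update
    o_t = sigmoid(W['o'] @ z + b['o'])      # output gate
    h_t = o_t * np.tanh(c_t)                # hidden state
    return h_t, c_t
```

In practice a framework implementation (e.g., PyTorch's `nn.LSTM`, as used in this study) fuses these operations; the sketch only mirrors the equations above.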
To further alleviate the cold-start error caused by random initialization, this work introduced a warm-start strategy based on power-difference matching [21].
First, the power-difference vector is defined as Δx_t = [ΔP_t, ΔS_t], where ΔP_t = P_t − P_{t−1} and ΔS_t = S_t − S_{t−1}, and P_t and S_t denote the active power and apparent power at time step t, respectively. By employing a similarity measure (e.g., Euclidean distance or cosine similarity), the most similar historical segment is retrieved from the set of past operating conditions {X^(i)}. The optimal index is defined as

i* = argmin_i ‖Δx_t − Δx^(i)‖,

and the corresponding reference segment is expressed as X^(i*). Finally, the hidden state at the last time step of the reference segment, (h_T^(i*), c_T^(i*)), is adopted as the initial hidden state of the current prediction window:

(h_0, c_0) = (h_T^(i*), c_T^(i*)).
Compared with the conventional random initialization (h_0, c_0) ~ N(0, σ²), the proposed method transfers hidden-state information from similar operating conditions, thereby enhancing the temporal alignment and prediction accuracy in peak–valley neighborhoods and rapid transition periods.
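The retrieval step of the warm-start strategy reduces to a nearest-neighbour search over the stored power-difference vectors. A minimal sketch, assuming each historical segment is stored together with the hidden and cell states at its final time step (the data layout is our assumption):

```python
import numpy as np

def warm_start(delta_x, history):
    """Retrieve the hidden/cell state of the most similar historical segment.
    `history` is a list of (delta_x_i, h_T, c_T) tuples; similarity is the
    Euclidean distance on the power-difference vector [dP, dS]."""
    dists = [np.linalg.norm(np.asarray(delta_x) - np.asarray(d))
             for d, _, _ in history]
    i_star = int(np.argmin(dists))       # optimal index i*
    _, h_T, c_T = history[i_star]
    return i_star, h_T, c_T              # (h_0, c_0) := (h_T, c_T)
```

The returned states then replace the random (h_0, c_0) when the prediction window is opened.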
In this study, the application of the LSTM network not only effectively addresses the challenge of modeling long-term sequential dependencies in joint temperature prediction but also ensures training convergence and predictive robustness [22]. Through its gating mechanisms, the LSTM dynamically adjusts the retention or forgetting of historical information in response to changes in the input signals, thereby accommodating non-stationary factors—such as current fluctuations and periodic variations in ambient temperature and humidity—that induce fluctuations in the temperature-rise trend.
2.3. Model Training Settings
All deep learning models were implemented in Python 3.10 using the PyTorch 1.10 framework. The LSTM, GRU, and ARIMA baselines were trained under the same data-splitting strategy (70% training, 15% validation, and 15% testing). For LSTM and GRU, the sequence length was set to 60 time steps, and the prediction horizon was defined as 6 h, unless otherwise stated.
The model parameters were optimized using the Adam optimizer with an initial learning rate of 0.001, and the mean square error (MSE) loss was adopted as the training objective. A batch size of 64 was used to balance the memory efficiency and convergence stability. Early stopping with a patience of 10 epochs was applied to avoid overfitting, and the maximum number of training epochs was set to 200.
To prevent data leakage and maintain experimental integrity, the dataset was first divided into training and testing subsets before normalization. The normalization parameters (minimum and maximum values) were calculated exclusively from the training data and then applied to the validation and testing sets using the same scaling transformation. During model evaluation, the predicted temperature values were inverse-transformed to restore their original physical scale prior to error computation, ensuring the comparability and interpretability of the RMSE, MAE, and MAPE metrics.
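The leakage-free normalization protocol described above can be made explicit in a few lines. `MinMaxScaler1D` is an illustrative helper, not the implementation used in this study; the key point is that the scaler is fitted on the training split only and its inverse transform restores the physical scale:

```python
import numpy as np

class MinMaxScaler1D:
    """Min-max scaler fitted on the training split only, to avoid data leakage."""
    def fit(self, train):
        self.lo, self.hi = float(np.min(train)), float(np.max(train))
        return self
    def transform(self, x):
        return (np.asarray(x, dtype=float) - self.lo) / (self.hi - self.lo)
    def inverse_transform(self, x):
        return np.asarray(x, dtype=float) * (self.hi - self.lo) + self.lo

train = [20.0, 30.0, 40.0]                    # fit on training data only
scaler = MinMaxScaler1D().fit(train)
test_scaled = scaler.transform([25.0, 45.0])  # test values may fall outside [0, 1]
restored = scaler.inverse_transform(test_scaled)
```

Errors (RMSE, MAE, MAPE) are then computed on `restored`, not on the scaled values.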
To incorporate the proposed warm-start mechanism, hidden and cell states were initialized using the retrieved reference states rather than random values, enabling faster convergence and reducing the cold-start prediction errors. All experiments were conducted on an NVIDIA RTX-series GPU server with 24 GB memory, ensuring efficient training and inference.
Compared with conventional random initialization, the proposed warm-start initialization accelerated the convergence by approximately 20% and reduced the early-stage prediction error by around 10%. This improvement was achieved by transferring hidden-state information from similar operating conditions, providing the network with a more suitable temporal context at the beginning of training. To ensure optimal training stability, the key hyperparameters—including the sequence length, learning rate, and batch size—were tuned using a grid-search strategy on the validation set. Within a ±10% variation range of these parameters, the model’s RMSE fluctuation remained below 3%, demonstrating that the warm-start strategy effectively enhanced both the convergence efficiency and prediction robustness.
2.4. Data Acquisition
Due to the scarcity of joint overheating and fault samples in real operating data, the data for training and testing the models were acquired from both real low-voltage and laboratory cabinets. To assess the predictive performance of the model under extreme working conditions, a simulated heating fault platform for distribution cabinet joints was designed and constructed in the laboratory, as shown in Figure 4a. This platform employed systems for adjusting the contact resistance and load current to generate abnormal joint states of varying severity, thereby reproducing the temporal joint temperature rise and the corresponding fault development process. The real-world experimental setup is displayed in Figure 4b, where the cabinet is installed near a 10 kV/400 V transformer.
The experimental principle was based on the heating mechanism of contact resistance: when the surface roughness, oxide film thickness, or clamping force of the contact changes, the contact resistance R_c will increase, causing the current flowing through the joint to generate additional Joule heat Q:

Q = I² R_c t,

where I represents the RMS value of the current through the joint and t is the energization time. When R_c increases to a certain extent, the local temperature of the joint will continue to rise and may cause thermal instability. In the experiment, by fine-tuning the clamping force and contaminating the contact surface, R_c was increased in stages within a safe range, thereby precisely controlling the temperature-rise rate and peak value.
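To illustrate the sensitivity of Q = I² R_c t to contact degradation, consider hypothetical (not measured) values: at a constant 100 A held for one hour, a tenfold rise in contact resistance yields a tenfold rise in dissipated heat:

```python
def joule_heat(i_rms, r_contact, t_seconds):
    """Q = I^2 * R * t: heat (in joules) dissipated in the contact resistance."""
    return i_rms ** 2 * r_contact * t_seconds

# Illustrative values only: 100 A RMS, one hour of energization.
q_healthy = joule_heat(100.0, 0.1e-3, 3600.0)   # 0.1 mOhm contact -> 3.6 kJ
q_degraded = joule_heat(100.0, 1.0e-3, 3600.0)  # 1.0 mOhm contact -> 36 kJ
```

Because the current is held constant, the heat scales linearly with R_c, which is exactly what the staged-resistance experiments exploit.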
As illustrated in Figure 4a,b, the arrangement was as follows. The simulated joint module was placed at the center of the distribution cabinet, while the current regulation module was connected in series via low-resistance conductors to adjust the current. The joint temperature measurement module was affixed directly to the surface of the joint to monitor its temperature, and the ambient temperature and humidity measurement module was placed in the middle of the cabinet to record the internal environmental conditions. The current acquisition module was employed to measure the cable current. All data acquisition modules transmit signals via a Bluetooth Mesh network to a Bluetooth client module, which then forwards the data via the Raspberry Pi's hardware serial port to the data collection module [23].
A Bluetooth Mesh-based wireless-sensor network was developed for this work, consisting of multiple sensor nodes and a central client receiver. Each joint temperature sensor includes a PT100 platinum temperature probe (−50 °C to 250 °C, accuracy ±0.2 °C), an nRF52832 Bluetooth Mesh module, and a 3.7 V lithium battery for power supply. The nodes are installed at electrical–joint interfaces and secured with thermally conductive adhesive to ensure accurate heat coupling and reliable measurements.
The Bluetooth Mesh client receiver acts as a gateway that collects temperature data from all nodes and forwards the data to the edge server through a USB interface. The hardware modules and PCB layouts are shown in Figure 2 and Figure 3, respectively, illustrating the practical implementation of the temperature monitoring system for low-voltage distribution cabinets.
The self-developed sensors used to monitor the ambient temperature, humidity, and load current (20–200 A) are shown in Figure S1 and Figure S2, respectively.
In the setup, the current acquisition module, current regulation module, and data collection module are positioned outside the temperature–humidity control enclosure. The conductors enter the enclosure through feedthrough holes in its outer wall. All other components are housed inside the temperature–humidity control enclosure.
In total, the monitoring system continuously collected joint temperature data for more than 15 days under both field and laboratory conditions, with a 3 min sampling interval. Each distribution cabinet included multiple measurement points, yielding over 20,000 valid temperature records following data preprocessing. This dataset effectively captured both short-term fluctuations and long-term heating trends across varying loads and ambient temperature and humidity in the cabinet. For model training, a sliding-window approach was adopted: each input sequence consisted of 60 consecutive time steps (corresponding to 3 h of historical data), which were used to predict the temperature evolution over the subsequent 6 h horizon. This configuration not only provided sufficient temporal variability to support model learning but also aligned with the practical time resolution requirements of edge devices.
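The sliding-window construction described above can be sketched as follows. With the paper's 3 min sampling interval, `seq_len=60` corresponds to 3 h of history; encoding the 6 h target as `horizon=120` future samples is our assumption about the exact output representation:

```python
import numpy as np

def make_windows(series, seq_len=60, horizon=120):
    """Slice a 1-D series into (input, target) pairs: `seq_len` history steps
    predict the next `horizon` steps. At a 3 min sampling interval,
    seq_len=60 covers 3 h of history and horizon=120 covers the next 6 h."""
    X, Y = [], []
    for s in range(len(series) - seq_len - horizon + 1):
        X.append(series[s:s + seq_len])
        Y.append(series[s + seq_len:s + seq_len + horizon])
    return np.array(X), np.array(Y)
```

Multivariate inputs (current, ambient temperature, humidity) extend this by stacking a feature dimension onto each window.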
Under the arrangement, graded contact resistance and load condition tests were conducted, yielding temperature time-series data covering the stages from normal operation to mild abnormality and further to severe abnormality/fault. Representative joint temperature-rise curves and stage divisions are shown in Figure 4. It can be observed that, during abnormal stages, the curves exhibit pronounced nonlinear increases accompanied by multi-scale fluctuations. Near peak values, the temperature-rise rate increases markedly. Following the activation of protective actions or load mitigation measures, the temperature drops rapidly and transitions into a stable stage. This behavior shows good consistency with the field observations and provides a reliable basis for subsequent model training and extreme condition evaluation.
Figure 4.
Schematic diagram of experimental scenario. (a) Experimental setup in laboratory; (b) Experimental setup in real low-voltage distribution cabinet.
The dataset used in this study was collected through the same data acquisition system developed by our research group, consisting of two complementary subsets: a field dataset from outdoor low-voltage distribution cabinets in a district of Beijing, reflecting real operating conditions, and a laboratory dataset obtained under controlled fault-heating experiments. Both subsets shared identical sensing hardware, data acquisition protocols, and edge communication architecture (Bluetooth Mesh to Raspberry Pi gateway), ensuring consistency in format and quality. Data were recorded every 3 min from multiple measurement points for approximately 15 consecutive days. After preprocessing, more than 20,000 valid samples were obtained. To ensure reliable model training and evaluation, the dataset was divided chronologically into 70% for training and 30% for testing, maintaining the temporal integrity of the time series. Each data record contains the voltage, current, power, joint temperature, and ambient temperature variables. All features were normalized using the min–max method, and the model’s predictions were inverse-transformed before calculating the RMSE, MAE, and MAPE to ensure that the results retained their physical interpretability. Each training sample consists of 60 consecutive time steps (3 h of historical data) used to predict the following 6 h temperature evolution. This unified configuration enhances the comparability between the field and laboratory datasets and ensures a coherent and physically meaningful model input structure.
2.5. Data Preprocessing
The same preprocessing procedures were applied to both the field and laboratory datasets to ensure data consistency. Specifically, the field dataset, collected from outdoor distribution cabinets in a district of Beijing, represents real operating conditions, while the laboratory dataset provides controlled abnormal-heating sequences for model robustness verification. The preprocessing steps included missing-value interpolation, outlier correction using the Z-score method, and min–max normalization, which together ensured the integrity and uniformity of the dataset.
The dataset employed in this study was collected from outdoor distribution cabinets located in a specific district of Beijing, China, recording the dynamic variations in joint temperature during actual operation. Due to the complexity of the monitoring environment, the raw data inevitably contained missing points, outliers, and noise. If such data were fed directly into the prediction model without cleansing, the training process would likely become unstable, and the predictive accuracy could be degraded [24]. Therefore, prior to modeling, systematic data preprocessing was required to ensure the reliability and stability of the input data.
For missing-value handling, linear interpolation was employed to impute the temperature time series. The underlying principle was to estimate each missing value using the two adjacent valid observations immediately preceding and following it:

x_t = x_{t_1} + ((t − t_1)/(t_2 − t_1)) (x_{t_2} − x_{t_1}), t_1 < t < t_2,

where x_{t_1} and x_{t_2} are the nearest valid observations before and after the gap. This method filled the gaps while preserving the continuity of the overall trend. As shown in Figure 5, the imputed sequence became smoother, preventing abrupt interruptions at the model input.
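In Python, this gap filling reduces to a call to `numpy.interp` over the valid samples; the sketch below assumes missing points are encoded as NaN:

```python
import numpy as np

def fill_missing(values):
    """Linearly interpolate NaN gaps from the nearest valid neighbours."""
    x = np.asarray(values, dtype=float)
    idx = np.arange(len(x))
    ok = ~np.isnan(x)
    x[~ok] = np.interp(idx[~ok], idx[ok], x[ok])  # interpolate only the gaps
    return x
```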
For outlier detection, the Z-score method was introduced to identify and correct spikes or abrupt changes. The specific computation was given by

z_t = (x_t − μ)/σ,

where μ and σ are the mean and standard deviation of the sequence, respectively. When |z_t| > 2.5, the point is flagged as an outlier and replaced by the smoothed value of adjacent moments. In this way, abnormal sharp peaks caused by instantaneous sensor interference are effectively removed. As shown in Figure 6, the processed sequence significantly reduces extreme abrupt changes, and the overall curve is more continuous.
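A minimal sketch of the Z-score cleaning step; averaging the immediate neighbours is one simple realization of the "smoothed value of adjacent moments", not necessarily the exact smoother used in this study:

```python
import numpy as np

def zscore_clean(values, threshold=2.5):
    """Flag points with |z| > threshold as outliers and replace each one
    by the mean of its valid neighbours (simple local smoothing)."""
    x = np.asarray(values, dtype=float)
    z = (x - x.mean()) / x.std()
    cleaned = x.copy()
    for t in np.where(np.abs(z) > threshold)[0]:
        lo, hi = max(t - 1, 0), min(t + 2, len(x))
        neighbours = np.delete(x[lo:hi], np.where(np.arange(lo, hi) == t))
        cleaned[t] = neighbours.mean()
    return cleaned
```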
Finally, to avoid the adverse effects of differing feature scales on model training, all the input variables—including voltage, current, humidity, and temperature—were normalized using the min–max scaling method:

x′ = (x − x_min)/(x_max − x_min),

which scaled the data to the [0, 1] range. This process not only improved the numerical stability during training but also accelerated model convergence. As shown by the overall comparison in Figure 6, the sequence becomes smoother and more stable after missing-value imputation, outlier correction, and normalization, thereby providing high-quality inputs to the LSTM model.
After preprocessing, the data served as the final dataset for model training and subsequent prediction tasks.
2.6. Lightweight Model of LSTM
When deploying a full-precision (FP32) LSTM model on edge platforms such as the Raspberry Pi, the ARM Cortex-A72 quad-core CPU (without a dedicated GPU), along with the limited memory and bandwidth, often fails to meet the real-time constraints of online prediction due to inference latency and resource consumption. To reduce the computational and storage overhead while preserving predictive accuracy as much as possible, a mixed-precision quantization scheme was adopted to produce a lightweight version of the trained model.
Specifically, 8-bit integer (INT8) quantization was applied to the storage of weights and selected activations to compress the model size and memory access cost, half-precision floating point (FP16) was used in key matrix multiplication operations to maintain numerical stability and throughput, and INT8 operators were employed for non-critical path computations to further reduce latency. This scheme required no changes to the LSTM network topology, modifying only the numerical representation and operator precision, thereby facilitating stable deployment on resource-constrained devices.
Quantization and dequantization adopt an affine mapping. Let the original floating-point tensor (weight or activation) be ω_f, the quantized integer tensor be ω_q, the scaling factor be S, and the zero point be Z. Then, the relationship between quantization and dequantization is

ω_q = round(ω_f / S) + Z, ω̂_f = S (ω_q − Z),

where the scaling factor S and zero point Z are given by the upper and lower bounds of the quantization interval:

S = (ω_{f,max} − ω_{f,min}) / (q_max − q_min), Z = q_min − round(ω_{f,min} / S).
We perform boundary clipping during implementation to avoid integer overflow or saturation:

ω_q = clip(round(ω_f / S) + Z, q_min, q_max).
For INT8, q_max = 127 and q_min = −128 are usually taken to balance precision and inference efficiency. Weight quantization adopts a per-channel configuration; that is, S_k and Z_k are independently estimated for each output channel k to suppress the quantization errors caused by channel-scale differences. Activation quantization adopts a per-tensor configuration to simplify the inference path and reduce the overhead of boundary dequantization. In the calibration phase, statistics of the validation data (such as min/max or quantile clipping) were used to estimate [ω_{f,min}, ω_{f,max}], and S and Z were selected with the criterion of minimizing the quantization error (such as MSE/MAE); when the quantization sensitivity of individual layers is relatively high, a small number of steps of Quantization-Aware Training (QAT) can be used for fine-tuning to recover the precision loss.
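The per-tensor affine mapping with boundary clipping can be verified numerically. The sketch below uses simple min/max calibration and checks that the reconstruction error stays within roughly S/2 (helper names are ours):

```python
import numpy as np

def quantize_int8(w, q_min=-128, q_max=127):
    """Per-tensor affine INT8 quantization: derive S and Z from the observed
    range, then round and clip into [q_min, q_max]."""
    w = np.asarray(w, dtype=np.float32)
    lo, hi = float(w.min()), float(w.max())
    S = (hi - lo) / (q_max - q_min)                 # scaling factor
    Z = int(round(q_min - lo / S))                  # zero point
    q = np.clip(np.round(w / S) + Z, q_min, q_max).astype(np.int8)
    return q, S, Z

def dequantize(q, S, Z):
    return (q.astype(np.float32) - Z) * S

w = np.array([-1.0, 0.0, 0.5, 1.0], dtype=np.float32)
q, S, Z = quantize_int8(w)
w_hat = dequantize(q, S, Z)   # reconstruction error bounded by about S/2
```

Per-channel weight quantization repeats the same procedure with an independent (S_k, Z_k) per output channel.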
During inference, integer kernels and floating-point boundary operators work in coordination. Taking the main multiply–accumulate operations in the LSTM gates (i, f, o, g) as an example, the inputs and weights were quantized and accumulated within the integer domain:

a = Σ_j (x_{q,j} − Z_x)(ω_{q,j} − Z_ω).

The layer outputs are dequantized at the boundaries and passed into the nonlinear operators:

y = S_x S_ω a, h = σ(y) or tanh(y).

In this process, σ(·) and tanh(·) were executed in the FP16/FP32 domain to ensure numerical stability, while accumulation was performed in the INT32 domain to avoid overflow. Subsequently, the combined scaling factor S_x S_ω was applied to complete the mapping from the integer domain to the floating-point domain. An upper bound can be given for the discrete approximation error introduced by affine quantization:

|ω_f − ω̂_f| ≤ S/2.
Smaller scaling factors (i.e., tighter dynamic ranges) reduce the maximum error of a single quantization step; however, an overly small range increases the risk of saturation/clipping, requiring a calibration-stage trade-off between the approximation error and saturation probability. Overall, the mixed-precision scheme achieved FP32-to-INT8 compression at the parameter level and FP32-to-FP16/INT8 precision reduction at the operator level, substantially reducing the model size, memory bandwidth requirements, and end-to-end inference latency, while maintaining floating-point precision in key nonlinear operations to suppress numerical instability.
4. Discussion
The experimental results of this study demonstrate that the LSTM-based joint temperature-rise prediction approach achieves high accuracy and good stability in short-term forecasting tasks. Even under transfer deployment across different distribution cabinet joints and edge-computing platforms, the model consistently captures the temperature-variation trends effectively. Nevertheless, a comprehensive analysis of both the data and the deployment outcomes has also revealed several issues and limitations that warrant further investigation.
First, in cross-distribution cabinet joint scenarios, the prediction accuracy deteriorates markedly over longer horizons. For the 6 h prediction task, the RMSE and MAPE values are comparable to those in the same cabinet scenario; however, in the 24 h and 3-day predictions, the error growth in the cross-cabinet case is substantially higher than that in the same cabinet case. This discrepancy suggests that the model’s generalization capability for long-term forecasting is constrained when confronted with inter-equipment differences, variations in load patterns, and disturbances from environmental factors. Similar phenomena have been reported in other load-forecasting studies based on recurrent neural networks, indicating that this limitation is likely to be a common challenge.
Second, although the mixed-precision quantization scheme substantially reduces the model size and inference latency, the neuron connections operating under lower numerical precision exhibit heightened sensitivity to extreme operating conditions. The experiments revealed that when the temperature profile contains sharp peaks or rapid transitions, the lightweight LSTM shows slight deviations in peak amplitude and phase response compared with the full-precision model. This behavior is likely associated with the reduced numerical resolution introduced by quantization, and the effect becomes more pronounced in long-horizon forecasting. Therefore, for stringent condition monitoring and early-warning tasks under extreme states, the model requires accompanying precision compensation mechanisms—such as hybrid calibration or error-correction networks—when operating at reduced precision.
Third, although the experimental data encompass real operating conditions from different distribution cabinets, the sample distribution remains imbalanced, particularly with a limited number of instances representing severe faults and high-temperature anomalies. This imbalance constrains the validation of the model’s stability under extreme operating patterns and may introduce bias when extrapolating certain performance evaluation results to broader field conditions. Therefore, future research should focus on expanding the sample space by increasing the collection of extreme-condition data or employing approaches such as simulated-fault experiments, thereby enhancing the model’s robustness across diverse operating states.
Finally, although the lightweight deployment on the Raspberry Pi platform has achieved the goal of real-time prediction, its performance in multi-task parallel processing and long-term operational stability remains to be fully assessed. In practical power distribution network operations, edge devices often need to run multiple monitoring and control tasks simultaneously, where factors such as computational resource allocation, memory management, and power-consumption strategies can all affect the sustained performance of the prediction module. This highlights the need for future work to explore more efficient model architectures (e.g., structured sparsity, convolution–recurrent hybrid networks) and hardware acceleration solutions (e.g., FPGA–NPU combinations) to further enhance the adaptability of the deployment environment.