1. Introduction
With the advancement of new power system construction, cable lines, as the core carriers of urban grid energy transmission, directly affect power supply reliability through their operating status [1,2]. Statistical data shows that approximately 67% of grid faults can be attributed to long-term cable overheating, as abnormal temperatures accelerate insulation medium aging and increase the risk of thermal breakdown [3,4,5]. Therefore, accurate perception and prediction of cable temperature hold significant engineering value for ensuring transmission safety.
In the field of cable temperature monitoring and prediction, research methods have gradually evolved from traditional approaches based on fixed thresholds and single-point temperature measurement to a multi-level technical system integrating digital twins and data-driven techniques [6,7,8]. Traditional monitoring methods adapt poorly to dynamic environmental changes and to spatial differences in temperature distribution, which can easily lead to false positives and missed detections [9,10]. Classical physical models based on lumped-parameter thermal circuits and finite element simulation have clear physical significance, but they suffer from insufficient real-time performance and limited computational efficiency under complex practical conditions. To improve the adaptability of models in practical applications, researchers have extensively explored physical fusion modeling [11]. For example, to meet the real-time monitoring needs of long-distance 10 kV cables, Huang and Niu [12] proposed a modular digital twin modeling method based on real-time temperature field inversion, utilizing modular reduced-order models to effectively evaluate transient temperature fields and contact resistance, achieving a good balance between model accuracy and real-time performance. Wang and Liu [13] constructed a transient temperature model for cable insulation layers, analyzed the evolution patterns of radial temperature distribution under different loads using a finite element simulation system, and proposed a temperature rise rate of 10 °C/h as an overheat warning threshold, providing a theoretical basis for the warning system. Additionally, Kim et al. [14] investigated the impact of ambient air and soil temperature on the heat dissipation performance of underground cables and verified that novel buffer materials improve heat dissipation efficiency, providing a reference for optimizing model boundary conditions.
In recent years, the introduction of deep learning technology has provided new research directions for cable temperature prediction, particularly demonstrating significant advantages in handling multi-factor coupling and high-dimensional time-series features [15,16]. Xu et al. [17] proposed a short-term temperature prediction method that integrates temporal convolutional networks (TCN) with backpropagation (BP) neural networks, utilizing TCN’s dilated causal convolution to extract long-term time-series dependencies. Experimental results showed that the mean absolute error and mean squared error of the model were reduced by 13.3% and 12.1%, respectively, compared to those of traditional models. Zhao et al. [4] constructed a bidirectional long short-term memory network model for cable temperature prediction in coal mining electrical equipment. After training on electromagnetic–thermal coupled simulation data, the model’s coefficient of determination reached 0.9907, with an inference time of 45.44 ms and a maximum prediction error not exceeding 2.28%. Han et al. [18] combined random forest feature importance scores with Gaussian process regression to propose an RF-GPR temperature prediction model, achieving an approximately 1500-fold computational efficiency improvement over the finite element method in digital twin scenarios, with a prediction accuracy maintained at 0.9911. In terms of model structure optimization, Yuan et al. [19] employed the firefly algorithm to optimize the BP neural network, constructing a 110 kV high-voltage cable joint hot spot temperature inversion model. After training with mixed orthogonal design samples, the model’s correlation coefficient reached 0.99. Bao et al. [20] designed a DC-CNN-PE-SSA-Informer hybrid model, combining dilated convolution to extract local features, capturing long-term dependencies through an improved position encoding Informer module, and optimizing parameters with the gray wolf optimizer. The model demonstrated superior prediction accuracy over traditional models in simulations. However, it still fundamentally relies on the serial stacking of temporal modules to handle complex spatiotemporal correlations and does not model temporal dependencies and spatial correlations separately, which limits its efficiency and accuracy in parsing multidimensional features.
To address these challenges, this paper proposes a novel multi-scale spatiotemporal network model (MSST-Net). The method is designed around three core issues. First, multi-scale convolutional modules integrate local features with different receptive fields to enhance the fine-grained capture of local patterns in temperature sequences. Second, a spatiotemporal dual-path attention mechanism decouples and collaboratively models temporal evolution patterns and spatial correlation characteristics, thereby overcoming the difficulties in long-range dependency modeling. Finally, relative position encoding is introduced to enhance sensitivity to temporal structures, and the Sparrow Search Algorithm (SSA) is integrated to automate the global optimization of hyperparameters, reducing parameter tuning costs and improving the model’s generalization ability.
2. Materials and Methods
2.1. Experimental Data
This study focuses on the KN1 and KN2 sections in the central-northern area of Guangzhou’s utility tunnel (Figure 1). The total length of this comprehensive utility tunnel is 1.619 km, including a dedicated power channel section (0.547 km) and a parallel dual-transmission section (1.072 km). It covers 13 monitoring areas in the planned core functional zone and is crucial for ensuring regional power supply.
Figure 2 shows the temperature data collected over two years, from 1 January 2023 to 31 December 2024, at a sampling interval of 60 min; each data point in the figure represents the daily average temperature. As Figure 2 shows, the temperature data exhibit distinct periodic patterns that repeat over time and are driven primarily by seasonal factors, which makes them suitable for time series forecasting.
The cable sheath temperature data were collected using a distributed optical fiber sensing (DOFS) system deployed along the KN1 and KN2 cable segments. The system operates on wavelength division multiplexing technology, offering continuous monitoring with a spatial resolution of 1 m and a temporal sampling interval of 60 min. It has a measurement range of −20 °C to 80 °C and a nominal accuracy of ±0.5 °C. This real-time monitoring is essential for identifying potential thermal anomalies and ensuring the safe operation of the medium-voltage cables within the urban distribution network.
2.2. Data Preprocessing
Raw sensor data is susceptible to environmental interference or equipment fluctuations, often resulting in outliers and missing values. To address this, preprocessing is required, with the workflow shown in Figure 3.
First, the Isolation Forest algorithm is used to remove outliers. The method isolates data points by constructing multiple random trees; points that are separated from the rest after only a few splits are flagged as anomalies. This detects anomalies automatically, reduces manual intervention, and improves data reliability.
To address the dimensional differences among multi-source monitoring parameters such as temperature, current, and voltage, Min−Max normalization is performed to unify all parameters within the [0, 1] interval, thereby eliminating model weight imbalance caused by numerical magnitude differences and enhancing prediction stability. Based on 74,160 sets of normalized data, a 3:1 ratio is used to divide them into a training set (55,620 sets) and a test set (18,540 sets), which are used for model training and generalization capability validation, respectively.
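The normalization and split described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' pipeline: the function names, the sample readings, and the choice to split in the original sample order are assumptions, since the text does not state how samples were assigned to each set.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(column: Sequence[float]) -> List[float]:
    """Scale one monitoring parameter into the [0, 1] interval."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def split_3_to_1(samples: Sequence) -> Tuple[list, list]:
    """Split samples 3:1, keeping their original order (an assumption)."""
    cut = len(samples) * 3 // 4
    return list(samples[:cut]), list(samples[cut:])


# Hypothetical current-load readings in amperes.
current_load = [120.0, 180.0, 150.0, 240.0]
scaled = min_max_normalize(current_load)

# The sample count reported in the text reproduces the stated set sizes.
train, test = split_3_to_1(range(74160))
print(scaled, len(train), len(test))
```

Applying `min_max_normalize` to each parameter separately is what removes the magnitude differences between, for example, current in amperes and voltage in kilovolts.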
To optimize the input features, the correlation between cable status and environmental factors is analyzed. Cable temperature is affected not only by the cable’s own current and voltage but also by parameters such as ambient temperature, humidity, oxygen concentration, and vibration amplitude in the utility tunnel. The Spearman correlation coefficient is used to evaluate the monotonic relationship between cable temperature and these factors. The parameters for the Spearman correlation analysis were collected by dedicated monitoring equipment in the KN1 and KN2 sections at a sampling interval of 60 min, consistent with the temperature data. The key details are as follows: (1) current load: high-precision current transformers, accuracy class 0.2 S, uncertainty ±0.2 A; (2) voltage: capacitive voltage dividers, accuracy class 0.2, uncertainty ±0.02 kV; (3) ambient temperature: PT100 sensors, accuracy ±0.3 °C; (4) ambient humidity: capacitive humidity sensors, accuracy ±2% RH; (5) vibration amplitude: piezoelectric acceleration sensors, uncertainty ±0.05 g. All equipment is calibrated quarterly to ensure data reliability. The Spearman correlation coefficient lies in the interval [−1, 1]; the closer its absolute value is to 1, the stronger the monotonic correlation between the variables. The relationship between the absolute value of the correlation coefficient and the degree of correlation is shown in Table 1.
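As an illustration of how the Spearman coefficient is obtained, the sketch below ranks both series (averaging the ranks of tied values) and then takes the Pearson correlation of the ranks. The function names and the sample readings are hypothetical; in practice a library routine would normally be used instead.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical readings: temperature rises monotonically but nonlinearly with load.
load = [100.0, 150.0, 200.0, 250.0, 300.0]
temp = [30.0, 31.0, 35.0, 44.0, 60.0]
rho = spearman(load, temp)
print(rho)
```

Because only the ranks matter, a nonlinear but monotonic load-temperature relationship still yields a coefficient of magnitude 1, which is why the measure suits this screening step. The sketch does not guard against a constant input series, for which the coefficient is undefined.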
The Spearman correlation coefficients between cable temperature and current load, voltage, ambient temperature, ambient humidity, vibration amplitude, zone position, row number, and wire number were calculated, with the results shown in Figure 4. Current load exhibited a very strong correlation with cable temperature; voltage, ambient temperature, and ambient humidity showed strong correlations; vibration amplitude showed a moderate correlation; and zone position, row number, and wire number showed weak correlations. Therefore, five factors (current load, voltage, ambient temperature, ambient humidity, and vibration amplitude) were selected as the final model inputs to reduce noise interference and improve prediction efficiency.
2.3. MSST-Net Model Construction
This paper proposes the MSST-Net model to address key challenges in cable temperature prediction. The model is specifically designed for the three-dimensional characteristics of power cable monitoring data, with historical input sequences including current load, voltage, ambient temperature, ambient humidity, and vibration amplitude. Among these, sudden fluctuations in current load (such as short-circuit events) can cause second-level temperature spikes, while diurnal variations in ambient temperature exhibit hourly-level gradual trends. Additionally, the thermal conduction network formed by physical connections between multiple measurement points increases spatial dependency.
MSST-Net targets three major challenges in cable temperature prediction: insufficient capture of local features, difficulty in modeling long-range dependencies, and the high cost of hyperparameter tuning. A multi-scale convolutional module fuses local features with different receptive fields, compensating for the limitations of traditional Transformers in capturing local patterns. A dual-path spatiotemporal attention mechanism simultaneously models the temporal evolution patterns of temperature sequences and their spatial correlation characteristics. Relative position encoding enhances the model’s sensitivity to temporal positions, overcoming the generalization bottleneck of absolute position encoding. The Sparrow Search Algorithm automates hyperparameter optimization, reducing tuning costs and improving model generalization. The model architecture is illustrated in Figure 5.
2.3.1. Multi-Scale Convolution Module
The dynamic evolution of cable temperature is influenced by the coupling of multi-scale physical processes: short-circuit faults or sudden load increases can trigger instantaneous mutations at the second scale, periodic line switching operations cause temperature fluctuations at the minute scale, while environmental factors form trends at the hour scale. Traditional single-scale convolutional models struggle to capture such cross-scale temporal features [21]. This paper proposes a parallel dilated causal convolutional module (DC-CNN), shown in Figure 6, which models these processes through three differentiated branches: the 1 × 3 convolutional kernel focuses on instantaneous temperature rises caused by minute-scale current mutations, the 1 × 5 convolutional kernel captures gradual temperature changes resulting from hour-scale load fluctuations, and the 1 × 7 convolutional kernel perceives long-term trends from daily periodic environmental temperature variations. Each convolutional branch is followed by a ReLU activation function to enhance the model’s fitting capability for nonlinear temperature evolution patterns. The features extracted at the different scales are concatenated and then passed through a batch normalization layer to eliminate distribution differences in the sensor data, yielding unified multi-scale features.
To meet real-time prediction requirements, the convolution operation is constrained by a masking matrix to access only historical data, ensuring that the cable temperature prediction value yₜ at time t depends solely on historical detection data from the input sequence [t − k, t], as shown in Equation (1).
where $d$ is the dilation rate; $x_{t-i\cdot d}$ represents the input at step $i\cdot d$ before time $t$; $M$ is a lower triangular mask matrix, with $M_{t,i} = 0$ when $i\cdot d > t$; and $\odot$ is the Hadamard product.
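To make the masking idea concrete, the following sketch implements a one-dimensional dilated causal convolution in plain Python and runs three kernel sizes (3, 5, and 7) in parallel, each followed by ReLU. It is a simplified illustration under stated assumptions: the averaging weights stand in for learned kernels, the same dilation rate is used in every branch, batch normalization is omitted, and all function names are hypothetical.

```python
from typing import List, Sequence


def causal_dilated_conv(x: Sequence[float], weights: Sequence[float], d: int) -> List[float]:
    """y[t] = sum_i weights[i] * x[t - i*d], skipping taps where i*d > t."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(weights):
            if i * d > t:
                # Mask entry is 0: this tap would reach before the sequence start.
                continue
            acc += w * x[t - i * d]
        out.append(acc)
    return out


def relu(values: Sequence[float]) -> List[float]:
    return [max(0.0, v) for v in values]


def multi_scale_features(x: Sequence[float], d: int = 1) -> List[List[float]]:
    """Run kernel sizes 3, 5 and 7 in parallel and concatenate per time step."""
    branches = []
    for k in (3, 5, 7):
        weights = [1.0 / k] * k  # placeholder averaging weights, not learned ones
        branches.append(relu(causal_dilated_conv(x, weights, d)))
    # One feature vector [f3, f5, f7] per time step.
    return [list(step) for step in zip(*branches)]


# Causality check: changing a later sample must not change earlier outputs.
series = [20.0, 21.0, 23.0, 22.0, 25.0, 30.0, 28.0, 27.0]
features = multi_scale_features(series)
perturbed = list(series)
perturbed[5] = 99.0
causal_ok = features[:5] == multi_scale_features(perturbed)[:5]
print(causal_ok)
```

The check at the end perturbs one sample and confirms that every output before that index is unchanged, which is the property the mask is meant to guarantee.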
2.3.2. Spatio-Temporal Transformer
To effectively model the spatial heat conduction relationships between measurement points and the long-term temporal dependencies among variables, this paper proposes a spatiotemporal decoupled Transformer architecture, which first independently models temporal dynamics, then decouples spatial correlations, avoiding the parameter redundancy and overfitting risks associated with traditional spatiotemporal joint attention mechanisms.
As shown in Figure 7, the temporal attention branch calculates dependencies among time steps within columns to capture the temporal evolution characteristics of cable temperature, as shown in Equation (2). The spatial attention branch calculates the correlations among multiple monitoring points, as shown in Equation (3). A relative position encoding matrix is introduced in the spatiotemporal attention layer to enhance the model’s perception of temporal positions. The obtained dual-branch spatiotemporal features $H_t$ and $H_s$ are output after weighted fusion, as shown in Equation (4). This design achieves precise thermal dynamic modeling through a physically guided dual-path architecture.
where $d_k = d/k$, $d$ is the dimension, and $k$ is the number of attention heads; $\sqrt{d_k}$ is used for normalization to prevent vanishing gradients; $R$ is the relative position encoding matrix.
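The dual-path idea can be sketched with plain scaled dot-product attention applied once along the time axis and once along the monitoring-point axis, followed by a weighted sum. This is a rough illustration rather than the paper's layer: queries, keys, and values are the raw inputs (no learned projections), a single head is used, the relative position bias is a hypothetical distance penalty added only on the temporal path, and the fusion weight `alpha` is an assumed fixed scalar.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(tokens: List[Vector], bias: List[Vector]) -> List[Vector]:
    """Scaled dot-product self-attention with an additive score bias."""
    dim = len(tokens[0])
    out = []
    for i, q in enumerate(tokens):
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) + bias[i][j]
            for j, k in enumerate(tokens)
        ]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens)) for c in range(dim)])
    return out


def dual_path(x: List[List[Vector]], alpha: float = 0.5) -> List[List[Vector]]:
    """x[t][n] is the feature vector of monitoring point n at time step t."""
    steps, points = len(x), len(x[0])
    # Hypothetical relative position bias: nearer time steps score higher.
    rel = [[-0.1 * abs(i - j) for j in range(steps)] for i in range(steps)]
    no_bias = [[0.0] * points for _ in range(points)]

    # Temporal path: attend across time steps, separately for each point.
    h_t = [[None] * points for _ in range(steps)]
    for n in range(points):
        attended = attend([x[t][n] for t in range(steps)], rel)
        for t in range(steps):
            h_t[t][n] = attended[t]

    # Spatial path: attend across points, separately for each time step.
    h_s = [attend(x[t], no_bias) for t in range(steps)]

    # Weighted fusion of the two branches.
    return [
        [
            [alpha * a + (1 - alpha) * b for a, b in zip(h_t[t][n], h_s[t][n])]
            for n in range(points)
        ]
        for t in range(steps)
    ]


# Toy input: 3 time steps, 2 monitoring points, 2 features each.
x = [
    [[0.2, 0.1], [0.3, 0.0]],
    [[0.4, 0.2], [0.5, 0.1]],
    [[0.9, 0.3], [0.7, 0.2]],
]
fused = dual_path(x)
print(fused[0][0])
```

Keeping the two paths separate means each attention map is only steps × steps or points × points, rather than one joint map over every (time, point) pair, which is the parameter and compute saving that the decoupled design relies on.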
Considering that sudden load current changes may cause spikes in cable temperature, using Mean Squared Error (MSE) would excessively penalize large errors, leading to an overly conservative model. Therefore, this paper adopts Mean Absolute Error (MAE) as the loss function, as shown in Equation (5). MAE is more robust to outliers, helping the model maintain stable prediction performance under sudden operating conditions, which better aligns with the actual operating characteristics of cable temperature.
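$$L_{\mathrm{MAE}} = \frac{1}{N\,T} \sum_{i=1}^{N} \sum_{j=1}^{T} \left| y_{i,j} - \hat{y}_{i,j} \right| \qquad (5)$$

where $N$ is the number of cable monitoring points; $T$ is the number of future time steps to be predicted; $y_{i,j}$ is the true temperature value of the $i$-th monitoring point at the $j$-th time step; and $\hat{y}_{i,j}$ is the predicted temperature value of the $i$-th monitoring point at the $j$-th time step.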
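The difference between the two losses is easy to see on a small example. The sketch below computes MAE and MSE over N monitoring points and T future steps for hypothetical values in which a single load spike is missed by 8 °C; the function names and numbers are illustrative only.

```python
from typing import Sequence

Grid = Sequence[Sequence[float]]  # indexed as [monitoring point][time step]


def mae_loss(y_true: Grid, y_pred: Grid) -> float:
    """Mean absolute error over N monitoring points and T future steps."""
    n, t = len(y_true), len(y_true[0])
    total = sum(
        abs(a - b)
        for row_true, row_pred in zip(y_true, y_pred)
        for a, b in zip(row_true, row_pred)
    )
    return total / (n * t)


def mse_loss(y_true: Grid, y_pred: Grid) -> float:
    """Mean squared error over the same grid, for comparison."""
    n, t = len(y_true), len(y_true[0])
    total = sum(
        (a - b) ** 2
        for row_true, row_pred in zip(y_true, y_pred)
        for a, b in zip(row_true, row_pred)
    )
    return total / (n * t)


# Hypothetical temperatures (°C): one spike at the last step of the first point is missed.
y_true = [[30.0, 31.0, 40.0], [29.0, 30.0, 31.0]]
y_pred = [[30.0, 31.0, 32.0], [29.0, 30.0, 31.0]]
mae = mae_loss(y_true, y_pred)
mse = mse_loss(y_true, y_pred)
print(mae, mse)
```

The single 8 °C miss contributes 8 to the MAE sum but 64 to the MSE sum, so under MSE the spike dominates the gradient far more strongly than under MAE.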
2.3.3. Joint Hyperparameter Optimization Based on SSA
The MSST-Net model contains multiple hyperparameters, such as combinations of convolutional kernel sizes and dilation rates, learning rates, hidden dimensions, etc. These parameters exhibit strong coupling relationships. For example, the structural configuration of the multi-scale convolutional module directly affects the residual dynamic characteristics that the subsequent Transformer module needs to model, making traditional grid search or random search inefficient for optimization. Therefore, this paper adopts the Sparrow Search Algorithm (SSA) for global joint optimization of hyperparameters.
Figure 8 shows the automatic hyperparameter optimization process of the cable temperature prediction model based on SSA [22]. In the initialization phase, a set of hyperparameter combinations (such as CNN architecture, learning rate, and hidden dimensions) is randomly generated to form the initial “population”. Individuals in the population are divided into discoverers, followers, and sentinels, and the hyperparameter combinations are updated according to the rules of each role. After each update, the model’s fitness under the current hyperparameter configuration is evaluated on the validation set. When the early stopping condition is satisfied (the maximum iteration count of 200 is reached or the error no longer decreases significantly), the loop terminates and the current optimal hyperparameter combination is output, including the optimal convolution combination, learning rate, and hidden dimensions. Otherwise, iterative optimization continues.
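The loop described above can be sketched as follows. This is a deliberately simplified, SSA-style search and not the published update equations: the role shares (20% discoverers, 10% sentinels), the Gaussian step sizes, the greedy replacement, the patience value, and the toy fitness function standing in for a validation run are all assumptions made for illustration.

```python
import math
import random
from typing import Callable, List, Tuple

Bounds = List[Tuple[float, float]]


def clip(position: List[float], bounds: Bounds) -> List[float]:
    """Keep every hyperparameter inside its search interval."""
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(position, bounds)]


def sparrow_search(
    fitness: Callable[[List[float]], float],
    bounds: Bounds,
    pop_size: int = 20,
    max_iter: int = 200,
    patience: int = 20,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimise `fitness` with simplified discoverer/follower/sentinel updates."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    n_disc = max(1, pop_size // 5)   # assumed share of discoverers (20%)
    n_sent = max(1, pop_size // 10)  # assumed share of sentinels (10%)
    best, best_fit, stale = list(pop[0]), math.inf, 0

    for _ in range(max_iter):
        pop.sort(key=fitness)  # lowest validation error first
        if fitness(pop[0]) < best_fit - 1e-12:
            best, best_fit, stale = list(pop[0]), fitness(pop[0]), 0
        else:
            stale += 1
            if stale >= patience:  # error no longer decreases: stop early
                break
        leader = pop[0]

        candidates = []
        for rank, pos in enumerate(pop):
            if rank < n_disc:
                # Discoverers explore randomly around their own position.
                new = [v + rng.gauss(0.0, 0.1) * (hi - lo) for v, (lo, hi) in zip(pos, bounds)]
            else:
                # Followers move a random fraction of the way toward the leader.
                step = rng.random()
                new = [v + step * (b - v) for v, b in zip(pos, leader)]
            candidates.append(clip(new, bounds))
        # Sentinels: a few random individuals relocate close to the leader.
        for idx in rng.sample(range(1, pop_size), n_sent):
            jump = [b + rng.gauss(0.0, 0.05) * (hi - lo) for b, (lo, hi) in zip(leader, bounds)]
            candidates[idx] = clip(jump, bounds)
        # Greedy selection: an individual is replaced only if its candidate is better.
        pop = [c if fitness(c) < fitness(p) else p for c, p in zip(candidates, pop)]

    final = min(pop, key=fitness)
    if fitness(final) < best_fit:
        best, best_fit = list(final), fitness(final)
    return best, best_fit


def toy_validation_error(params: List[float]) -> float:
    """Hypothetical stand-in for training the model and scoring it on validation data."""
    log_lr, hidden = params
    return (log_lr + 3.0) ** 2 + ((hidden - 128.0) / 64.0) ** 2


search_space = [(-5.0, -1.0), (16.0, 256.0)]  # log10(learning rate), hidden dimension
best_params, best_error = sparrow_search(toy_validation_error, search_space)
print(best_params, best_error)
```

In a real run each fitness call would train and validate a model, so the repeated calls here would need to be cached; the sketch recomputes them only because the toy objective is cheap.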
2.3.4. Test Environment Configuration
The experiments were implemented in Python 3.8.12 with the PyTorch 1.12 framework on an Intel Xeon E5-2690 v4 processor and an NVIDIA RTX 3080 Ti graphics card with 12 GB of VRAM, with computation accelerated by CUDA 11.4.0 and cuDNN 8.2.4. The hyperparameters optimized by SSA are shown in Table 2; the batch size was 32 and the number of training epochs was 200.
2.3.5. Evaluation Metrics
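The coefficient of determination (R²) and root mean square error (RMSE) were used as evaluation metrics for prediction accuracy [23]. The calculation formula for R² is as follows:

$$R^2 = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}$$

where SSE is the sum of squared residuals; SST is the total sum of squares.
The calculation formula of Root Mean Square Error (RMSE) is as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$

where $n$ is the sample size, and $y_i$ and $\hat{y}_i$ are the true and predicted values of the $i$-th sample.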
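Both metrics follow directly from their definitions, as the short sketch below shows. The function names and the sample temperatures are hypothetical.

```python
import math
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SSE / SST."""
    mean = sum(y_true) / len(y_true)
    sse = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))  # sum of squared residuals
    sst = sum((a - mean) ** 2 for a in y_true)               # total sum of squares
    return 1.0 - sse / sst


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error over n samples."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


# Hypothetical measured and predicted cable temperatures in °C.
measured = [30.0, 32.0, 35.0, 33.0]
predicted = [30.5, 31.5, 34.0, 33.5]
r2_value = r_squared(measured, predicted)
rmse_value = rmse(measured, predicted)
print(r2_value, rmse_value)
```

R² is dimensionless and compares the residuals with the spread of the measurements, whereas RMSE is expressed in °C, so the two are usually reported together.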
4. Discussion
This study proposes the MSST-Net model, which effectively addresses key challenges in cable temperature prediction through multi-scale feature extraction, spatiotemporal dependency modeling, and SSA-driven automated hyperparameter optimization.
The performance improvement of the model primarily stems from its targeted design. The multi-scale convolutional module successfully captures features of cable temperature variations across different time scales—such as second-level abrupt changes and hour-level gradual fluctuations—by utilizing differentiated receptive fields, thereby overcoming the limitations of traditional single-scale models. The decoupled spatiotemporal modeling strategy, which separately processes temporal dependencies and spatial correlations, significantly enhances the model’s generalization capability. Furthermore, the SSA-based automated hyperparameter optimization strategy effectively improves model efficiency, significantly reduces the cost of manual parameter tuning, and enhances engineering practicality.
4.1. Uncertainty Analysis
The comparative performance of the models is influenced by distinct sources of uncertainty beyond prediction error. (1) Data uncertainty from sensor noise and missing values disproportionately affects linear models like ARIMA. MSST-Net mitigates this through robust preprocessing (Isolation Forest) and multi-scale convolutions that filter noise. (2) Model structure uncertainty, such as STGNN’s static adjacency matrix or Transformer’s absolute positional encoding, limits adaptability to dynamic thermal processes. MSST-Net employs a decoupled spatio-temporal Transformer with relative positional encoding to address this. (3) Hyperparameter uncertainty from manual tuning can cause performance variance in models like LSTM. MSST-Net uses the Sparrow Search Algorithm for global optimization, reducing this variability and contributing to its stable performance.
4.2. Model Generalization and Robustness Validation
To assess the model’s generalization and robustness, two targeted analyses were conducted using the original dataset.
The dataset (74,160 samples) was split into four seasonal subsets. As illustrated in Table 5, MSST-Net maintained stable prediction performance across all seasonal subsets. Specifically, the coefficient of determination (R²) was no less than 0.918, the mean absolute error (MAE) ranged from 0.420 to 0.495 °C, and the root mean square error (RMSE) varied between 0.572 and 0.635 °C. The maximum fluctuation amplitude of key metrics was within 10.9%, indicating robust adaptability to seasonal variations in ambient temperature (−2–38 °C) and load peak-valley ratio (20–45%). This stability is attributed to the model’s multi-scale dilated causal convolution module and spatiotemporal decoupled Transformer architecture, which effectively capture cross-scale temporal features and independent spatiotemporal correlations, thereby confirming strong temporal generalization capability.
The 13 monitoring regions were divided into three subsets. As presented in Table 6, the maximum difference in prediction performance across the three subsets was no more than 2.5%: the coefficient of determination (R²) values were 0.938, 0.945, and 0.936, respectively; the mean absolute error (MAE) was 0.457 °C, 0.431 °C, and 0.462 °C; and the root mean square error (RMSE) was 0.612 °C, 0.589 °C, and 0.621 °C. The model’s low sensitivity to the quantity and spatial layout of monitoring points is attributed to its adaptive spatial attention mechanism, which does not rely on fixed adjacency matrices.
4.3. Limitations and Future Work
The current study has limitations that should be acknowledged. First, the performance of the proposed model depends strongly on the quality of the historical monitoring data. Second, the validation is primarily based on data from two cable sections within a single utility tunnel; therefore, its generalization to other cable types, laying environments, or climatic regions requires further testing. To address these issues, future work will expand the validation to datasets from diverse geographical locations and more challenging operating conditions (e.g., fault scenarios), explore the integration of more external environmental factors, and improve the model’s robustness in scenarios with imperfect or missing data.
5. Conclusions
For underground utility tunnel cable data, the MSST-Net model is proposed, which achieves precise prediction of cable temperature through multi-scale feature extraction, spatio-temporal dependency modeling, and SSA-driven automated hyperparameter tuning, effectively addressing issues such as inadequate local feature capture and difficulties in spatio-temporal dependency modeling. Comparative analysis with models such as ARIMA, LSTM, CNN-LSTM, STGNN, and Transformer shows that MSST-Net achieves a coefficient of determination (R²) of 0.942, mean absolute error (MAE) of 0.442 °C, and root mean square error (RMSE) of 0.596 °C on real datasets, verifying its accuracy in time series prediction. Compared with other models, MSST-Net can better adapt to cable data characteristics and exhibits superior prediction performance and generalization capability.