1. Introduction
With the large-scale integration of distributed photovoltaics (PV), the power flow in active distribution networks exhibits bidirectional and stochastic characteristics. When distributed PV generation significantly exceeds local load demand and cannot be locally consumed, the risk of reverse power flow and subsequent limit violations surges dramatically. Concurrently, the risks of voltage exceeding upper limits during periods of high PV generation and dropping below lower limits during low generation periods become prominent, severely compromising user power quality. These operational challenges directly impact utility operations by increasing dispatch complexity, threatening grid reliability through potential protective device operations, and limiting PV hosting capacity due to conservative network constraints. There is an urgent need to assess power flow and voltage violation risks in active distribution networks, providing critical references for distribution network dispatch operations and distributed PV hosting capacity evaluation.
Existing research on risk assessment for distribution networks primarily focuses on two aspects: evaluation metrics and risk probability distributions. The work [1] proposed a voltage violation risk assessment index and classified the risk into low, medium, and high levels based on the index, enabling dynamic power flow-based risk warning. The work [2] designed a hierarchical risk assessment framework based on risk value theory, incorporating voltage violation indices. The works [3,4,5] introduced indices based on utility theory, event severity, and fuzzy cost, respectively. However, most of these studies rely on single-dimensional metrics focusing solely on either security or economy, lacking a universal and standardized index system. The works [6,7] employed probabilistic power flow models based on Copula theory or Monte Carlo simulation to quantify the risk of node voltage violation. The works [8,9] further employed stochastic power flow models to address uncertainties from both supply and demand sides, highlighting the limitations of traditional analytical methods in handling high-dimensional dynamic variations. Collectively, existing approaches lack an efficient and comprehensive framework for assessing risks under multi-scale spatiotemporal uncertainties, creating a clear gap that necessitates the adoption of data-driven artificial intelligence techniques.
With the increasing penetration of distributed generation and growing load uncertainty in distribution networks, deep learning-based prediction models for active distribution networks have become a core technology for dynamic risk assessment. The works [10,11] employed Long Short-Term Memory (LSTM) models to predict power system violation probabilities and violation risk indices. The work [12] utilized Gated Recurrent Units (GRU) integrated with meteorological data to achieve dynamic prediction of distributed PV output. However, it failed to account for the impact of dynamic load-side variations on the distribution network, making it difficult to meet the demands of dynamic risk assessment. Beyond classical LSTM and GRU, the Temporal Convolutional Network (TCN) [12,13,14] offers advantages in parallel processing and capturing long-range temporal dependencies, proving effective in identifying periodic and sudden risks in distribution systems. The works [15,16] utilized TCN for load and renewable generation forecasting to support voltage assessment and flow risk analysis, though performance under imbalanced sample conditions remains a challenge. The work [17] proposed an integrated framework based on the Transformer model for predicting and assessing the risk of the spatiotemporal distribution of heavy overloads in distribution networks under highly imbalanced and nonlinear data scenarios. The work [18] used the Transformer model for deterministic point prediction of power system risk assessment indices but did not provide a method for assessing violation risks. Notably, practical compliance with national standards [19] on voltage deviation requires interval-based prediction to quantify violation probabilities, calling for more informative and actionable risk evaluation methods.
To address this, the work [20] integrated temporal deep learning models with the Lower Upper Bound Estimation (LUBE) method [21]. This approach resolves the issue of interval prediction methods neglecting temporal dependencies, extending deterministic prediction models to provide a reference for dynamic assessment of power flow and voltage violations in distribution networks. Inspired by References [18,19,20,21], this paper addresses the power flow/voltage violation risks arising from high-penetration distributed PV integration into active distribution networks by constructing reasonable risk assessment indices combined with deep learning. On this basis, by analyzing the spatiotemporal characteristics of power flow/voltage in active distribution networks and constructing a modified IEEE 33-bus system for simulation experiments, the effectiveness of the proposed method is verified. The main innovations of this paper are as follows:
- To holistically evaluate violation risks caused by high-penetration distributed PV integration, this paper proposes the maximum positive/negative voltage deviations and a composite power flow violation risk index as assessment indices.
- Considering the spatiotemporal characteristics of voltage and power flow in active distribution networks, a risk assessment model based on TCN-Transformer is proposed to further explore the correlation between the risk of voltage/power flow limit violation and related physical quantities in active distribution networks. 
- Based on the physical significance of the voltage and power flow violation risk indicators, different loss functions are constructed for the neural network model to produce point predictions of the power flow violation risk indicator and interval predictions of the voltage violation risk indicators, respectively, enabling accurate assessment of limit violation risks in active distribution networks.
The paper is structured as follows: 
Section 1 outlines the research background and significance, and reviews the current state of research on power flow and voltage violations in distribution networks. 
Section 2 analyzes the spatiotemporal characteristics of power flow and voltage in active distribution networks, providing a basis for modeling. 
Section 3 elaborates on the proposed evaluation method, including indicators, models, and procedures. 
Section 4 presents case studies to assess power flow and voltage violation risks, demonstrating the superiority of the proposed approach. 
Section 5 summarizes the research findings and suggests directions for future work.
  3. Power Flow and Voltage Violation Risk Assessment Method for Distribution Networks Based on Transformer Model
The above analysis shows that evaluating the risk of power flow and voltage violation under spatiotemporal coupling requires reasonable risk assessment indicators for distribution networks. Moreover, deep learning models can capture the spatiotemporal dependencies of generation-load fluctuations, laying the foundation for dynamic risk assessment. Therefore, this section develops a violation risk assessment method for distribution networks by proposing voltage and power flow risk indices and leveraging deep learning models.
  3.1. Risk Assessment Indicators for Power Flow and Voltage Violation
  3.1.1. Risk Assessment Indicators for Power Flow Violation
The existence of power flow violation risks is determined based on the thermal stability limit of lines. The power flow violation rate $\eta$ for the entire distribution network is represented by the ratio of the number of branches actually in violation, $N_{\mathrm{v}}$, to the total number of branches, $N_{\mathrm{b}}$, in the distribution system:
$$\eta = \frac{N_{\mathrm{v}}}{N_{\mathrm{b}}}$$
Moreover, the relationship between the risk caused by power flow violations and their severity is often nonlinear: as violations worsen, the operational risk for the distribution network tends to increase exponentially. Therefore, a power flow violation severity index is established based on a risk-preference utility function [22]. Its argument is the maximum value of the branch power flow violation, which is determined from the apparent power of each line and the maximum transmission capacity of that line.
By comprehensively considering both the extent and the severity of power flow violations, the two quantities are combined to yield the risk assessment indicator for power flow violation, denoted Risk.
By identifying and analyzing the key characteristic factors that influence power flow limit violation risks, a predicted value of the risk assessment indicator Risk for future power flow limit violations is generated. This enables the evaluation of the future power flow violation risk in the distribution system.
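The two ingredients of the power flow indicator can be illustrated with a short sketch. The violation rate follows the ratio described above. The relative overload measure `S / S_max - 1` and the exponential form of the severity are assumptions made here for illustration only, since the exact utility function of [22] is not reproduced in this text. All function names and numbers are hypothetical.

```python
import math

def violation_rate(apparent_power, capacity):
    """Share of branches whose apparent power exceeds the line capacity."""
    violated = sum(1 for s, s_max in zip(apparent_power, capacity) if s > s_max)
    return violated / len(capacity)

def max_violation(apparent_power, capacity):
    """Largest relative overload over all branches (0.0 if none is overloaded).

    ASSUMPTION: the overload of a branch is measured as S / S_max - 1.
    """
    return max(max(s / s_max - 1.0, 0.0)
               for s, s_max in zip(apparent_power, capacity))

def severity(w):
    """ASSUMPTION: exponential utility-type severity, equal to 0 when w = 0."""
    return math.exp(w) - 1.0

# Hypothetical branch loadings and capacities (MVA).
s = [3.2, 5.5, 4.0, 2.0]
s_max = [4.0, 5.0, 4.0, 2.5]

rate = violation_rate(s, s_max)   # one of four branches is overloaded
w = max_violation(s, s_max)       # worst relative overload
print(rate, round(w, 3), round(severity(w), 3))
```

The exponential shape is chosen only because it grows faster than linearly with the overload, which matches the qualitative argument in the text. How the rate and the severity are combined into Risk is left out of the sketch on purpose.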
  3.1.2. Risk Assessment Indicators for Voltage Violation Risk
The voltage violation risk is quantified by the maximum positive voltage deviation and the maximum negative voltage deviation over the nodes of the distribution network. Both deviations are computed from the voltage at each node and the rated voltage of the corresponding voltage level. All values are expressed in per unit.
By predicting the fluctuation ranges of the maximum positive and negative voltage deviations and comparing them with the upper and lower voltage bounds allowed by the national standard [19], the future risk of voltage violation in the distribution system can be evaluated.
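As a minimal sketch of these two indicators, the snippet below computes the largest and the smallest relative deviation over a set of node voltages and compares them with a symmetric limit. It assumes that the negative deviation is kept as a signed (negative) number and uses the ±0.07 pu threshold quoted later for the 10 kV case study. The function names and voltages are hypothetical.

```python
def voltage_deviations(node_voltages_pu, rated_pu=1.0):
    """Return (max positive deviation, max negative deviation) in per unit.

    ASSUMPTION: the negative deviation is reported as a signed value,
    i.e. the most negative relative deviation over all nodes.
    """
    deviations = [(u - rated_pu) / rated_pu for u in node_voltages_pu]
    return max(deviations), min(deviations)

def violates(dev_pos, dev_neg, limit=0.07):
    """Return (overvoltage?, undervoltage?) for a symmetric limit in per unit."""
    return dev_pos > limit, dev_neg < -limit

# Hypothetical node voltages in per unit.
dev_pos, dev_neg = voltage_deviations([1.02, 1.08, 0.98, 0.92])
over, under = violates(dev_pos, dev_neg)
print(round(dev_pos, 3), round(dev_neg, 3), over, under)
```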
  3.2. TCN-Transformer-Based Model for Power Flow and Voltage Violation Risk Assessment
The TCN proposed by Lea et al. [12] provides the long-term dependency modeling capability associated with RNNs, while also combining the stable gradients and high parallel computational efficiency of convolutional neural networks (CNNs). The Transformer model, introduced by Vaswani et al. [23] in 2017, abandons traditional RNN and CNN architectures and is built on a self-attention mechanism, overcoming the limitations of inefficient training and inadequate long-range dependency modeling in RNNs. With its highly parallelizable structure, the Transformer significantly improves training speed. To address the spatiotemporal characteristics of voltage and power flow in active distribution networks, this paper integrates the advantages of TCN and Transformer models to construct a hybrid architecture: the TCN module efficiently extracts multi-level temporal features through causal dilated convolutions, ensuring gradient stability and computational efficiency; the Transformer module captures global spatiotemporal dependencies via self-attention mechanisms.
The TCN module processes the input features to extract temporal patterns, which are then fed into a Transformer encoder composed of multi-head self-attention, layer normalization, and feed-forward networks. The TCN output is computed from the input feature matrix by causal dilated convolution, which is parameterized by the number of convolutional kernels, the dilation rate, and the kernel weights.
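To make the role of the dilation rate and the kernel weights concrete, the following sketch implements a single-channel causal dilated convolution in plain Python. It is a textbook formulation, not the paper's exact layer: the names `causal_dilated_conv`, `weights`, and `dilation` are illustrative, and zero padding on the left is assumed.

```python
def causal_dilated_conv(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k * dilation], with zeros before t = 0.

    Only current and past inputs contribute to y[t], which makes the
    convolution causal; the dilation widens the receptive field.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:            # implicit zero padding on the left
                acc += w * x[idx]
        out.append(acc)
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = causal_dilated_conv(x, weights=[0.5, 0.5], dilation=2)
print(y)
```

Stacking such layers with dilation rates 1, 2, 4, ... is the usual way a TCN covers a long history with few layers.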
Following the TCN module, the Transformer encoder captures global spatiotemporal dependencies via self-attention mechanisms. The Transformer's output is obtained by applying the multi-head self-attention mechanism, layer normalization, and the feed-forward network to its input, which is the output of the TCN module.
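For reference, one layer of the standard post-norm Transformer encoder of Vaswani et al. [23] can be written as below. This is the generic formulation, given here as an assumption about the structure being described. The symbols $X$ (encoder input, here the TCN output), $Z$, $Y$, $W_1$, $W_2$, $b_1$, and $b_2$ are introduced only for this illustration and may differ from the paper's notation.

```latex
\begin{aligned}
Z &= \mathrm{LayerNorm}\bigl(X + \mathrm{MultiHead}(X)\bigr),\\
Y &= \mathrm{LayerNorm}\bigl(Z + \mathrm{FFN}(Z)\bigr),\\
\mathrm{FFN}(Z) &= \max\bigl(0,\; Z W_1 + b_1\bigr)\, W_2 + b_2 .
\end{aligned}
```

The residual connections around both sublayers keep the locally extracted TCN features available to later layers while the attention term adds global context.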
The model structure is shown in Figure 5. Input features are first processed by the TCN module to extract temporal patterns and then fed into a Transformer encoder composed of multi-head self-attention, layer normalization, and feed-forward networks. This enables the synergistic integration of local feature preservation and global correlation modeling, effectively capturing the evolution patterns of voltage and power flow violation risks.
As specified by the risk assessment metrics defined in Section 3.1, the assessment of power flow and voltage violation risk in the distribution network relies on point forecasts and interval forecasts of the corresponding risk indicators, respectively. Accordingly, this work integrates the Lower Upper Bound Estimation (LUBE) method with the TCN-Transformer architecture by designing different loss functions, and uses the deep neural network to learn latent patterns in the data, providing accurate violation risk assessments.
For point forecasting of the power flow violation risk indicator, the Mean Squared Error (MSE) is adopted as the loss function:
$$L_{\mathrm{MSE}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \qquad (11)$$
where $n$ denotes the number of samples; $y_i$ and $\hat{y}_i$ represent the true and the predicted values of the $i$-th sample, respectively.
On the other hand, the loss function for interval forecasting of the voltage violation risk assessment indicators is formulated by jointly considering the Prediction Interval Coverage Probability (PICP) and the Prediction Interval Normalized Average Width (PINAW). Since both over-voltage and under-voltage risks in the distribution network must be considered, this paper configures the maximum positive and negative voltage deviations as the two outputs of the model. The corresponding loss function, Equation (12), is defined in terms of the following quantities: the two output types, namely the maximum positive and negative voltage deviations; the number of target sets; the predicted upper and lower bounds of each type of deviation interval for each sample; the penalty coefficients λ1 and λ2; S, the difference between the maximum and minimum values of the sample targets; the true value of each type of deviation for each sample; the softening coefficient; and the target Prediction Interval Nominal Coverage (PINC), which is a pre-set parameter.
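The interplay of these quantities can be sketched as follows. PICP and PINAW use their usual definitions. The combined loss is only one common LUBE-style form, assumed here for illustration: a squared penalty on the coverage shortfall (weighted by λ1) plus the normalized width (weighted by λ2), with a sigmoid-softened coverage so that the loss stays differentiable. It should not be read as the paper's Equation (12). The default parameter values mirror those reported in the case study.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def picp(y, lower, upper):
    """Fraction of true values that fall inside [lower, upper]."""
    hits = sum(1 for t, lo, up in zip(y, lower, upper) if lo <= t <= up)
    return hits / len(y)

def pinaw(lower, upper, target_range):
    """Average interval width normalized by the range S of the targets."""
    widths = sum(up - lo for lo, up in zip(lower, upper))
    return widths / (len(lower) * target_range)

def soft_picp(y, lower, upper, softening):
    """Smooth surrogate of PICP: each hard 0/1 hit becomes a sigmoid product."""
    total = sum(sigmoid(softening * (t - lo)) * sigmoid(softening * (up - t))
                for t, lo, up in zip(y, lower, upper))
    return total / len(y)

def interval_loss(y, lower, upper, pinc=0.9, lam1=50.0, lam2=0.2, softening=5.0):
    """ASSUMED loss form for ONE output: coverage penalty + width penalty."""
    target_range = max(y) - min(y)
    shortfall = max(0.0, pinc - soft_picp(y, lower, upper, softening))
    return lam1 * shortfall ** 2 + lam2 * pinaw(lower, upper, target_range)

y_true = [1.0, 2.0, 3.0]
print(picp(y_true, [0.0, 0.0, 0.0], [2.0, 2.0, 2.0]))
print(interval_loss([0.0, 1.0], [-10.0, -10.0], [10.0, 10.0]))
```

For the two-output model described above, one plausible choice is to evaluate this loss separately for the positive and the negative deviation and add the two terms.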
  3.3. TCN-Transformer-Based Assessment Framework for Power Flow and Voltage Violation Risks
Building upon the two aforementioned loss functions, this section presents a TCN-Transformer based methodology for assessing power flow and voltage violation risks in distribution networks. The specific assessment procedure is illustrated in Figure 6.
As the spatiotemporal analysis of distribution network power flow and voltage characteristics indicates, limit violation risks are influenced by generation-load side power fluctuations. Therefore, this study selects eight input features for the risk assessment model: active power of conventional generation (PG), reactive power of conventional generation (QG), active power of wind power (PW), reactive power of wind power (QW), active power of photovoltaic generation (PPV), reactive power of PV generation (QPV), active load power (PL), and reactive load power (QL). These inputs are used to compute the maximum positive/negative voltage deviations and the power flow violation risk indicator at each time step.
To eliminate scale effects among variables in the dataset, Z-Score normalization is applied:
$$x' = \frac{x - \mu}{\sigma}$$
where $x$ represents the original input value; $\mu$ denotes the mean; and $\sigma$ represents the standard deviation.
A sliding time window of fixed length is employed to construct the sample dataset. Each data sample consists of an input feature matrix over the consecutive time steps covered by the window, along with the corresponding next-time-step values of either the maximum positive/negative voltage deviations or the power flow violation risk indicator. The sample dataset is then partitioned into training, validation, and test sets in a ratio of a:b:c.
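The preprocessing steps above can be sketched with the standard library alone. The helper names are hypothetical, the toy series is invented, and the split is assumed to be chronological (no shuffling), which the text does not state explicitly.

```python
from statistics import mean, pstdev

def zscore(column):
    """Z-Score normalization of one feature column."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma for v in column]

def sliding_windows(features, targets, window):
    """Pair each block of `window` consecutive steps with the next-step target."""
    samples = []
    for k in range(len(features) - window):
        samples.append((features[k:k + window], targets[k + window]))
    return samples

def chronological_split(samples, ratios=(6, 2, 2)):
    """ASSUMPTION: split in time order; ratios correspond to a:b:c."""
    n, total = len(samples), sum(ratios)
    a = n * ratios[0] // total
    b = a + n * ratios[1] // total
    return samples[:a], samples[a:b], samples[b:]

steps = 20
load = zscore([float(t % 5) for t in range(steps)])   # toy feature column
features = [[v] for v in load]                        # one feature per step
targets = [0.01 * t for t in range(steps)]            # toy risk indicator
samples = sliding_windows(features, targets, window=4)
train, val, test = chronological_split(samples)
print(len(samples), len(train), len(val), len(test))
```

In practice the mean and standard deviation would typically be estimated on the training portion only and reused for the validation and test sets, so that no information leaks from later data.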
Subsequently, the hyperparameters of the TCN-Transformer model (such as kernel_size, tcn_channels, d_model, nhead, and dropout) are initialized. The appropriate loss function is selected based on the specific assessment task. The model is trained on the training set using the Adam optimization algorithm, and at the end of each epoch the validation set is used to evaluate performance and determine whether the current model is the best one obtained so far.
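The per-epoch model selection step can be sketched as a small bookkeeping class. The class name and its interface are hypothetical, and the patience-based stopping rule is an optional addition that corresponds to the early stopping mentioned in the voltage case study. The actual training and checkpointing code is not shown.

```python
class BestModelTracker:
    """Track the best validation loss and signal when to stop training."""

    def __init__(self, patience=20):
        self.patience = patience
        self.best_loss = float("inf")
        self.best_epoch = None
        self.bad_epochs = 0

    def update(self, epoch, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss:
            # New best model: this is where a checkpoint would be saved.
            self.best_loss, self.best_epoch, self.bad_epochs = val_loss, epoch, 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

tracker = BestModelTracker(patience=3)
stopped_at = None
for epoch, val_loss in enumerate([1.0, 0.8, 0.9, 0.85, 0.95, 0.7]):
    if tracker.update(epoch, val_loss):
        stopped_at = epoch
        break
print(tracker.best_epoch, tracker.best_loss, stopped_at)
```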
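Finally, the test set is fed into the optimal model to obtain either forecast values of the power flow violation risk indicator or prediction intervals for the maximum positive/negative voltage deviations. For the predicted values of the power flow risk indicator, a higher value indicates a greater risk of power flow violation, and vice versa. Additionally, the performance of the point prediction model for power flow risk is evaluated using the Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Coefficient of Determination ($R^2$), as defined below:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
where $n$ is the number of samples, $y_i$ and $\hat{y}_i$ are the true and predicted values of the $i$-th sample, and $\bar{y}$ is the mean of the true values.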
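A direct implementation of these three metrics, using their standard definitions, might look like the following. The function names and the sample values are illustrative only.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]          # hypothetical indicator values
y_pred = [1.1, 1.9, 3.2, 3.8]          # hypothetical predictions
print(round(mae(y_true, y_pred), 4),
      round(rmse(y_true, y_pred), 4),
      round(r2(y_true, y_pred), 4))
```

Because RMSE squares the errors before averaging, it is never smaller than MAE on the same data and reacts more strongly to a few large misses, such as an underestimated violation spike.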
For the prediction intervals of the maximum positive and negative voltage deviations, the indicators PICP and PINAW are selected as evaluation metrics to assess the performance of the interval forecasting model. Furthermore, by comparing the predicted voltage fluctuation intervals with the safe operational thresholds, the following conclusions can be drawn. In the case of the maximum positive voltage deviation, if both the upper and lower prediction bounds exceed the safe threshold, there is a very high risk of overvoltage; if the upper bound exceeds the threshold while the lower bound remains within it, a moderate overvoltage risk is indicated; and if neither bound surpasses the threshold, the overvoltage risk is considered very low. Similar conclusions can be drawn regarding the maximum negative voltage deviation.
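The three-level rule described above translates into a few comparisons. The sketch below assumes the ±0.07 pu threshold of the case study and a signed negative deviation, so the undervoltage rule is the mirror image of the overvoltage rule. The function names and the sample intervals are hypothetical.

```python
def overvoltage_risk(lower, upper, limit=0.07):
    """Classify overvoltage risk from the interval of the max positive deviation."""
    if lower > limit:        # both bounds exceed the threshold
        return "very high"
    if upper > limit:        # only the upper bound exceeds it
        return "moderate"
    return "very low"

def undervoltage_risk(lower, upper, limit=-0.07):
    """Mirror rule for the max negative deviation (ASSUMED to be signed)."""
    if upper < limit:        # both bounds lie below the threshold
        return "very high"
    if lower < limit:        # only the lower bound lies below it
        return "moderate"
    return "very low"

print(overvoltage_risk(0.075, 0.09))     # interval entirely above +0.07
print(overvoltage_risk(0.05, 0.08))      # interval straddles +0.07
print(undervoltage_risk(-0.06, -0.02))   # interval entirely inside the band
```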
  4. Case Study
To validate the effectiveness of the proposed risk assessment method for limit violations, case studies on both voltage and power flow violation risks in a distribution network are conducted. All experiments in this study are implemented based on the PyTorch (version 1.10.2+cu102) framework. The experimental environment consists of an Ubuntu 18.04.6 LTS operating system, a Montage Jintide® C5218R CPU (Montage Technology Co., Ltd., Shanghai, China), and Python 3.6.9. Additionally, due to the extensive matrix operations involved, the NumPy (1.19.5) and Pandas (1.1.5) libraries are used in Python, while the Joblib (1.1.0) library is employed to save and load scalers and models, and Plotly (5.4.0) is used for visualization.
  4.1. Dataset Generation
Based on a modified IEEE 33-node distribution network model, an active distribution network simulation model is developed in MATLAB (2023b), as illustrated in Figure 7. To incorporate temporal characteristics, multiple operational scenarios are constructed by combining various conditions, including no photovoltaic (PV) generation, low PV generation, high PV generation, light load, and heavy load. Typical daily load profiles and PV output curves for different seasons are considered. Annual time-series data of conventional generation, centralized wind power, distributed PV output, and load power are used as inputs, with a time step of 15 min. The corresponding outputs include the future maximum positive/negative voltage deviations and the comprehensive power flow violation risk indicator. Meanwhile, considering the spatial characteristics, the distributed energy resources are connected to 4 different feeders to simulate various integration locations, such as the beginning, the end, and dispersed points along the feeders.
The parameters of the simulation model are summarized in Table 1. Based on the simulated data, a dataset comprising 35,040 samples, each consisting of an input feature matrix and an output label matrix, is constructed. This dataset comprehensively captures the operational characteristics of active distribution networks, providing substantial data support for training and validating the proposed risk assessment model for distribution network limit violations.
  4.2. Case Study on Power Flow Violation Risk Assessment
To validate the effectiveness and superiority of the proposed model for power flow violation risk assessment, comparative experiments are conducted using LSTM, Transformer, and TCN-Transformer models with the loss function defined in Equation (11). The dataset generated in Section 4.1 is processed using a sliding time window with a length of 16 to construct the sample dataset, which is then divided into training, validation, and test sets in a ratio of 6:2:2. This results in 21,015 training samples, 7006 validation samples, and 7006 test samples. The input consists of 128 feature values formed by the 8 input features from the previous 16 time steps, and the output is the power flow violation risk indicator for the next time step.
The parameter settings of the proposed model are summarized in Table 2. For fair comparison, the corresponding hyperparameters in the Transformer and TCN-Transformer models are kept consistent with those of the proposed model. The LSTM model is configured with hidden_size = 64 and num_layers = 2, while all other parameters remain the same as in the proposed model. These models are trained for 200 epochs with a learning rate of 0.001.
For a randomly selected day in the test set, Figure 8 shows the prediction curves of the power flow violation risk indicator for the different models. During normal operation, the curves remain around zero, indicating minimal risk. Between sampling points 15 and 20, a sharp increase occurs in all curves, reflecting serious violations. As shown in the enlarged view, the proposed TCN-Transformer model aligns more closely with the ground truth than the other three models. Furthermore, between sampling points 50 and 55, the proposed model exhibits predictions closer to zero and smoother behavior under normal conditions.
To comprehensively evaluate the performance of each model, Table 3 presents a comparison of the prediction performance metrics of all models on the entire test set. The results demonstrate that the TCN-Transformer model achieves the best performance in terms of both prediction error and model goodness-of-fit. Specifically, it attains values of 0.651 for RMSE, 0.059 for MAE, and 0.761 for $R^2$ across the three key metrics. These results confirm the superiority of the proposed TCN-Transformer model in assessing power flow violation risks.
In summary, the proposed model achieves the best prediction performance among the compared models. The power flow violation risk indicator predicted by the TCN-Transformer model quantitatively reflects the security level. A value of 0 indicates no risk of power flow violation in the distribution network, while a larger value corresponds to a higher risk of power flow violation.
  4.3. Case Study on Voltage Violation Risk Assessment
Based on the aforementioned models, this subsection conducts comparative experiments for voltage violation risk assessment with the loss function specified in Equation (12). The parameter configurations of all models remain consistent with those in Section 4.2. The input continues to consist of 128 feature values, while the outputs are the maximum positive voltage deviation and the maximum negative voltage deviation at the next time step. The learning rate is set to 0.001 and training runs for up to 100 epochs. An early stopping mechanism is incorporated with a patience value of 20.
The loss function achieves a balanced optimization between coverage and interval width through four core parameters: the softening coefficient, set to 5; the target PINC of the prediction interval, set to 0.9; and the penalty coefficients λ1 = 50 and λ2 = 0.2. As a result, the loss function not only strictly constrains the lower bound of the prediction interval coverage but also minimizes the interval width, accomplishing the prediction objective of high coverage and narrow width.
Figure 9 presents the prediction results of each model for 24 consecutive sampling points. In the figures, the solid blue/red lines represent the true values of the maximum positive/negative voltage deviations, while the dashed blue/red lines indicate the upper and lower prediction bounds for the maximum positive/negative voltage deviations. The safe threshold for voltage deviation in the 10 kV distribution network is set at ±0.07 pu, denoted by gray dashed lines.
As observed in Figure 9d, the Transformer-TCN model fails to cover the true values at sampling point 16, whereas the prediction bounds of the proposed model fully encapsulate the true values in Figure 9a. The figures also show an extremely high risk of undervoltage violation at sampling point 20. However, the upper prediction bound of the LSTM model for the maximum negative voltage deviation remains within the safe voltage range in Figure 9b, which reduces its accuracy in assessing the undervoltage risk. Additionally, the proposed model exhibits narrower prediction intervals overall compared with the Transformer model in Figure 9c and the TCN model in Figure 9e. Therefore, as shown in the figures, the proposed TCN-Transformer model has certain advantages in interval coverage and interval width.
By analyzing the output prediction intervals at each sampling point, i.e., determining whether the upper and lower bounds exceed the safe voltage deviation threshold, the risk of overvoltage or undervoltage violations can be effectively assessed. As shown in Figure 9a, an extremely high risk of undervoltage violation is observed at sampling point 10, accompanied by a moderate risk of overvoltage violation. At sampling point 13, a moderate overvoltage violation risk is identified, while the risk of undervoltage violation remains very low. At sampling point 22, both overvoltage and undervoltage violation risks are assessed to be very low.
In order to comprehensively evaluate the interval prediction performance, Table 4 presents a comparison of the performance indicators of all models on the entire test set. Figure 10 visualizes the comparison results of the above performance indicators. The results indicate that the proposed model and the comparative models have all achieved the specified PINC. Meanwhile, the proposed model yields $\mathrm{PINAW}_1$ = 0.053 and $\mathrm{PINAW}_2$ = 0.069, which represent the narrowest interval widths among all outputs. This indicates that the proposed interval prediction model provides more precise prediction intervals while meeting the accuracy requirements for voltage violation risk assessment.
  5. Conclusions
To address the challenge of risk assessment for limit violations in active distribution networks, this study has taken a two-pronged approach. On one hand, a comprehensive evaluation index system has been proposed, incorporating the maximum positive/negative voltage deviations and a composite power flow violation risk indicator, based on an in-depth analysis of the spatiotemporal correlations inherent in power flow distribution and voltage characteristics. On the other hand, a TCN-Transformer based risk assessment model has been developed to effectively capture the influence of various features on distribution network violation risks. Furthermore, by modifying the network loss function, the model achieves accurate point predictions for the power flow violation indicator and interval forecasts for extreme voltage deviations.
Case study results demonstrate the significant advantages of the proposed method in assessment accuracy. It retains the interpretability of physical mechanisms characteristic of traditional methods, thereby aligning with the physical principles governing distribution network operation. Simultaneously, it fully leverages the feature extraction and self-learning capabilities of the TCN-Transformer deep neural architecture to effectively capture spatiotemporally correlated characteristics and improve assessment precision. This work provides a new pathway for the practical application of artificial intelligence technology in the field of distribution network violation risk assessment.
However, enhancing the model’s generalization capability across distribution systems with varying topological structures and developing adaptive optimization strategies remain important challenges. Furthermore, as the frequency support capability of the main grid declines in the future, frequency violation risk in distribution networks will become an important research direction. Future research will focus on improving the robustness of the model and exploring frequency stability issues so as to offer more accurate technical support for risk early-warning and control decision-making in active distribution networks.