Utilising Smart-Meter Harmonic Data for Low-Voltage Network Topology Identification

Othman, Ali; Watson, Neville R.; Lapthorn, Andrew; Mukhedkar, Radnya

doi:10.3390/en18133333


by Ali Othman ¹,*, Neville R. Watson ¹,*, Andrew Lapthorn ¹ and Radnya Mukhedkar ²

¹ Electrical & Computer Engineering Department, University of Canterbury, Christchurch 8041, New Zealand
² EPECentre, University of Canterbury, Christchurch 8041, New Zealand
* Authors to whom correspondence should be addressed.

Energies 2025, 18(13), 3333; https://doi.org/10.3390/en18133333

Submission received: 14 May 2025 / Revised: 12 June 2025 / Accepted: 12 June 2025 / Published: 25 June 2025


Abstract

Identifying the topology of low-voltage (LV) networks is becoming increasingly important. Having precise and accurate topology information is crucial for future network operations and network modelling. Topology identification approaches based on smart-meter data typically rely on Root Mean Square (RMS) voltage, current, and power measurements, which are limited in accuracy due to factors such as time resolution, measurement intervals, and instrument errors. This paper presents a novel methodology for identifying distribution network topologies through the utilisation of smart-meter harmonic data. The methodology introduces, for the first time, the application of voltage Total Harmonic Distortion (THD) and individual harmonic components ($V_{2}$–$V_{20}$) as topology identifiers. The proposed approach leverages the unique properties of harmonic distortion to improve the accuracy of topology identification. This paper first analyses the influential factors affecting topology identification, establishing that harmonic distortion propagation patterns offer superior discrimination compared to RMS voltage. Through systematic investigation, the findings demonstrate the potential of harmonic-based analysis as a more effective alternative for topology identification in modern power distribution systems.

Keywords: low-voltage networks; network topology identification; smart-meter; harmonics

1. Introduction

The distribution network constitutes a crucial component of the power grid, ensuring stable and efficient power delivery. Identifying the topology of LV networks is a prerequisite for developing accurate simulation models, which are vital for a cleaner, more efficient future grid. Accurate topology identification leads to precise models that extend the capabilities of existing networks, guide utility investments, and inspire technological developments to meet increasing electricity demands.

Smart-meter data can provide valuable insights into the network by capturing voltage, current, power, and harmonic information. Such data has been utilised in various studies for load profiling [1], demand response programmes [2], grid optimisation [3], fault detection and diagnostics [4], voltage prediction [5], and power quality monitoring [6]. However, relatively few studies have explored the use of harmonics to provide insights into the network, such as mapping the LV network, identifying impedance, and analysing faults such as high impedance connections [7].

Research has been conducted regarding the deployment of smart-meters and how their data can be used in modelling and operating the electrical distribution system. In [8], the authors investigated the data provided by smart-meters to support plans for setting new distribution network configurations; however, the heuristic approach used cannot guarantee globally optimal solutions. In contrast, other studies explored issues related to the availability and synchronisation of smart-meter data [9,10,11]. In [9], the researchers achieved network observability by extracting a subset of the total node data, although their analysis assumed idealised conditions regarding measurement errors and communication synchronisation. Other researchers have discussed how to avoid pulling massive volumes of real-time smart-meter data by developing an intelligent algorithm that collects data efficiently for outage detection and mapping [12].

Electrical utilities are increasingly interested in finding alternative solutions to determine distribution network topologies using smart-meter data. Software-based methods that rely on available smart-meter data can be implemented relatively quickly and at a low cost. Some research has shown promising results in modelling the network using these methods, but they provide only an approximation of the real network topology. In [13], a methodology was developed relying on principal component analysis and its graph-theoretic interpretation to determine the distribution system topology. However, the authors assumed that all energy readings from the smart-meters are available, which is difficult to achieve in actual networks. In [14], Pearson correlation was used to cluster customers’ voltage profiles in the same phases. This research did not consider the cases of asynchronous data and the effect of the impedance of the customers’ connections.

Recursive grouping algorithms have emerged as a powerful tool for topology estimation. Park et al. [15] presented a method that can recover the full network structure using only smart-meter data from leaf nodes. The method analyses injection statistics at end users to infer the topology. This approach was extended by Pengwah et al. [16] to handle scenarios with partial observability. In earlier research, Pengwah et al. [17] utilised smart-meter data to quantify voltage sensitivity coefficients in response to fluctuations in load currents.

In [18], a method combining t-distributed stochastic neighbour embedding (t-SNE) with density-based spatial clustering of applications with noise (DBSCAN) was developed to identify LV distribution network topology using noisy smart-meter data. A two-stage method combining linear power flow modelling with adaptive ridge regression was developed in [19] to jointly identify distribution network topology and line parameters using smart-meter data. In [20], the authors presented a wavelet-based topology identification method utilising only energy measurements from smart-meters, demonstrating high accuracy whilst requiring minimal data. However, the method’s performance depends on customers exhibiting distinctive consumption patterns and shows accuracy degradation under low network observability or high renewable energy penetration.

A latent tree model approach was developed in [21] for identifying low-voltage distribution grid topology using only end-user smart-meter voltage data. It employs the Bayesian information criterion and expectation-maximisation algorithms to handle unmeasured intermediate nodes through a three-stage search methodology. However, the method relies on voltage correlation assumptions and requires sufficient voltage variation patterns for accurate latent node identification. The research presented in [22] describes a two-stage topology identification framework that uses a modified expectation-maximisation algorithm called split-EM for historical data analysis, followed by machine learning classifiers for real-time prediction, and is capable of handling mixed topologies without prior knowledge of topology. However, the method requires substantial historical data for effective classifier training.

A correlation-based algorithm, enhanced with Fisher’s Z-transform, was developed to infer the partial topology of LV distribution networks, as reported in [7]. This study focused solely on identifying transformer/phase mappings and did not address the topological connectivity between Installation Control Points (ICPs). In this research, the authors used voltage THD correlation instead of voltage correlation to make the algorithm more robust and resilient against noise and missing records. Their findings indicated that harmonic data provided more reliable results than voltage and energy measurements. Building upon their work, the proposed research extends the methodology by employing individual harmonic components.

Previous work on the identification of distribution network topology used the estimation of the network voltage sensitivity matrix [23], but it required synchronised voltage and energy data. A low-complexity algorithm was developed in [24] to identify the network operational structure relying on the voltage magnitudes and voltage phases, but this research discussed only radial networks. In [25], the authors proposed an algorithm that identifies network topologies as a probabilistic graphical model using regularised linear regression. This algorithm can model meshed networks with the integration of Distributed Energy Resources (DERs). In [26], the authors discussed how to identify network topologies with limited measurements by approximating these measurements as normally distributed random variables and by using the Maximum Likelihood Principle. That study considered only scenarios where a single breaker changes status at a time; extending the approach to handle multiple simultaneous breaker changes would result in an exponential increase in computational complexity.

This paper provides insights into the factors that influence the accuracy of topology identification algorithms. Furthermore, it introduces a novel approach to identifying the topology of an LV distribution network using smart-meter harmonic measurements. The proposed method demonstrates that utilising harmonic information can yield more precise results compared to voltage-based approaches.

The paper is structured as follows: Section 2 discusses the factors affecting topology identification algorithms; Section 3 describes the methodology for generating synthetic harmonic measurements; Section 4 proposes the methodology for identifying the LV distribution network topology using harmonics; Section 5 presents and discusses the results; and Section 6 provides a conclusion.

2. Influential Factors on Topology Identification Algorithm

This section provides a comprehensive discussion of the factors that may affect the accuracy of the topology identification algorithm. The discussion focuses solely on algorithms that utilise voltage correlation. Furthermore, the discussion highlights the differences in correlation between harmonic measurements and RMS voltage measurements. Figure 1 illustrates how these factors can be grouped and linked.

2.1. Upstream Voltage Variability

The variability in upstream voltage, typically in the Medium-Voltage (MV) network (usually 11 kV to 66 kV), influences the correlation among different ICPs. This correlation is shaped by various upstream events, including transformer automatic tap changes, network switching, and other network operations, all of which have wide-reaching downstream voltage implications. Consequently, these events strengthen the correlation between ICPs associated with different phases and LV networks, effectively decreasing the accuracy of the topology identification algorithm.

The harmonic behaviour of a network is complex, with each non-linear device interacting with its terminal conditions. The sensitivity of harmonic current emission is a function of the device’s component parameters and controls.

The harmonic voltage distortion at the various ICPs results from two sources: harmonic currents injected into the LV network, and harmonic distortion in the upstream network (sometimes modelled as a background harmonic voltage source). This harmonic voltage distortion is a function of the injected harmonic currents and the network’s system admittance matrices, which embody the harmonic impedances of all components and hence resonances.

Although the tap changer operation of transformers does alter the transformer impedance and hence the system admittance matrices, the primary effect is on the fundamental voltage, as the change in reactance per tap step (dX/dtap) is relatively small. However, the presence of upstream capacitors that have been switched in or out can change the system admittance matrices significantly, and hence the resonance point. This will be reflected in a significantly different background harmonic level as seen from the supply point of the LV network.

Upstream capacitor switching will influence the voltage distortion levels at the ICPs, but not the correlation. Similarly, the harmonic contribution from upstream non-linear industrial loads and/or distributed energy resources does influence the harmonic voltage distortion at the ICPs. However, the harmonic voltage distortion seen at the ICP is a combination of the upstream distortion influence (MV distortion × Transfer Coefficient) and the harmonic voltage across the branches of the LV system. It is this latter component that allows identification to occur.

The frequency-dependent nature of the inductive branch impedance (Z = R + jX), and also the capacitance present in the network, means that higher-order harmonics encounter a higher series branch impedance. This leads to a more localised observable impact of the higher-order harmonic currents on the harmonic voltage distortion.
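As a rough illustration of this frequency dependence, the sketch below scales the fundamental reactance of a series R-L branch linearly with harmonic order, i.e. it assumes $X_{h} = h X_{1}$. The branch values `R_OHM` and `X1_OHM` are hypothetical, and skin effect and shunt capacitance are ignored, so this is only a first-order picture and not the network model used in the paper.

```python
import math


def branch_impedance_magnitude(r_ohm: float, x1_ohm: float, h: int) -> float:
    """|Z_h| of a series R-L branch at harmonic order h, assuming X_h = h * X_1."""
    return math.hypot(r_ohm, h * x1_ohm)


# Hypothetical LV branch: 0.05 ohm resistance, 0.02 ohm reactance at the fundamental.
R_OHM, X1_OHM = 0.05, 0.02

magnitudes = {h: branch_impedance_magnitude(R_OHM, X1_OHM, h) for h in (1, 5, 11, 19)}
for h, z in magnitudes.items():
    print(f"h={h:2d}  |Z_h| = {z:.3f} ohm")
```

Under these assumptions the impedance magnitude grows monotonically with harmonic order, so the same injected harmonic current produces a larger voltage drop across the branch at higher orders.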

In this paper, background harmonic distortion from upstream sources was explicitly represented in the network modelling to ensure these effects were captured and to ensure the robustness of the identification algorithm to upstream harmonic sources.

2.2. Load Characteristics

The ICP’s voltage is directly affected by the real and reactive power consumed by the ICP. This occurs because the power is approximately proportional to the drawn current, resulting in increased voltage drop along the line. Equipment/loads used for heating and cooling typically have a higher power consumption than many other loads. However, such higher power loads may be absent during seasons with mild temperatures that do not require adjustments, or in areas with poor economic conditions where electricity is unaffordable. Moreover, these higher power loads may be missing when households rely on alternative energy sources such as gas, wood pellets, or coal for heating purposes.

A relatively high consumption is crucial for establishing a strong correlation between the voltages at different ICPs, thereby enhancing the accuracy of identifying network topology.

The introduction of DERs has the potential to significantly impact voltage fluctuations within distribution networks. This can pose challenges for traditional techniques that rely on voltage correlation for network topology identification. Moreover, the integration of rooftop solar panels, with their increasing penetration, further exacerbates the issue of phase identification within the network. When many solar systems in the same area generate power at the same time, they cause similar voltage changes across all phases, making phase identification even more difficult.

Conversely, harmonics do not directly depend on the load power but rather on the load type. In the presence of increased non-linear loads, harmonics become more apparent in the LV network, enabling a robust harmonic correlation between ICPs. Moreover, the rise in DERs and electric vehicles will also contribute to an increase in harmonics in the LV network. Consequently, this allows for more accurate results in identifying network topology based on harmonic measurements.

Energy theft can alter the consumption trend of ICPs, although the overall energy consumption will increase. Energy theft can increase the voltage drop, and it might give more distinctive features to the phases where the theft is occurring, resulting in more accurate topology identification results.

Balanced three-phase loads, or phase-to-phase connected loads (that are connected between phases rather than phase-to-neutral), will affect the same phases with a similar trend of voltage variation. This may lead to the introduction of correlations between phases that do not reflect their topological connections.

2.3. Load Demand Response

The demand response of the ICP may be synchronised in a manner that complicates the network identification process. This occurrence can be outlined as follows:

Some utilities introduce programmes to incentivise consumers to reduce or shift their electricity usage during peak demand periods. Moreover, certain companies offer programmes such as “free hours” power plans. These initiatives can result in increased or decreased demand during specific times, leading to voltage variations that exhibit similar trends across wider geographical areas.
The implementation of ripple control for hot water cylinders can adjust the load across the entire network in a synchronised manner. This adjustment strengthens the correlation between different phases, potentially causing misidentification of the network structure. The broader the coverage of ripple control across an area, the more pronounced its effects become. This phenomenon occurs because its impact permeates through both LV networks and is compounded by similar LV demand control measures in adjacent networks.
The Vehicle-to-Grid (V2G) technology presents another challenge. Although V2G technologies remain in the early developmental stages, the potential for synchronised control by third parties may emerge in future implementations. When such capabilities materialise, the coordinated utilisation of V2G infrastructure on a large scale could substantially increase the correlation between different phases and networks within the system, thereby complicating the network identification processes.

Overall, the introduction of DERs, dynamic load demand, and emerging technologies such as V2G have profound implications for network topology identification and phase recognition in distribution systems. Consequently, there is a pressing need for the development of novel methodologies and algorithms to effectively address these challenges in contemporary distribution network management.

2.4. Network Structures

The structure of a network can significantly impact the correlation matrix between different ICPs. For example, in networks with radial and long feeders, the correlations between ICPs are more distinct. Conversely, mesh networks might make these correlations less clear, potentially affecting the performance of the network identification algorithms.

The extensive size of the network and the high number of ICPs typically result in a more distinct correlation matrix. However, a higher number of ICPs can increase the probability of exhibiting misleading strong correlations between ICPs that lack physical linkage.

The voltages of the ICPs are directly influenced by the impedance of the network. However, different harmonic orders perceive the network impedance differently, particularly in terms of reactance. Higher-order harmonics perceive the network impedance as a higher value, leading to increased harmonic voltage distortion and more distinct values for each phase in the network. Consequently, it is expected that voltage harmonic distortion will provide better performance in topology identification, especially in networks with relatively lower line impedance.

2.4.1. Capacitor Banks and Voltage Regulators

The operation of capacitor banks and voltage regulators in LV networks can obscure natural voltage fluctuations, reducing the ability to distinguish correlations between interconnected nodes within the network. This occurs because these devices maintain voltage levels within specified limits, thereby modifying the inherent voltage variability across the network.

2.4.2. Poor Neutral Conductor Connection

If poor neutral conductor connections remain undetected, they can cause unexpected voltage behaviours on individual phases. This may result in certain phases experiencing overvoltages or undervoltages, which can strengthen the correlation between ICPs that are not necessarily topologically connected.

2.5. Consideration of Measurement Characteristics

2.5.1. Data Synchronisation

Ensuring accurate correlations between corresponding phases relies heavily on effective data synchronisation. Nonetheless, the significance of this synchronisation diminishes as the data time resolution becomes coarser. It is worth noting that selecting a coarser time resolution may compromise the accuracy of correlation results, because the unique features of voltage fluctuations may begin to fade, leading to less precise outcomes. Furthermore, synchronisation issues could become more prominent for the instantaneous voltage readings from smart-meters, or, in some cases, the minimum and maximum values of specific periods. For harmonic measurements, synchronisation is a less critical concern as these typically employ standard window times. According to IEC 61000-4-30 [27], the standard window time for harmonic measurements is 10 cycles for 50 Hz systems and 12 cycles for 60 Hz systems, which in both cases corresponds to a window of approximately 200 ms.
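The window lengths quoted above can be checked with a short calculation. The helper name `window_duration_ms` is purely illustrative, and nominal system frequencies are assumed.

```python
def window_duration_ms(cycles: int, frequency_hz: float) -> float:
    """Duration in milliseconds of a window spanning `cycles` fundamental periods."""
    return 1000.0 * cycles / frequency_hz


print(window_duration_ms(10, 50.0))  # 10 cycles at a nominal 50 Hz
print(window_duration_ms(12, 60.0))  # 12 cycles at a nominal 60 Hz
```

Both cases evaluate to 200 ms at nominal frequency; on a real network the window tracks the actual fundamental period, so its duration varies slightly.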

2.5.2. Metering Errors

Smart-meter errors can generally be categorised into two types:

Systematic Errors: Systematic errors are constant and may arise from various issues, such as faulty internal components or incorrect calibration. To address these errors, specific correlation techniques, such as Pearson correlation, can be employed. Pearson correlation is less sensitive to constant errors and instead focuses on capturing the strength and direction of the linear relationship between data sets.

Random Errors: Random errors associated with smart-meters are typically unpredictable and may arise from fluctuations in measured data due to various factors, such as a malfunctioning smart-meter. These errors can lead to outlier data, potentially impacting the performance of correlation analyses. However, the significance of these random errors diminishes when they occur continuously over an extended period, as they then exhibit characteristics resembling those of systematic errors.
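The insensitivity of Pearson correlation to constant errors can be illustrated with synthetic data. In the sketch below, the voltage series, the 2 V offset, and the 1 % gain error are all assumed values chosen for illustration; they are not measurements from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical RMS voltage series for two ICPs on the same phase.
true_a = 230.0 + rng.normal(0.0, 1.0, 500)
true_b = true_a - 0.5 + rng.normal(0.0, 0.2, 500)

# Assumed systematic error on meter B: constant +2 V offset and 1 % gain error.
biased_b = 1.01 * true_b + 2.0

r_true = np.corrcoef(true_a, true_b)[0, 1]
r_biased = np.corrcoef(true_a, biased_b)[0, 1]
print(f"r without error: {r_true:.6f}")
print(f"r with systematic error: {r_biased:.6f}")
```

Because the Pearson coefficient is invariant under a positive scaling and a constant shift of either series, the two printed values agree to within floating-point rounding.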

2.6. Methods for Analysing Voltage Measurements

2.6.1. Correlation Techniques

Correlation and clustering techniques represent the most common analytical methods for identifying network topologies. However, depending on the data characteristics, certain correlation techniques prove more suitable than others. For example, Pearson correlation is appropriate for assessing a linear relationship, whereas Spearman’s Rank and Kendall’s Tau correlations are more suitable for assessing a monotonic but non-linear relationship.
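The difference between these measures can be seen on a small synthetic example. The sketch below computes Spearman's rank correlation as the Pearson correlation of the ranks, which is valid only when there are no tied values; the exponential test relationship is an arbitrary choice for illustration.

```python
import numpy as np


def pearson(x: np.ndarray, y: np.ndarray) -> float:
    """Pearson correlation coefficient between two 1-D series."""
    return float(np.corrcoef(x, y)[0, 1])


def spearman(x: np.ndarray, y: np.ndarray) -> float:
    """Spearman's rank correlation as Pearson on ranks (assumes no ties)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return pearson(rank_x, rank_y)


x = np.linspace(0.1, 5.0, 200)
y = np.exp(x)  # monotonic in x, but strongly non-linear

print(f"Pearson:  {pearson(x, y):.3f}")
print(f"Spearman: {spearman(x, y):.3f}")
```

The rank-based coefficient reports a perfect monotonic association, while the Pearson coefficient is noticeably lower because the relationship is not a straight line.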

2.6.2. Enhancing the Precision of Topology Identification by Integrating the Locational Information of ICPs

The accuracy of topology identification can be greatly enhanced by integrating the locational information of ICPs, when available. This integration enables the identification of ICPs that are more likely to be physically linked. These locations are then compared with their correlation values, as some may exhibit strong correlations but lack a physical link, possibly from a different network. Such situations can arise when processing a large number of ICPs within a relatively short measurement period. Therefore, the algorithm should incorporate a location-based approach rather than solely relying on measurement results, which can sometimes be misleading. This enhances the precision of correlation identification and, consequently, contributes to the construction of more accurate network topologies.

3. Methodology for Generating Synthetic Harmonic Measurements

Obtaining actual harmonic data remains a challenge in the distribution sector compared to readily available power and voltage data. The industry may not perceive significant benefits from these measurements, especially since there are often no regulations on the harmonic emissions from domestic ICPs. Presently, harmonics are undeniably starting to cause issues at the distribution level, and they are gaining more attention from industry bodies.

For this research, synthetic harmonic measurements were generated to test the proposed algorithm for identifying the distribution network topology. This synthetically generated data provides flexibility, scalability, and full network observability. The methodology used for generating synthetic harmonic and voltage measurements was presented in [28]. This methodology comprises three stages, as illustrated in Algorithm 1. This approach combines data from the CREST demand model [29] and PANDA (equiPment hArmoNic DAtabase) [30] to create detailed profiles, including active power, reactive power, and harmonic emissions. The process entails generating initial real power load profiles, assigning specific appliances to each dwelling, and subsequently calculating the corresponding reactive power and harmonic emissions. This method produces varied and realistic load profiles that accurately reflect the diversity of household electricity consumption patterns and harmonic emissions across ICPs.

The CREST tool, which was utilised to generate realistic power load profiles, is provided in Excel format. The tool’s Excel VBA script was modified to extract individual appliance load profiles. Table 1 presents the main electrical devices modelled in this study.

Algorithm 1 Generation of Load Profiles and Harmonic Emissions for Residential Dwellings

Require: CREST demand model, PANDA database $DB$, number of dwellings $N$

Stage 1: Initial Active Load Demand Generation
  Initialisation of CREST demand model
  Parameter configuration $θ \in Θ$ for CREST model
  $P_{p} \leftarrow CREST(θ, N) \in \mathbb{R}^{N \times T}$ ▹ T denotes time interval
Stage 2: Appliance Selection and Assignment
  $A \leftarrow SelectAppliances(DB)$
  for $i \leftarrow 1$ to $N$ do
    $S_{i} \leftarrow CreateApplianceSet(A)$ with $5\%$ variance
    $D_{i} \leftarrow AssignAppliances(S_{i}, P_{p}[i, :])$
  end for
  $D \leftarrow \{D_{1}, \dots, D_{N}\}$
Stage 3: Load Profile Refinement and Harmonic Emission Generation
  $P_{p}^{'} \leftarrow f(D) \in \mathbb{R}^{N \times T}$ ▹ Refined active load profiles
  $P_{q} \leftarrow g(D) \in \mathbb{R}^{N \times T}$ ▹ Generated reactive load profiles
  $H \leftarrow h(D) \in \mathbb{C}^{N \times H \times T}$ ▹ Generated harmonic emissions
Output:
  $P_{p}^{'} \in \mathbb{R}^{N \times T}$ : Refined active power demand profiles
  $P_{q} \in \mathbb{R}^{N \times T}$ : Reactive power demand profiles
  $H \in \mathbb{C}^{N \times H \times T}$ : Harmonic current emissions
  $D = \{D_{1}, \dots, D_{N}\}$ : Set of device assignments per dwelling

Harmonic measurements for each appliance were obtained from the PANDA database. Multiple actual measurements for household appliances of different types and ratings were selected. The measurements for the appliances were selected randomly, without considering the supply type (whether from a pure sinusoidal source or the main grid), as no clear relationship exists between these conditions and the appliances’ harmonic emissions [31]. For each dwelling, specific PANDA appliance measurements were then created by adjusting the original data with a uniform random variable, as shown in Equation (1).

$$M = M_{PANDA} \times α, \quad α \in \mathbb{R} : 0.95 \leq α \leq 1.05 \tag{1}$$

The random value $α$ alters the actual harmonic measurements of the PANDA database by up to ±5% to vary the measurements around the recorded level. This approach creates a unique database of appliances for each house, introducing more realistic variations in harmonic emissions. The harmonic emissions are influenced by the terminal voltage waveform and therefore exhibit variation around their recorded operating point.
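A minimal sketch of the scaling in Equation (1) is given below. It assumes that a single factor α is drawn per appliance and per dwelling and applied to every harmonic order, which the text does not state explicitly; the function name `personalise_measurement` and the example spectrum are hypothetical.

```python
import numpy as np


def personalise_measurement(m_panda: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Scale a recorded harmonic spectrum by one uniform factor alpha in [0.95, 1.05]."""
    alpha = rng.uniform(0.95, 1.05)  # assumption: one alpha shared by all harmonic orders
    return m_panda * alpha


rng = np.random.default_rng(42)

# Hypothetical harmonic current magnitudes (A) of one appliance, orders 3, 5 and 7.
m_panda = np.array([0.80, 0.45, 0.20])
m_house = personalise_measurement(m_panda, rng)

print(m_house / m_panda)  # the same scaling factor for every order
```

Drawing a fresh factor for each dwelling gives every house its own slightly different copy of the appliance, while keeping the shape of the recorded spectrum.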

Following the establishment of a unique database of appliances for each house, power consumption and harmonic emissions were aggregated to create comprehensive load profiles. These profiles were subsequently assigned to the respective houses within the network, as shown in Figure 2. Each load profile encompasses two primary components:

Power characteristics, including both active power and reactive power components, with loads modelled as constant power elements.
Harmonic emission profiles, selected based on the corresponding power profiles, with emissions modelled as direct current harmonic injection sources.

The power-flow and harmonic analyses were simulated using DIgSILENT PowerFactory 2024 (x64). As quasi-dynamic simulation for harmonic analysis is not available in PowerFactory, an automation script written in Python 3.10.8 was developed to obtain these results.

The voltage and THD measurements for different nodes are shown in Figure 3 and Figure 4. The figures show that ICPs on the same phase exhibit similar voltage trends, leading to a strong correlation between them. In contrast, the correlation between ICPs on different phases is relatively weaker, which is what allows phase membership to be distinguished.

4. Proposed Algorithm for Identifying the Network Topology

The proposed methodology leverages harmonic voltage correlation patterns between ICPs to reconstruct the network topology. This approach is based on the fundamental principle that voltage variations at nodes within the same electrical phase exhibit stronger correlations compared to nodes on different phases or networks.

The primary objective of the proposed algorithm is to accurately identify the topology of LV distribution networks. Unlike conventional methods that depend on current measurements or power consumption data from ICPs, this approach relies solely on voltage-based metrics, including $V_{rms}$, $V_{2}$–$V_{20}$, and THD. By analysing these voltage correlations, the method effectively maps the underlying network structure.

This section introduces the methodology in detail, highlighting its novel aspects. Specifically, the proposed approach innovatively employs THD and individual harmonic voltage components ($V_{2}$–$V_{20}$) as topology indicators. Furthermore, it enhances the classical MST-KRUSKAL algorithm, modifying it to suit the specific characteristics of electrical distribution network topologies.

Figure 5 presents a high-level overview of the proposed algorithm for identifying the distribution network topology, while Algorithm 2 provides a more detailed description of the methodology, which consists of three main stages:

Stage I: Correlation and Distance Matrix Calculation. Development of correlation and distance matrices utilising voltage and harmonic measurement data, as elaborated in Section 4.1.
Stage II: Topology Construction via Modified MST-KRUSKAL. Topology construction through implementation of the modified MST-KRUSKAL algorithm, as elaborated in Section 4.2.
Stage III: Topological Similarity Assessment. Quantitative assessment of topological similarity between the estimated and actual network configurations, as elaborated in Section 4.3.

Algorithm 2 Distribution Network Topology Identification using Modified MST-KRUSKAL

Require:
  $M(n, m)$ : Complete measurement space
  $V_{rms} \subset M(n, m)$ : RMS voltage measurements at ICPs
  $V_{H} = \bigcup_{h = 2}^{9} V_{h}$ : Harmonic voltage components
  $THD \subset M(n, m)$ : THD measurements
  $k$ : Target number of clusters, defined as $k = 3 \text{ phases} \times (\text{number of networks})$

Ensure: Network topology $G_{est}$

Stage I: Correlation and Distance Matrix Calculation
  for each measurement type $X \in \{V_{rms}, V_{H}, THD\}$ do
    $R_{X} \leftarrow PearsonCorr(X)$ ▹ Correlation matrix
    $D_{X} \leftarrow DistanceMatrix(R_{X})$ ▹ Convert to distances
  end for
Stage II: Topology Construction using Modified MST-KRUSKAL
  $G_{est} \leftarrow CreateGraph(V)$ ▹ Initialise with $V$ nodes
  $E \leftarrow SortedEdges(D_{X})$ ▹ Sort by weighted distances
  for $(i, j) \in E$ do
    if $\neg HasPath(G_{est}, i, j)$ then
      $G_{est} \leftarrow AddEdge(G_{est}, i, j)$
    end if
    if $NumberComponents(G_{est}) \leq k$ then
      break
    end if
  end for
Stage III: Topological Similarity Assessment
  $S, E_{inter}, E_{intra} \leftarrow AssessTopology(G_{act}, G_{est})$ ▹ Calculate similarity metrics
  return $G_{est}, S, E_{inter}, E_{intra}$

4.1. Distance Matrix Calculation Using Pearson Correlation

A crucial step in hierarchical clustering is the construction of the distance matrix, which quantifies the dissimilarity between observations. Because the measurement data comprise voltage and harmonic profiles, and the focus is on how these profiles vary together rather than on their absolute values, the Pearson correlation coefficient provides a natural basis for calculating distances. The process involves several steps:

First, the Pearson correlation coefficient $ρ_{xy}$ between two observations $x$ and $y$ is calculated as follows:

$$ρ_{xy} = \frac{cov (x, y)}{σ_{x} σ_{y}} \tag{2}$$

where

cov is the covariance;
$σ_{x}$ is the standard deviation of x;
$σ_{y}$ is the standard deviation of y.

Let $R \in \mathbb{R}^{n \times n}$ be the correlation matrix of $n$ variables, where each element $ρ_{ij}$ represents the Pearson correlation coefficient between the i-th and j-th variables. The matrix is symmetric, with diagonal elements equal to 1 ($ρ_{ii} = 1$) and off-diagonal elements satisfying $-1 \leq ρ_{ij} \leq 1$ for all $i, j \in \{1, \dots, n\}$:

$$R = \begin{pmatrix} 1 & ρ_{12} & ρ_{13} & \dots & ρ_{1n} \\ ρ_{21} & 1 & ρ_{23} & \dots & ρ_{2n} \\ ρ_{31} & ρ_{32} & 1 & \dots & ρ_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ ρ_{n1} & ρ_{n2} & ρ_{n3} & \dots & 1 \end{pmatrix} \tag{3}$$

To use the correlation matrix $R$ for clustering, it is transformed into a distance matrix $D \in \mathbb{R}^{n \times n}$ using the following transformation:

$$d_{ij} = 1 - \max (0, ρ_{ij}), \quad i, j \in \{1, \dots, n\} \tag{4}$$

This transformation ensures non-negative distances, where $d_{ij} = 0$ for perfectly positively correlated variables ($ρ_{ij} = 1$) and $d_{ij} = 1$ for zero or negatively correlated variables ($ρ_{ij} \leq 0$). The resulting distance matrix is as follows:

$$D = \begin{pmatrix} 0 & d_{12} & d_{13} & \dots & d_{1n} \\ d_{21} & 0 & d_{23} & \dots & d_{2n} \\ d_{31} & d_{32} & 0 & \dots & d_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ d_{n1} & d_{n2} & d_{n3} & \dots & 0 \end{pmatrix} \tag{5}$$

Each element $d_{ij}$ of $D$ is derived from the corresponding correlation coefficient $ρ_{ij}$, ensuring compatibility with clustering algorithms by representing dissimilarity between variables. Note that the diagonal elements are now 0.
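The transformation in Equations (2) and (4) can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: the function names `pearson` and `distance_matrix`, the handling of constant profiles, and the three short THD profiles are all assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences (Eq. 2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant profile as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def distance_matrix(profiles):
    """Return D with d_ij = 1 - max(0, rho_ij) (Eq. 4); the diagonal stays 0."""
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            rho = pearson(profiles[i], profiles[j])
            dist[i][j] = dist[j][i] = 1.0 - max(0.0, rho)
    return dist


# Hypothetical THD profiles (in %) for three ICPs
profiles = [
    [2.0, 2.5, 3.0, 2.75],  # ICP 0
    [4.0, 5.0, 6.0, 5.5],   # ICP 1: scaled copy of ICP 0
    [3.0, 2.5, 2.0, 2.25],  # ICP 2: mirror image of ICP 0
]
D = distance_matrix(profiles)
```

In this toy case ICP 1 is perfectly correlated with ICP 0, so their distance is 0, whereas ICP 2 is perfectly anti-correlated with ICP 0 and is clipped to the maximum distance of 1.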

4.2. Modified Kruskal’s Minimum Spanning Tree Algorithm

Kruskal’s algorithm solves the minimum spanning tree (MST) problem using a greedy approach [32]. Given a connected, undirected graph $G = (V, E)$ with a weight function $D$, the algorithm builds a minimum spanning tree by iteratively selecting the lowest-weight edge that connects two distinct components. It begins with each vertex as its own tree and progressively merges these trees until a single tree spans all vertices.

Unlike the standard MST approach, wherein the process continues until a single tree spans all vertices, the proposed modified MST-KRUSKAL algorithm is designed to construct multiple spanning trees based on predefined network constraints. Rather than merging all components into a single tree, the algorithm ensures that precisely $k$ trees are formed, where $k$ is the number of distinct network clusters. For instance, in an LV distribution network, each phase (A, B, and C) is treated as a separate cluster, ensuring that connections respect the underlying phase structure. This modification enables better alignment with practical electrical distribution constraints whilst preserving the efficiency of the original Kruskal’s approach.

The proposed Modified MST-KRUSKAL algorithm is formally defined in Algorithm 3 and it employs the following principal functions:

CreateGraph( $V$ ): Initialises a graph structure with the vertex set $V$ , representing the collection of network nodes inclusive of ICPs.
SortedEdges( $D$ ): Executes an ascending sort operation on the distance matrix $D$ , which encodes the edge weights of the graph.
HasPath( $G$ , i, j): Ascertains the existence of a path between nodes i and j in graph $G$ , thereby ensuring the avoidance of cyclic paths in the resultant structure.
AddEdge( $G$ , i, j): Incorporates a new edge into graph $G$ connecting vertices i and j, thus expanding the network topology.
NumberComponents( $G$ ): Determines the present quantity of connected components (trees) formed within graph $G$ , providing a measure of network segmentation.

Algorithm 3 Modified MST-KRUSKAL($G$, $D$)

Require:
$D$: Distance matrix
$k$: Target number of clusters (three phases × number of networks)

Ensure: $G$: Spanning forest with $k$ trees

$G \leftarrow CreateGraph (V)$ ▹ Initialise with $V$ nodes
$E \leftarrow SortedEdges (D)$ ▹ Sort by weighted distances (ascending order)
for $(i, j) \in E$ do
    if $\neg HasPath (G, i, j)$ then
        $G \leftarrow AddEdge (G, i, j)$
    end if
    if $NumberComponents (G) \leq k$ then
        break
    end if
end for
return $G$

The algorithm initialises by creating a singleton tree for each ICP. It then examines edges in ascending order of distance, checking whether each would form a cyclic path. Edges that do not create cycles are added to the forest, merging two different trees. The algorithm continues this process while monitoring the number of trees. Initially, the number of trees equals the number of vertices ($|V|$), as each vertex represents an individual tree. This number decreases by one each time two trees are merged, and the algorithm terminates when the number of trees reaches the target value $k$.
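The early-stopping behaviour described above can be sketched as follows. Note that this sketch swaps the HasPath and NumberComponents functions for a union-find (disjoint-set) structure with a running component counter, which is a common way to implement Kruskal's algorithm but is not stated in the paper. The function name `modified_kruskal` and the five-node distance matrix are hypothetical.

```python
def modified_kruskal(dist, k):
    """Grow a spanning forest from distance matrix `dist` and stop at k trees."""
    n = len(dist)
    parent = list(range(n))  # union-find: every node starts as its own tree

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # SortedEdges(D): all node pairs in ascending order of distance
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))

    forest = []
    components = n
    for _, i, j in edges:
        if components <= k:
            break  # target number of trees reached
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # plays the role of "not HasPath(G, i, j)"
            parent[root_i] = root_j
            forest.append((i, j))
            components -= 1
    return forest


# Hypothetical distances: nodes 0-2 and nodes 3-4 form two tight groups
D = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.15, 0.95, 0.9],
    [0.2, 0.15, 0.0, 0.85, 0.9],
    [0.9, 0.95, 0.85, 0.0, 0.05],
    [0.8, 0.9, 0.9, 0.05, 0.0],
]
forest = modified_kruskal(D, k=2)
```

With `k=2` the loop stops after three edges, leaving the two groups unconnected; with `k=1` it would continue to a single spanning tree, as in the standard algorithm.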

4.3. Topological Similarity in Electrical Distribution Networks

Electrical distribution networks possess distinct characteristics that make traditional graph comparison methods insufficient for meaningful topology comparison. The fundamental challenge in comparing electrical distribution network topologies lies in recognising functional configurations and understanding the impact of incorrect topologies.

For comparing network topologies, let $G$ be a network topology defined by its vertices and edges in formal mathematical notation:

$$G = (V, E) \tag{6}$$

where

$V = \{v_{1}, v_{2}, \dots, v_{n}\}$ is the set of vertices;
$E = \{(v_{i}, v_{j}) : v_{i}, v_{j} \in V\} = \{e_{1}, e_{2}, \dots, e_{m}\}$ is the set of edges.

Let $G_{act}$ represent the actual topology, and $G_{est}$ represent the estimated topology. The estimated and actual graphs have the same vertex set $V$ but may have different edges $E$, or the same edges in the case of identical topologies.

To identify missing edges, the comparison algorithm examines each edge $e \in E_{act}$ within the estimated topology $G_{est}$. For every edge, vertices $u_{i}$ and $u_{j}$ are identified where $e = (u_{i}, u_{j})$, and the minimum edge count $d_{est} (u_{i}, u_{j})$ in $G_{est}$ is calculated. The minimum edge count function $d_{est} (u_{i}, u_{j})$ measures the smallest number of edges between vertices $u_{i}$ and $u_{j}$ in $G_{est}$, providing a quantitative measure of topological differences that accounts for both local and global variations in network structure. The results from this comparison indicate the number of missing edges in $G_{est}$ compared with $E_{act}$.

Similarly, incorrect edges are identified by examining each edge $e \in E_{est}$ within the actual topology $G_{act}$. In this framework, an edge is classified as incorrect or missing when the minimum edge count between its end vertices $u_{i}$ and $u_{j}$ satisfies $d (u_{i}, u_{j}) > 2$. This relaxation yields more consistent results by tolerating minor topology errors in which the minimum edge count between the two vertices is small.

The similarity metric for topology comparison is defined as one minus the proportion of incorrect and missing edges relative to the total number of edges. The metric $S$ ranges from 0 to 1, where 0 indicates no similarity (all edges are incorrect) and 1 indicates perfect similarity (no incorrect or missing edges):

$$S = 1 - \frac{\text{incorrect edges} + \text{missing edges}}{\text{total number of edges}} \tag{7}$$
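The comparison procedure and Equation (7) can be illustrated with a short sketch. Several details are assumptions rather than statements from the paper: the minimum edge count is computed by breadth-first search, unconnected vertex pairs are given an infinite edge count, and the "total number of edges" in the denominator is taken to be the number of edges in the actual topology. The names `hop_distance` and `similarity` and the five-node example are hypothetical.

```python
from collections import deque


def hop_distance(edges, source, target):
    """Minimum edge count between two nodes via BFS; inf if disconnected."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == target:
            return hops
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return float("inf")


def similarity(edges_act, edges_est, tolerance=2):
    """Eq. (7), assuming the denominator is the number of actual edges."""
    missing = sum(
        1 for u, v in edges_act if hop_distance(edges_est, u, v) > tolerance
    )
    incorrect = sum(
        1 for u, v in edges_est if hop_distance(edges_act, u, v) > tolerance
    )
    return 1.0 - (incorrect + missing) / len(edges_act)


# Hypothetical 5-node feeder: the actual topology is the chain 0-1-2-3-4
E_act = [(0, 1), (1, 2), (2, 3), (3, 4)]
E_est = [(0, 1), (0, 2), (2, 3), (0, 4)]
score = similarity(E_act, E_est)
```

Here the estimated edge (0, 2) is tolerated because nodes 0 and 2 are only two edges apart in the actual chain, whereas (0, 4) counts as incorrect and the actual edge (3, 4) counts as missing, giving a score of 0.5.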

The relationship between topological similarity and functional impact is not straightforward in these networks. Two topologies may be structurally similar yet differ in ways that significantly affect the electrical distribution network; for example, through an edge connecting two different phases. Conversely, some dissimilarities between two topologies may have a negligible impact, such as an edge that connects two electrically close nodes in the same phase. This follows from the physical principles that govern electrical distribution networks. When assessing topology accuracy, edge anomalies can also be categorised as follows:

$E_{intra}$ : edges that incorrectly connect vertices (ICPs) within the same phase and LV network.
$E_{inter}$ : edges that incorrectly link vertices (ICPs) across different phases or networks.

These two types of incorrect connections are shown in Figure 6.

Obtaining the type of incorrect edges provides a comprehensive assessment of topology accuracy, as these two types of incorrect connections have distinct impacts on the network. In particular, edges that incorrectly link ICPs across different phases and networks can significantly affect the overall network topology.
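A minimal sketch of this categorisation is given below, assuming that each ICP carries a known (network, phase) label and that the list of incorrect edges has already been obtained. The function name `classify_incorrect_edges`, the ICP identifiers, and the labels are hypothetical.

```python
def classify_incorrect_edges(incorrect_edges, cluster_of):
    """Split incorrect edges into (E_intra, E_inter) using cluster labels."""
    intra, inter = [], []
    for u, v in incorrect_edges:
        if cluster_of[u] == cluster_of[v]:
            intra.append((u, v))  # same network and same phase
        else:
            inter.append((u, v))  # different phase or different network
    return intra, inter


# Hypothetical (network, phase) label for each ICP
cluster_of = {
    1: ("S", "A"),
    2: ("S", "A"),
    3: ("S", "B"),
    4: ("N", "A"),
}
incorrect = [(1, 2), (1, 3), (2, 4)]
E_intra, E_inter = classify_incorrect_edges(incorrect, cluster_of)
```

In this example only the edge (1, 2) stays within one cluster; the other two edges cross a phase or a network boundary and would be counted in $E_{inter}$.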

5. Results and Discussion

5.1. Initial Visualised Results for Topology Identification Algorithm Performance

This section evaluates the performance of the proposed topology identification methodology using two measurements: $V_{rms}$ and $THD$. The results are visualised for three LV distribution networks (denoted as S, N, and E), each fed by a distinct transformer connected to an MV network. Estimated topology connections are depicted as coloured lines (red, blue, and green), corresponding to the three-phase structure, whilst actual connections are shown as grey dotted lines for comparative analysis.

Figure 7 illustrates the topology derived from $V_{rms}$ measurements. While the $V_{rms}$-based results capture the general radial structure of the actual network, several incorrect links are observed. These inaccuracies can be categorised as follows:

$E_{intra}$ : Connections between nodes of the same phase and network that deviate from the actual network. For example, the blue line (phase C) incorrectly links node 2S to 61S in network S, whereas the correct connection should link 2S to either node 42S or 55S.
$E_{inter}$ : Spurious links across different phases or networks. A prominent example is the blue line (phase C) connecting node 45N to 14S, which violates the radial hierarchy. Similarly, the edge between node 33E (phase B) and 53S (phase A) misrepresents actual connections.

In contrast, Figure 8, based on $THD$ measurements, demonstrates superior accuracy in resolving phase-specific connections and minimising ambiguous links. The enhanced performance stems from the sensitivity of $THD$ to harmonic propagation patterns, which are inherently tied to network impedance and topology. This allows finer discrimination of the phases and networks of ICPs. For instance, the THD-based method correctly isolates phase C (blue) in network N, avoiding the cross-network errors seen in the $V_{rms}$ results.

The colour-coded phase identification in both figures aligns with the actual topology to varying degrees of accuracy. Whilst the $V_{rms}$-based approach shows alignment in simpler radial branches, the $THD$-based method achieves near-perfect correspondence with actual connections. This robustness arises from the unique harmonic signatures at each node. These serve as discriminative features that enhance topology inference, providing advantages not utilised by $V_{rms}$-based methods alone.

These findings underscore the advantages of incorporating harmonic distortion metrics into topology identification frameworks. Utility operators can leverage $THD$-driven insights, particularly in systems equipped with smart-meters capable of capturing harmonic data, to enhance network visibility and operational tasks such as phase balancing, network restructuring, fault localisation, and DER integration. The subsequent section delves into quantitative performance metrics to further validate these observations.

5.2. Performance Metrics: Similarity Score

The topology identification results depicted in Figure 9 demonstrate a clear relationship between measurement type and similarity score across various time resolution settings. The results are illustrated through two visual representations: Figure 9a displays a line graph depicting the similarity score as a function of time resolution, while Figure 9b shows a matrix of numerical similarity scores, with each cell colour-coded to form a heatmap. The similarity score is calculated between the actual and estimated topologies ($G_{act}$, $G_{est}$) resulting from the proposed topology identification algorithm. The graph illustrates distinct performance patterns among (i) $V_{rms}$, (ii) various harmonic components, and (iii) $THD$, with notable variations in similarity scores as the time resolution increases. This analysis provides valuable insights into the effectiveness of different measurement types for topology identification purposes.

Figure 9 demonstrates that $THD$ measurements exhibit superior accuracy in topology identification across all time resolution settings, with perfect similarity scores for time resolutions of less than 30 min. In contrast, the results based on $V_{rms}$ show the lowest performance among the evaluated metrics.

Analysis of lower-order harmonics ($V_{2}$–$V_{6}$) reveals several important characteristics. These components exhibit lower similarity scores than THD, particularly at a one-minute time resolution. For instance, the second harmonic ($V_{2}$) achieves a similarity score of only 0.926 at a one-minute resolution. This behaviour can be attributed to the nature of lower-order harmonics, for which the inductive reactance $X = 2 π f L$ takes smaller values. The reduced impedance allows these harmonics to propagate more extensively throughout the network, making them susceptible to cumulative effects from multiple harmonic emission sources. Consequently, more diffuse distortion patterns emerge, complicating the process of topology identification.
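As a hedged numerical illustration of this frequency dependence, consider a purely hypothetical series inductance of $L = 0.1$ mH (not a value taken from the study) and a 50 Hz fundamental, $f_{1}$. Writing the harmonic frequency as $f_{h} = h f_{1}$, the reactance grows linearly with the harmonic order $h$:

```latex
\begin{align*}
X_h &= 2\pi f_h L = 2\pi\, h f_1 L, \\
X_2 &= 2\pi \times 100\,\mathrm{Hz} \times 0.1\,\mathrm{mH} \approx 0.063\,\Omega, \\
X_{20} &= 2\pi \times 1000\,\mathrm{Hz} \times 0.1\,\mathrm{mH} \approx 0.628\,\Omega, \\
\frac{X_{20}}{X_2} &= \frac{20}{2} = 10 .
\end{align*}
```

Under these assumptions the 20th harmonic meets ten times the reactance seen by the 2nd harmonic, whatever the actual value of $L$.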

In contrast, higher-order harmonics ($V_{13}$–$V_{20}$) demonstrate superior performance in topology identification tasks compared to lower-order harmonics. Almost all of the higher harmonics achieve a similarity score of 1 at a one-minute time resolution and maintain better accuracy at longer time resolutions. This enhanced performance stems from the increased impedance these harmonics encounter within the distribution network, leading to more localised effects that better preserve topology-specific characteristics. The stronger correlation between higher-order harmonics in the network makes them particularly valuable for identification applications, as they are less influenced by broader network conditions and load variations.

Overall, similarity scores tend to fall across most metrics as the time resolution increases (that is, as the measurement interval lengthens), although the extent of this reduction varies. This occurs because a longer interval smooths out the variation in measurement profiles and reduces the number of available measurement points. A notable observation is the non-monotonic behaviour exhibited by several harmonic components (for example, THD, $V_{2}$, and $V_{8}$), which show local fluctuations, that is, both increases and decreases in the similarity score as the time resolution increases. Such variations, observed within the 24 h testing period, may be influenced by the network structure or by inherent changes in harmonic content throughout the daily cycle. The limited duration of the testing period may also contribute to this variability. These findings indicate that the relationship between time resolution and similarity score is not linear or straightforward, particularly when comparing closely spaced time resolutions, such as 45 min and 60 min.

The optimal harmonic order or time resolution for accurately identifying network topology likely depends on multiple factors, including the network structure and its complexity, load characteristics, and the length of the measurement period. Therefore, optimising the time resolution or selecting the appropriate harmonic order for a specific network would require an understanding of that system’s load behaviour, structural configuration, and measurement characteristics.
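The smoothing effect of a longer time resolution can be illustrated with a simple block-averaging sketch. This assumes that coarser resolutions are obtained by averaging consecutive one-minute samples, which may differ from how a given smart-meter aggregates its data. The function name `downsample` and the synthetic THD profile are hypothetical.

```python
def downsample(samples, window):
    """Average consecutive `window`-sample blocks; a trailing partial block is dropped."""
    blocks = len(samples) // window
    return [
        sum(samples[b * window:(b + 1) * window]) / window
        for b in range(blocks)
    ]


# Hypothetical one-minute THD readings (%) over one hour, with a 5-minute spike
one_minute = [2.0] * 60
one_minute[20:25] = [4.0] * 5

ten_minute = downsample(one_minute, 10)      # 6 points remain
thirty_minute = downsample(one_minute, 30)   # 2 points remain
```

The spike that reaches 4% in the one-minute data appears as 3% at a 10 min resolution and as roughly 2.3% at 30 min, while the number of points available for the correlation falls from 60 to 6 and then to 2.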

5.3. Performance Metrics: $E_{intra}$ and $E_{inter}$

Other important metrics to consider are the indices $E_{inter}$ and $E_{intra}$, shown in Figure 10 and Figure 11, respectively. As defined in Section 4.3, $E_{intra}$ represents edges that incorrectly connect ICPs within the same phase and LV network, whilst $E_{inter}$ represents edges that incorrectly link ICPs across different phases or networks. The similarity score was calculated from the total number of incorrect edges, with the two types assigned equal weights. However, from an electrical perspective, $E_{intra}$ may be more tolerable than $E_{inter}$. This is because $E_{inter}$ connects different phases or networks, which can lead to errors during power-flow analysis. In contrast, $E_{intra}$ does not cause power-flow calculation errors, although it can reduce the accuracy of the results.

For lower time resolution, Figure 10 reveals a distinctive pattern in $E_{inter}$ values. These values demonstrate higher magnitudes for lower harmonic orders, whilst decreasing for higher-order harmonics. This trend suggests that lower harmonic orders generate more distributed harmonic distortion throughout the grid, resulting in stronger correlations between electrically distant ICP locations.

In contrast, Figure 11 presents a different behaviour for $E_{intra}$ measurements. These values remain relatively constant across all harmonic measurements, including THD, and consistently maintain lower values compared to $E_{inter}$.

As the time resolution increases, both $E_{intra}$ and $E_{inter}$ metrics exhibit an upward trend. However, these two metrics display distinct behaviours:

$E_{intra}$ shows stronger sensitivity to time resolution, particularly for higher-order harmonics. This occurs because neighbouring nodes inherently share strong correlations and similar harmonic profiles at high frequencies. Prolonged averaging masks subtle nodal distinctions, amplifying misidentification within localised regions.
$E_{inter}$ increases moderately with extended time resolution, but displays relative resilience at higher harmonics. Since these harmonics decay rapidly with distance, remote nodes maintain distinct profiles even under averaging. This inherent dissimilarity limits correlation strengthening between distant ICPs.
Lower-order harmonic behaviour: Both metrics follow comparable growth patterns for lower-order harmonics. The diffuse nature of low-frequency oscillations creates network-wide correlation uniformity, reducing differentiation between local and remote misconnection trends.
For THD, $E_{inter}$ remains largely unaffected by the time resolution, indicating a more localised influence. In contrast, this localised effect causes $E_{intra}$ to increase as the time resolution extends, reinforcing misidentification within nearby ICPs.

6. Conclusions

This research proposes a novel methodology for identifying distribution network topology using voltage harmonic measurements. Unlike conventional approaches, it eliminates the need for energy measurements, historical data, or geographical information. The algorithm is based on a three-stage process consisting of harmonic voltage correlation matrices, a modified Kruskal’s minimum spanning tree, and a similarity assessment. Key innovations include the first use of THD and individual harmonics ($V_{2}$–$V_{20}$) as topology identifiers, demonstrating superior accuracy compared with conventional RMS voltage measurements.

THD measurements consistently achieved perfect similarity scores (1.0) for time resolutions below 30 min, significantly outperforming traditional voltage-based approaches. Higher-order harmonics ($V_{13}$–$V_{20}$) demonstrated superior performance compared to lower-order harmonics due to frequency-dependent network impedance. Higher frequencies encounter increased impedance ($Z = R + j X_{L}$, where $X_{L} = 2 π f L$), creating more localised effects that better preserve the unique characteristics of each network topology. In contrast, lower-order harmonics exhibit distributed propagation patterns throughout the network, where cumulative effects from multiple emission sources obscure the distinct signatures needed for accurate topology identification.

The error analysis revealed different behaviours for the two error types. Intra-network errors ($E_{intra}$) show stronger sensitivity to time resolution, particularly for higher-order harmonics. Neighbouring nodes naturally share strong correlations, and prolonged averaging masks subtle differences between adjacent nodes, leading to increased misidentification within localised regions. Conversely, inter-network errors ($E_{inter}$) increase only moderately with extended time resolution and display relative resilience at higher harmonics. Because these frequencies decay rapidly with distance, remote nodes maintain distinct harmonic profiles even when measurements are averaged over longer periods.

These results highlight the significant potential of leveraging smart-meter harmonic data to enhance distribution network monitoring and operation. The methodology provides utilities with improved network visibility by accurately identifying network topology, which can subsequently support critical operational tasks such as phase balancing, fault localisation, and optimal DER integration. Importantly, this enhanced capability is achieved without requiring substantial additional hardware investments, making it a cost-effective solution for modern distribution system management.

Author Contributions

Conceptualization, A.O. and N.R.W.; Methodology, A.O.; Software, A.O.; Validation, A.O.; Formal analysis, A.O.; Investigation, A.O.; Writing—original draft, A.O.; Writing—review & editing, N.R.W., A.L. and R.M.; Visualization, A.O.; Supervision, N.R.W., A.L. and R.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Wang, F.; Lu, X.; Chang, X.; Cao, X.; Yan, S.; Li, K.; Catalão, J.P. Household profile identification for behavioral demand response: A semi-supervised learning approach using smart-meter data. Energy 2022, 238, 121728.
2. Carmichael, R.; Gross, R.; Hanna, R.; Rhodes, A.; Green, T. The Demand Response Technology Cluster: Accelerating UK residential consumer engagement with time-of-use tariffs, electric vehicles and smart-meters via digital comparison tools. Renew. Sustain. Energy Rev. 2021, 139, 110701.
3. Ulrich, A.; Baum, S.; Stadler, I.; Hotz, C.; Waffenschmidt, E. Maximising Distribution Grid Utilisation by Optimising E-Car Charging Using smart-meter Gateway Data. Energies 2023, 16, 3790.
4. Dutta, S.; Sahu, S.K.; Roy, M.; Dutta, S. A data-driven fault detection approach with an ensemble classifier-based smart-meter in modern distribution system. Sustain. Energy Grids Netw. 2023, 34, 101012.
5. Mokhtar, M.; Robu, V.; Flynn, D.; Higgins, C.; Whyte, J.; Loughran, C.; Fulton, F. Prediction of voltage distribution using deep learning and identified key smart-meter locations. Energy AI 2021, 6, 100103.
6. Chakraborty, S.; Das, S.; Sidhu, T.; Siva, A.K. Smart-meters for enhancing protection and monitoring functions in emerging distribution systems. Int. J. Electr. Power Energy Syst. 2021, 127, 106626.
7. Watson, J.D.; Welch, J.; Watson, N.R. Use of Smart-meter data to determine Distribution system topology. IET J. Eng. 2016, 2016, 94–101.
8. Song, Z.; Yang, Z. New heuristic distribution network re-configuration method for overloading elimination. In Proceedings of the IEEE Innovative Smart Grid Technologies-Asia (ISGT Asia), Bangalore, India, 10–13 November 2013; pp. 1–6.
9. Bhela, S.; Kekatos, V.; Veeramachaneni, S. Power distribution system observability with smart-meter data. In Proceedings of the IEEE Global Conference on Signal and Information Processing (GlobalSIP), Montreal, QC, Canada, 14–16 November 2017; pp. 1070–1074.
10. Zhu, C.; Reinhardt, A. Reliable Streaming and Synchronization of smart-meter Data over Intermittent Data Connections. In Proceedings of the IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), Beijing, China, 21–23 October 2019; pp. 1–6.
11. Lu, J.; Liu, X.; Ye, A.; Dou, J.; Zheng, G. Research on advanced metering infrastructure time synchronization based on NTP. In Proceedings of the Chinese Control And Decision Conference (CCDC), Shenyang, China, 9–11 June 2018; pp. 5701–5704.
12. Yassine, A.; Nazari Shirehjini, A.A.; Shirmohammadi, S. Smart-meters Big Data: Game Theoretic Model for Fair Data Sharing in Deregulated Smart Grids. IEEE Access 2015, 3, 2743–2754.
13. Pappu, S.J.; Bhatt, N.; Pasumarthy, R.; Rajeswaran, A. Identifying Topology of LV Distribution Networks Based on smart-meter Data. IEEE Trans. Smart Grid 2018, 9, 5113–5122.
14. Zhang, M.; Luan, W.; Guo, S.; Wang, P. Topology Identification Method of Distribution Network Based on smart-meter Measurements. In Proceedings of the China International Conference on Electricity Distribution (CICED), Tianjin, China, 17–19 September 2018; pp. 372–376.
15. Park, S.; Deka, D.; Backhaus, S.; Chertkov, M. Learning with end-users in distribution grids: Topology and parameter estimation. IEEE Trans. Control Netw. Syst. 2020, 7, 1428–1440.
16. Pengwah, A.B.; Zabihinia Gerdroodbari, Y.; Razzaghi, R.; Andrew, L.L.H. Topology identification of distribution networks with partial smart-meter coverage. IEEE Trans. Power Deliv. 2024, 39, 992–1001.
17. Pengwah, A.B.; Fang, L.; Razzaghi, R.; Andrew, L.L.H. Topology identification of radial distribution networks using smart-meter data. IEEE Syst. J. 2022, 16, 5708–5719.
18. Jiao, F.; Li, Z.; Ai, J.; Yang, H.; Deng, Y.; Li, D.; Gao, W.; Lai, Z.; Fu, X. Topology Identification Method for Low-Voltage Distribution Node Networks Based on Density Clustering Using Smart Meter Real-Time Measurement Data. IEEE Access 2024, 12, 83600–83610.
19. Wang, C.; Lou, Z.; Li, M.; Zhu, C.; Jing, D. Identification of Distribution Network Topology and Line Parameter Based on Smart Meter Measurements. Energies 2024, 17, 830.
20. García, S.; Fresia, M.; Mora-Merchán, J.M.; Carrasco, A.; Personal, E.; León, C. A data-driven topology identification method for low-voltage distribution networks based on the wavelet transform. Electr. Power Syst. Res. 2025, 243, 111517.
21. Zhang, H.; Zhao, J.; Wang, X.; Xuan, Y. Low-Voltage Distribution Grid Topology Identification With Latent Tree Model. IEEE Trans. Smart Grid 2022, 13, 2158–2169.
22. Ma, L.; Wang, L.; Liu, Z. Topology Identification of Distribution Networks Using a Split-EM Based Data-Driven Approach. IEEE Trans. Power Syst. 2022, 37, 2019–2031.
23. Soumalas, K.; Messinis, G.; Hatziargyriou, N. A data driven approach to distribution network topology identification. In Proceedings of the IEEE Manchester PowerTech, Manchester, UK, 18–22 June 2017; pp. 1–6.
24. Deka, D.; Backhaus, S.; Chertkov, M. Structure Learning in Power Distribution Networks. IEEE Trans. Control Netw. Syst. 2018, 5, 1061–1074.
25. Liao, Y.; Weng, Y.; Rajagopal, R. Urban distribution grid topology reconstruction via Lasso. In Proceedings of the IEEE Power and Energy Society General Meeting (PESGM), Boston, MA, USA, 17–21 July 2016; pp. 1–5.
26. Sharon, Y.; Annaswamy, A.M.; Motto, A.L.; Chakraborty, A. Topology identification in distribution network with limited measurements. In Proceedings of the IEEE PES Innovative Smart Grid Technologies (ISGT), Washington, DC, USA, 16–20 January 2012; pp. 1–6.
27. IEC. IEC 61000-4-30:2015 Electromagnetic Compatibility (EMC) - Part 4-30: Testing and Measurement Techniques - Power Quality Measurement Methods, 3rd ed.; International Electrotechnical Commission: Geneva, Switzerland, 2015.
28. Othman, A.; Watson, N.R.; Lapthorn, A.; Mukhedkar, R. Generation of Realistic Smartmeter Data. In Proceedings of the IEEE PES Innovative Smart Grid Technologies - Asia (ISGT Asia), Auckland, New Zealand, 21–24 November 2023; pp. 1–5.
29. McKenna, E.; Thomson, M. High-resolution stochastic integrated thermal–electrical domestic demand model. Appl. Energy 2016, 165, 445–461.
30. PANDA (equiPment hArmoNic DAtabase). Available online: https://www.panda.et.tu-dresden.de (accessed on 6 August 2023).
31. Xu, X.; Collin, A.J.; Djokic, S.Z.; Langella, R.; Testa, A.; Meyer, J.; Möller, F. Harmonic emission of PV inverters under different voltage supply conditions and operating powers. In Proceedings of the 2016 17th International Conference on Harmonics and Quality of Power (ICHQP), Belo Horizonte, Brazil, 16–19 October 2016; pp. 373–378.
32. Cormen, T.H.; Leiserson, C.E.; Rivest, R.L.; Stein, C. Introduction to Algorithms, 3rd ed.; MIT Press: Cambridge, MA, USA, 2009.

Figure 1. Influential factors on topology identification algorithm.

Figure 2. Single-line diagram of the distribution network under study.

Figure 3. Fundamental voltage for three different nodes.

Figure 4. The voltage THD for three different nodes.

Figure 5. Flowchart of distribution network topology identification algorithm.

Figure 6. Incorrect between-cluster connections ($E_{inter}$) and incorrect same-cluster connections ($E_{intra}$).

Figure 7. Estimated network topology based on $V_{rms}$ measurements (with time resolution = 10 min and $μ = 0$). Note: Node IDs and detailed network connections are best viewed by zooming into the digital version of this document.

Figure 8. Estimated network topology based on THD measurements (with time resolution = 10 min and $\mu = 0$). Note: Node IDs and detailed network connections are best viewed by zooming into the digital version of this document.

Figure 9. Variation of similarity scores across different harmonic orders and time resolutions: (a) Line graph. (b) Numerical values presented as a heatmap.

Figure 10. Variation of $E_{inter}$ across different harmonic orders and time resolutions: (a) Line graph. (b) Numerical values presented as a heatmap.

Figure 11. Variation of $E_{intra}$ across different harmonic orders and time resolutions: (a) Line graph. (b) Numerical values presented as a heatmap.

Table 1. Household electrical appliances included in, and excluded from, the load-flow load profiles.

Loads Included in Load-Flow	Loads Excluded From the Load-Flow
Lighting equipment	Washing machine
Hob	Answering machine
Oven	CD player
Electric shower	VCR_DVD
Heat-pump	Clock
Receiver	Fax
TV	Phone
Microwave
Kettle
Small kitchen cooking loads
Freezer
Refrigerator
Personal computer

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

