Sensitivity-Informed Bayesian Inference for Home PLC Network Models with Unknown Parameters

: Bayesian inference is used to calibrate a bottom-up home PLC network model with unknown loads and wires at frequencies up to 30 MHz. A network topology with over 50 parameters is calibrated using global sensitivity analysis and transitional Markov Chain Monte Carlo (TMCMC). The sensitivity-informed Bayesian inference computes Sobol indices for each network parameter and applies TMCMC to calibrate the most sensitive parameters for a given network topology. A greedy random search with TMCMC is used to reﬁne the discrete random variables of the network. This results in a model that can accurately compute the transfer function despite noisy training data and a high dimensional parameter space. The model is able to infer some parameters of the network used to produce the training data, and accurately computes the transfer function under extrapolative scenarios.


Introduction
Power Line Communications (PLC) are expected to be a key component of smart-grids, allowing grid usage to be monitored without adding extensive infrastructure. PLC systems use existing power lines as two-way communication channels and can be implemented in a broad range of scenarios, including home/office networks, device-specific billing, smart energy management, automatic meter reading, and power outage detection [1][2][3][4][5][6]. PLC systems are classified into three categories based on bandwidth: ultra narrowband (UNB) systems have frequencies ranging from 0.3-3 kHz, narrowband (NB) systems have a frequency range of 3-500 kHz, and broadband (BB) systems have frequencies of 1.8-250 MHz [7]. Lower frequencies are less affected by transmission losses and are therefore more effective over long distances, but have a lower data transmission rate. Devices for remote meter reading and monitoring power grid conditions generally use UNB or NB systems. In contrast, BB systems target applications within a home network such as home automation and control, energy management systems, and decreasing energy consumption and cost [8]. The issue with PLC systems is that because they use existing power lines that are not designed for carrying communication signals, and can have high variability in the loads and wiring, the conditions are highly dependent on the specific network topology and loads. Therefore, PLC systems have to be tweaked for specific environments based on the network frequencies, noise, electromagnetic compatibility, and power source. It is important to develop models that can characterize transmission pathways for specific networks to be able to effectively and reliably transmit communications.
There are two existing methods of modeling transmission pathways: top-down and bottom-up models [9,10]. The top-down modeling approach views conductive propagation as an unknown function to be estimated based on training data, and so does not use any prior information such as the network topology or component connections, but rather implements a parametric model based on transmission line (TL) theory [11]. This model,

PLC Network Topology
The PLC network topology used in this work is a modified version of the network used in Marrocco et al. [16] and models a home network at frequencies up to 10 MHz. The loads in each room are connected through wires to a junction box (JB). There are two types of wiring connection structures connecting the loads to each JB, similar to those used in Marroco et al. [16] and Huang et al. [23]. In a star connection, each load is at the termination of a wire, so a single wire connects the load to the JB. In a bus connection, only two wires are directly connected to the JB, and the other loads are connected via parallel connections to the wiring, except those at the terminations of the two wires. Figure 1 shows schematics of both room types. The JB in each room is connected to a service panel through which power is provided to the house, shown in Figure 2. Two outlets in the house are connected to a transmitter and receiver, respectively, which are used to measure the transfer function of the voltage between them. The transmitter and receiver have constant resistance loads with variable magnitude. The service panel and JBs do not have loads.

Cable Modeling
The cables are modeled using TL theory [24]. The resistance, inductance, capacitance, and conductance (R, L, C, and G, respectively) are used to calculate a propagation constant γ and characteristic impedance Z 0 using Equations (1) and (2).
The wire properties are obtained from Ref. [25] and generalized to the equal radius, two-wire case. Skin depth is defined as: where µ 0 = 4π · 10 −7 [H/m] is the vacuum permeability and σ = 5.8 · 10 7 S/m is the conductivity of copper. The wire properties are given by: where r w is the wire radius in meters, d c is the distance between conductors, and = 3.19 · 10 −11 [F/m] is the dielectric constant of PVC. The distance between conductors d c is assumed to be equal to the insulator diameter. A linear fit is used to approximate d c as a function of r w based on wire gauge sizes between 6 and 16 with Thermoplastic High Heat-resistant Nylon (THHN) insulation. The insulation dimensions are obtained from Ref. [26]. The relation is: In reality, wires are available in wire gauge sizes, and therefore have discrete valued properties rather than continuous valued properties. Various other PLC model studies have used wire properties in discrete wire types [9,16,27,28]. However, it is difficult to infer discrete valued parameters using in a high-dimensional inference setting. Sampling methods such as jump-diffusion [29] and reversible-jump Metropolis-Hastings [30] were developed to infer discrete parameters, but they are extremely computationally expensive in a high-dimensional system. We therefore chose to use a continuous wire model rather than attempting to infer discrete wire parameters. Continuous wire models have also been used in several PLC modeling studies, including Refs. [10,17,31].

Load Modeling
Appliances in home networks are classified as one of four different types: constant impedances, time-invariant but frequency-selective impedances, time-varying harmonic impedances, and time-varying commuted impedances [9]. Various previous studies have developed PLC models using these load types, including Refs. [9,[16][17][18]. The frequencyselective, time-varying harmonic, and time-varying commuted loads all have impedances based on a parallel RLC circuit, but have different variations in time.
Constant impedance loads are characterized by their power P r , which can be converted to resistance by: where U = 120 V in the United States, the voltage provided to the network. The impedance Z of a constant impedance load is equal to the resistance, Z = R. Time-invariant frequencyselective loads are derived from a parallel RLC circuit model [9]. The impedance is given by: where Q f is the quality factor, ω = 2π f , and the resonance frequency is f 0 = ω 0 2π . Both time-varying harmonic and time-varying commuted impedances vary over the mains period, which is 16.67 ms (60 Hz) in North America and 20 ms (50 Hz) in most of the world. Time-varying harmonic impedances are calculated as follows. Z B (ω) is a frequencyselective load calculated with (8), and Z A = 50 Ω. The impedance of a time-varying harmonic load is: where φ is the phase, and t is the discrete time, which is defined with M intervals between 0 and 1. The impedance of a time-varying harmonic load varies between Z A and Z A + Z B over the mains period. The impedance of time-varying commuted loads is defined as follows. Z B (ω) is a frequency-selective load calculated with (8), and Z A (ω) = 0.5Z B (ω). The impedance is: Here, D and T are parameters that define the timing of the impedance variations.

Network Parameter Generation
In the previous sections, we described the equations pertaining to wire and load properties. We now describe the parameters employed for this study. The home appliances modeled here have power P r between 5 W and 5000 W. Appliance power data and approximate usage is obtained from several online sources [32][33][34]. The appliances are grouped into categories based on their power. Table 1 shows the power groups and probabilities that each load belongs to each group. These probabilities are estimated by considering the average usage of appliances in each group. This provides a more realistic load power distribution than a uniform distribution. The load types (constant, frequency-selective, time-varying commuted, time-varying harmonic) are randomly generated with equal probability. The resonant frequency f 0 = ω 0 2π is generated over the range [300 kHz, 30 MHz]. The quality factor Q f has a uniform distribution over [5,25]. For time-varying harmonic loads, the phase φ has a uniform distribution over [0, π]. For time-varying commuted loads, T has a uniform distribution over [0, M/4], and D has a uniform distribution over [0, M/2 − T]. For both harmonic and commuted time-varying loads, Z B is a frequency-selective load with randomly chosen P r , f 0 , Q f as described earlier.
The wires between the service panel and junction boxes are randomly generated with radii in the range [1.03, 2.06] mm (corresponding to wire gauge size between 12 and 6) and have length in the range [5,20] m. The wires in the rooms are randomly generated with radii in the range [0.81, 1.29] mm (corresponding to wire gauge size between 10 and 14) and have length in the range [2,5] The model network is solved over a frequency range from 300 kHz to 30 MHz, with a frequency resolution of 300 kHz. In the time-domain, M = 20; however, both time-varying loads are periodic with a period half the mains period, so solving the network over a discrete time t between 0 and 1/2 of the mains period is sufficient.
The network topology used here is shown in Figure 2 and has four rooms, each with four outlets. Two of the rooms have star connections, and the other two have bus connections. The transmitter and receiver are connected to outlets in different rooms, and all outlets except those containing the transmitter and receiver have loads. The loads are labeled Lx-y, where x is a number corresponding to the room and y designates the load number in the room.

Modeling
Several mathematical tools are used here for modeling a network frequency response, and a summary of each is given in this section.

Network Simulation
The channel transfer function H is calculated from a fully defined network using two-port network theory [24,35]. The channel transfer function is defined as the ratio of the transmitter voltage V trans and the receiver voltage V rec .
The transfer function is computed using ABCD matrices. The matrix elements corresponding to a cable are calculated from Z 0 and γ as: These components are computed for each cable and load, and are then combined for all the loads and wires between the transmitter and receiver. The channel transfer function (in decibels) is then calculated using (13) [16].
where A , B , C , D are the matrix elements of the combined ABCD matrix, R trans is the transmitter impedance, and R rec is the receiver impedance. See the Appendix A for more details on how the transfer function is computed using two-port networks. The network solver was written in Python and interfaced with the Bayesian inference code. The solver was verified against the code of Cañete et al. [9] for a simplified network.

Bayesian Inference
We employ a probabilistic framework to estimate model parameters based on data [21]. Specifically, we cast the inference in a Bayesian framework, which helps update our prior knowledge about the model parameters by comparing the model predictions with the available data. From the Bayes rule: where d ∈ R N d is the observed data, θ is the set of model parameters, and p( ) designates a probability. p(θ) is called the prior probability and encodes any prior knowledge, including expert opinion, one might have about the model in the context of a particular application. p(d|θ), called the likelihood, measures the agreement between the model and the data for select parameter values. Close agreement between the model and the data will result in higher values for the posterior p(θ|d). For this work, the prior was set to a uniform distribution between the bounds set for the model parameters. We employ a Gaussian approximation for the likelihood: where y ∈ R N d is the output of the proposed, and N (µ, σ 2 ) denotes a normal distribution with mean µ and variance σ 2 . Since σ is unknown, it is inferred as one of the parameters.
To apply Bayesian inference, parameter samples must be drawn from the prior distribution and the likelihood evaluated via (15). If the samples are drawn randomly from the prior distribution, it can take a prohibitively large number of samples to reasonably approximate the posterior, unless the number of parameters N θ is very small. Instead, Markov chain Monte Carlo (MCMC) methods are generally used to draw samples from the distribution of p(d|θ). MCMC methods take a sample and propose a modification to the sample, then estimate the likelihood at the proposed point. The proposed sample can be accepted or rejected depending on the posterior value at the new location. Therefore, samples are more likely be drawn to regions of parameter space with high posterior values. For details, see Refs. [21,36].
There are a variety of MCMC methods that could be used. Single chain methods such as the Metropolis-Hastings method [37], adaptive Metropolis (AM) [38], or delayed rejection adaptive Metropolis (DRAM) [39] update a single sample many times, making a single chain. This work uses transitional Markov chain Monte Carlo (TMCMC), which has been shown to be more robust for high dimensional parameter spaces than single chain methods [22,40,41]. TMCMC uses a series of intermediate distributions, given in (16), to transition a set of samples from the prior to the posterior distribution: β j is a parameter that is increased from 0 to 1 to transition the samples gradually from the prior to the posterior. This gradual change prevents the samples from immediately clustering in locally favorable regions without exploring a larger region of parameter space. For the PLC networks presented here, the parameters θ correspond to the wire length l, wire radius r w , load power P r , resonance frequency f 0 , phase φ, load parameters D and T, and the standard deviation σ for the discrepancy between the model and the data. There are additional network parameters Q f , but a sensitivity analysis showed that they had marginal effects on the transfer function, so they are ignored when performing Bayesian inference to keep the parameter space lower dimensional. The load types can also be considered parameters, but they only take on discrete values and are not directly calibrated with TMCMC. Therefore, the load types are considered a separate set of discrete parameters Θ and the parameters that take on continuous values are the continuous parameters θ. The continuous parameters have a direct dependence on the discrete parameters, so θ(Θ). Furthermore, the number of continuous parameters θ is dependent on the discrete parameters Θ because a load will have a different number of continuous parameters depending on the load type (constant loads have one continuous parameter, frequency selective loads have two, time-varying harmonic loads have three, and time-varying commuted loads have four).
Because of the dependence of continuous parameters on discrete parameters, θ(Θ), for this work we will infer θ for fixed Θ choices. The Bayesian inference update rule, in (14), becomes: For simplicity, and since we are not comparing models across different sets of Θ, the dependence of continuous parameters on the discrete parameters is left out of the notation in this work.

Sensitivity Analysis
We employ Global Sensitivity Analysis (GSA) to estimate how the change in network modeling output can be apportioned to changes in the continuous parameters θ. With the focus on statistical model calibration, we employ variance-decomposition methods where the total output variance is decomposed into fractions associated with the model parameters and their interactions.
The effects of input parameters θ = {θ 1 , . . . , θ N θ } and their interactions on a model output M, are quantified through Sobol indices [42][43][44]. The first order Sobol indices are given by: where is the expectation with respect to θ ∼i , and Var θ i [·] is the variance with respect to θ i . Note that, in this context, subscript i can denote one parameter or a group of parameters. The first-order Sobol indices estimate the fractional contribution of each input parameter θ i to the total variance of the model output. Sobol indices are computed using UQ Toolkit [45], an open-source library for uncertainty quantification. The UQ Toolkit estimates Sobol indices using Monte Carlo (MC) algorithms proposed by Ref. [46].

Sensitivity-Informed Bayesian Inference
The network topologies in this work generally have over 50 continuous parameters θ that could be optimized, but a 50 parameter space is too high-dimensional to directly pursue a statistical inference study. Many of the parameters are for loads and wires far from the main path between the transmitter and receiver, and likely have an negligible effect on the transfer function. The objective of sensitivity-informed Bayesian inference is to perform Bayesian inference only on parameters that have a significant effect on the transfer function. We use Sobol indices to calculate how sensitive the frequency response is to each continuous parameter and only perform Bayesian inference on the d t most sensitive parameters. For a network with randomly generated discrete parameters Θ, Sobol indices are computed for all continuous parameters θ, then Bayesian inference is performed on a subset of θ designated as the trainable parameters θ t ⊂ θ, consisting of the d t most sensitive parameters.
Sensitivity-informed Bayesian inference can only calibrate θ for assumed Θ. Therefore, random search methods are applied to identify the discrete parameters Θ in a threepart process. First, sensitivity-informed Bayesian inference is applied to N r 1 different realizations of the network, each with different randomly generated Θ. The different realizations are independent and can be run in parallel to speed up the computational time. In the second part, further refinements are made with a random greedy search on the discrete parameters. For N r 2 iterations, a single discrete parameter Θ j is randomly regenerated, and Bayesian inference is performed only on the dependent parameters θ(Θ j ). Every third iteration, three random parameters θ are selected and inference is performed on them to ensure that the wires length and radius are also updated, as those parameters are not dependent on discrete parameters Θ. At each stage, the new model is saved only if it is more accurate than the previous model, making it a greedy algorithm. In the third part, a single run of sensitivity-informed Bayesian inference is performed to ensure the most sensitive parameters are calibrated and to obtain an accurate sample distribution for error analysis.
To compare different calibrated models, the continuous ranked probability score (CRPS) is used [47][48][49]. CRPS provides an aggregate value expressing a prediction error for probabilistic outputs. The advantage of the CRPS over simply using the standard deviation is that it accounts for non-Gaussian distributions. See Ref. [49] for details.
The full algorithm for calibrating the network model is given in Algorithm 1. To summarize Algorithm 1, in the initial inference step, sensitivity-informed Bayesian inference is run on a number of randomly generated network realizations. In the random search step, the most accurate model is repetitively modified, and Bayesian inference is applied only to a few parameters. In the final refinement step, sensitivity-informed Bayesian inference is applied to the most accurate model to generate parameter samples from the posterior distribution for accurate error analysis. The model calibration is all done in an offline stage, where computational resources are not a significant barrier. Model predictions of the calibrated model can then be made in an online stage very quickly.

Predictive Assessment
We will employ both pushed-forward distributions and Bayesian posterior-predictive distributions [50] to assess the predictive skill of the transfer function. The pushed-forward distribution is the model distribution computed directly from the samples, and the Bayesian posterior-predictive distribution is the model distribution inclusive of the statistical discrepancy between the model and the data (and therefore accounts for model form and/or measurements errors). The process used to generate push-forward posterior estimates is formalized as: Here, d (pf) denotes hypothetical data and p pf (d (pf) |d) denotes the push-forward probability density of the hypothetical data d (pf) conditioned on the observed data d. We start with samples from the posterior distribution p(θ|d). These samples are readily available from the TMCMC exploration of the parameter space. Using these samples, we evaluate the transfer function H and collect the resulting d (pf) = H(θ) samples that correspond to the push-forward posterior distribution p pf (d (pf) |d).
The pushed-forward posterior does not account for the discrepancy between the data d and the model predictions, subsumed into the definition of the likelihood function in (15). The Bayesian posterior-predictive distribution, defined in (20), is computed by marginalization of the likelihood over the posterior distribution of model parameters θ: In practice, we estimate p pp y (pp) |d through sampling. The sampling workflow is similar to the one shown in (19). After the model evaluations d = H(θ) are completed, we add random noise consistent with the likelihood model settings presented in (15). The resulting samples are used to compute summary statistics of p pp d (pp) |d . For example, the modeled transfer function is the mean of p pp d (pp) |d .

Results
All results shown here are for the network topology shown in Figure 2. Bayesian inference is done using the network frequency response for four combinations of the transmitter and receiver resistances at 20 Ω and 100 Ω. The frequency response computed at the different resistances and different times is concatenated to form the model output. To test the inference methodology, it is more appropriate to use simulation data rather than actual measurements in order that the parameters of the model are known exactly. Therefore, the "true" network transfer function is calculated with a set of randomly generated Θ and θ. Noise is added to the "true" transfer function to simulate experimental measurement noise. The noise is zero mean Gaussian noise with a standard deviation on 0.1 dB. The resulting data d is then used to calibrate the network model.
There are several algorithm parameters that must be determined, namely N s 1 , N s 2 , N r 1 , N r 2 , and d t . A convergence study was done to ensure that a sufficient number of samples is used in TMCMC and that a sufficient number of parameters are optimized during the Bayesian inference. The convergence study found that using N s 1 = 30, 000, N s 2 = 300 and optimizing the d t = 20 most sensitive parameters resulted in accurate inferences. The number of realizations were set at N r 1 = 10 and N r 2 = 200. A larger number of realizations did not improve the model. For computing Sobol indices, 3000 samples were drawn for each dimension of parameter space. For brevity the convergence study is not shown here. All results shown are from the final refinement step of Algorithm 1.
This section is organized as follows: Section 4.1 examines Sobol indices of parameters from a baseline case, and Section 4.2 examines marginal distributions in the parameter space. Section 4.3 shows results on the baseline case and Section 4.4 shows results in an extrapolative scenario. Section 4.5 shows results from 30 different realizations to show how reliably network parameters are inferred.

Sensitivity Analysis
The Sobol indices are computed in the final refinement step of Algorithm 1 to identify the most sensitive parameters. Table 2 shows the 20 most sensitive parameters for the calibrated model of a single network realization along with the corresponding Sobol indices. The two most sensitive parameters are the load power and resonance frequency of load L4-3, which is the only load directly adjacent to the main path between transmitter and receiver. All other parameters have Sobol indices below 0.021, but the 20 most sensitive parameters have Sobol indices about 0.005. Note that parameters with a low Sobol index can still be important to the model. Sobol indices measure the importance of each parameter to the overall output variance, and the L4-3 load power is most important in determining the magnitude of the transfer function, i.e., changing load power will change the magnitude, but not significantly affect the shape of the transfer function. However, the other parameters need to be calibrated to accurately model the frequency dependence of the transfer function. Table 2. Parameters in Figure 3. The node names are from Figure 2.

Marginal Parameter Distributions
The Bayesian methods used to calibrate the PLC model are equipped to deal with heterogeneous sources of uncertainty, in this case from our modeling assumptions, i.e., model and parametric uncertainties, as well as experimental errors. The samples drawn from the posterior can be used to examine the correlations between model parameters and the impact of modeling assumptions on the predictive capabilities of the model. Here, the samples are used to compute marginal density functions using kernel density estimation [51,52] with a Gaussian kernel.
The 1D and 2D marginal density functions are shown in Figure 3 for the 20 most sensitive parameters listed in Table 2. The logarithm of σ is included as the final parameter. The plots on the diagonal of Figure 3 show the 1D density functions normalized such that the values are between 0 and 1. The off-diagonal plots show the 2D joint marginals for pairs of parameters. Some of the off-diagonal plots exhibit a skewed marginal density, suggesting some level of correlation for the corresponding pair of parameters. For example, parameters 4 and 15 (fifteenth row, fourth column) show a negative correlation. These parameters are wire lengths of adjacent wires, so it is not surprising that they are correlated. There are also correlations between the load power and frequency of load L4-3 (second row, first column), as well as several wires connected to the service panel (θ 6 , θ 8 ). Uncertainty quantification methods often assume uncorrelated parameter distributions, which sometimes leads to summary statistics that are wider than expected. Accounting for correlations or dependencies between model parameters in a Bayesian framework will lead to more accurate error bounds.  Figure 4 shows the modeled transfer function computed from the calibrated model at R rec = R trans = 100 Ω. The error bounds shown are the 95% confidence intervals computed from the standard deviation of p pp y (pp) |d . The calibrated model reproduces the true data reasonably well despite being trained on the noisy data. Slight discrepancies are observed at approximately 15 MHz, but the modeled transfer function generally captures the peak frequencies accurately.   It is of interest to know how accurate the calibrated model is at identifying load and wire parameters. Table 3 shows the load types and several load parameters (load power and resonance frequency) for the true network and calibrated model. The model correctly inferred the load type of L4-3 and accurately identified the load power and resonance frequency for that load. However, the load types are less accurately identified for most other loads. Note that load L4-3 is the only load directly adjacent to the main path between the transmitter and receiver. Overall, eight of the fourteen (57%) loads types are modeled correctly, but if the model randomly chose load types, only 25% of them would be correct. The load parameters of most loads are highly inaccurate as a result of the load types being inaccurate. A similar comparison of wire radii and wire lengths is shown in Table 4. The error shown is normalized by the total range, shown in Equation (21). This is done because the wires can have different lengths depending on the position in the network, as explained in Section 2.3. It should be noted that the %Error r can vary between 0% and 100%, and has a mean of 33.3% if both the true and model values are chosen randomly. The error normalized by the range therefore gives more information on whether parameters are actually inferred than error normalized by the true value. Table 4 shows that for most wires, the model is poor at identifying both radius and length. However, the model identifies the wire length with reasonable accuracy for four wires connected to the service panel. This is a high dimensional system and with noisy training data, so it is unsurprising that the model incorrectly identifies many of the parameters.

Model Predictions at Other Resistances
A more rigorous test of the calibrated model is determining its accuracy when extrapolating. The calibrated network model is physics-based, meaning the model uses equations based on the underlying physics of the network topology and TL theory. Physics-based models have been found to perform better than basic data-fitting methods in many physical science applications [53]. One benefit of physics-based models is that they generally perform better when the inputs are outside the range of the training data, whereas non-physics-based data fitting methods such as machine learning generally give highly inaccurate predictions under extrapolative scenarios.
In this section, the calibrated model is tested with different transmitter and receiver resistances in extrapolative scenarios. In Figure 6, the transfer function of the inferred model is calculated with four transmitter and receiver resistance combinations at resistances of 10 Ω and 400 Ω. Despite being trained on transmitter/receiver resistances at 20 Ω and 100 Ω, the model is very accurate at the other resistances. Slight inaccuracies are observed in the 15-20 MHz range, particularly for the R trans = 400 Ω, R rec = 10 Ω case. However, the model generally gives very accurate predictions within the model error bounds. For comparison, the transfer functions for four cases when the transmitter and receiver resistances are equal are plotted together in Figure 7. A non-physics-based model would have to predict the transfer function at R trans = R rec = 10 and R trans = R rec = 400 based on the other two transfer functions, whereas the model presented here can use data when the transmitter and receiver resistances are not equal to improve the model. Note that there are many frequencies when trends between different resistances are difficult or impossible to predict just based on the 20 Ω and 100 Ω cases; for example at 3 MHz, the transfer function at R trans = R rec = 100 Ω and 20 Ω are almost equal, but is lower at 400 Ω and 10 Ω. This trend would be impossible to predict using non-physics-based models with data only at 20 Ω and 100 Ω, but is accurately predicted using the calibrated model.

Statistical Analysis of Multiple Network Realizations
The results shown previously all use a single realization of the network to analyze how accurately the model parameters are inferred. Here calibrations are done for 30 different realizations of the network to analyze statistically how accurately the calibration infers network parameters. Figure 8 shows the true and modeled transfer function at R trans = R rec = 100 Ω for the first ten realizations. In each realization, the modeled transfer function accurately reproduces the true transfer function, showing that the calibration method is robust. From the earlier analysis of a single realization, the only load that was accurately inferred was load L4-3, so it is of particular interest to know how well the network infers that load. Table 5 shows the load type, power, and resonance frequency of load L4-3 for the first 15 realizations (for brevity only 15 are shown). Most load type and resonance frequencies are estimated well by the model. The modeled load power is less accurately inferred, but is always within a factor of 8×. For reference, the load power can vary by a factor of 1000×, so load power is reasonably well inferred. Out of the 30 realizations, the load type was correctly inferred in 23 of them (77%).  Table 6 shows the %Error r for wire radius and wire length averaged over the 30 realizations. The 95% confidence intervals of %Error r are ±8% for both wire radius and wire length, calculated from the standard deviation of %Error r and averaging the confidence intervals over every wire. Table 6 indicates that the calibration is able to infer information about wire radius only for wires connected to the service panel. However, information on wire length is inferred for wires connected to the service panel, transmitter, and receiver. The average %Error r for all wires except those connected to the service panel, transmitter, and receiver is 32.3% for wire radius and 28.6% for wire length. Since %Error r is 33.3% for random guessing, this indicates that the method presented here is unable to infer wire radius for any of the wires not connected to the service panel, transmitter, and receiver, and is not consistently inferring wire length data for those wires.
Though the calibration is able to infer wire parameters of the wires connected to the service panel, transmitter, and receiver, they are less accurate than parameters of load L4-3. This is likely because there are unobserved latent variables in the network that have more control over the transfer function. The wires are likely closely coupled throughout the network in such a way that the network transfer function is accurate even though individual wires may not be inferred accurately. For example, the wires connecting the service panel to nodes JB1 and JB4 are both on the main path between transmitter and receiver, so it is possible that for certain realizations, the combined length of those two wires is more important than the individual lengths. The average absolute error |True − Model| for the length of wire SP-JB1 is 3.9 m and the average error for wire SP-JB4 is 3.6 m. However, the average error when the wire lengths are added is 3.7 m, even though the combined wire length has a larger range of 10-40 m instead of 5-20 m for the single wires, and therefore the error on the sum of the wire lengths would be larger if the wire lengths are independent.
This analysis has shown that the calibrated model can infer some network parameters, mainly those on the main path between transmitter and receiver. Despite inaccuracies in the inferred parameters, the calibrated model accurately computes the transfer function for the network even under extrapolative scenarios.

Conclusions
Bayesian inference is used to calibrate a home network PLC model when the loads and wires are unknown. The model uses TL theory to calculate the network transfer function, resulting in a physics-based model similar to bottom-up models, but that can be calibrated to specific networks. The calibration uses Sobol indices to select the most sensitive parameters and then applies Bayesian inference on the selected parameters. Further refinements to calibrate the load and wire types are done through a greedy random search that regenerates individual load types and applies Bayesian inference on the dependent continuous parameters.
The resulting model can accurately reproduce the true transfer function despite being trained on noisy data. The model is also shown to be accurate in extrapolative scenarios. The calibrated model is most accurate at inferring the load closest to the main path between transmitter and receiver, and it can also infer some knowledge of wire parameters also on the main path. However, many of the other model parameters are inaccurate due to several reasons. The model is high dimensional, containing 15 loads and 20 wires, each of which contain one to five parameters. The training data is a vector of dimension 4000, but most of the points are highly correlated, so the training data can be considered fairly low dimensional. As a result, there are likely many model parameter combinations that satisfy or nearly satisfy the training data. In addition, the training data has noise, so even the correct parameters from the true model may not be optimal for the training data. Finally, the sensitivity-informed Bayesian inference method presented here can only identify continuous parameters accurately, but the load types (discrete parameters) are identified through random searches which are unlikely to find the true load and wire types.
There are a number of possible directions for future works to take. Future works could target improving the calibration process, as the method presented here has shortcomings; in particular, the method of identifying load types could use improvement. Future works could also use more realistic network topology models as well as more complex loads. The wire model used here is very simple, so more complex wire models with discrete valued parameters could be used. More complex PLC models could be used that incorporate noise and signal filtering. In addition, models could incorporate signal couplers between low voltage and medium voltage networks. Future studies could use the calibrated model to examine the effects of disturbances in the power grid such as voltage dips and brief power outages. Future works could also target gaining more information from the network to improve the model, and developing models when the network topology is only partially known. Future work is planned to take measurements and use the method presented here to develop a model. Funding: This work was supported by Defense Nuclear Nonproliferation Research and Development. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA-0003525. This paper describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the U.S. Department of Energy or the United States Government.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.
A n B n C n D n V n−1 I n−1 (A2) A node connecting to multiple other nodes containing ABCD matrices ( Figure A2) can be combined into the main path: Figure A2. Schematic of combining ABCD matrices in parallel.
A series load with impedance Z can be expressed as a two-port network with matrix: