Capacitated Multicommodity Flow Problem for Heterogeneous Smart Electricity Metering Communications Using Column Generation

Abstract: This paper addresses the planning and deployment of wireless heterogeneous networks (WHNs) for smart metering, based on a cross-layer solution. We combine the constraints of the network layer, which considers routing and flow demands at each link in the WHN, with the restrictions of the physical layer, namely the capacity of a short range technology when used in a multi-hop fashion. We propose a model based on a column generation approach to solve the capacitated multicommodity flow problem (CMCF); the model includes wireless link capacities, coverage, and cost. The work integrates the multi-hop routing of packets in a mesh network formed by smart meters and concentrators connected to a cellular network via base stations. The traffic of each link is represented in a multigraph with the occupation percentage, and we build a scalable routing tree on a georeferenced map to represent a real deployment. The results describe the behavior of the proposed model in terms of the traffic load per concentrator, the network coverage, and the reduction of energy consumption. We demonstrate that an infrastructure cost reduction is achieved with the inclusion of multi-hop short range technology, which reduces the number of smart meters that require a direct connection to cellular technology. The model guarantees 100% coverage of the smart meters analyzed in each scenario. The calculation time of the CMCF for advanced metering infrastructure (CMCF-AMI) based on the column generation algorithm is reduced by 10% as the population increases, which is the expected return when the population is considerable.

In the context of heterogeneous networking for AMI, we present a network model that employs different wireless technologies for short range connectivity, i.e., for enabling the smart meters (SM) to relay packets in a multi-hop fashion toward a concentrator, which in turn connects to a cellular network for communications to/from the utility. The smart meter may incorporate different technologies such as WiFi, LoRa, and IEEE 802.15.4. In this way, the device chooses a short range technology to connect and relay packets from other smart meters using multi-hop routing. In addition, selected smart meters also play the role of concentrators, so that they become universal data aggregation points (UDAP), relaying the aggregated data to a base station via a wide range technology (e.g., 3G, 4G, 4.5G, and 5G).
The hybridization of technologies helps reduce the costs of the AMI deployment and operation since the data transmission via a short range technology is usually less expensive than a wide range technology such as the cellular network. Thus, this work seeks to reduce the cost of the use of the wide range technology by leveraging the use of multi-hop communications; the proposed mechanism guarantees coverage and considers the capacity restrictions of the heterogeneous radio access network [6][7][8][9].
To find an optimization model that solves the problem of planning and deploying WHN for smart metering, here we propose a cross-layer solution. We combine the constraints of the network layer, including the flow demands at each link and the routing of flows over the WHN, with the restrictions of the physical layer, namely the capacity of a short range technology when used in a multi-hop fashion. The problem corresponds to the capacitated multicommodity flow problem, which is addressed through a heuristic approach based on column generation. This problem has been characterized in other works as a combinatorial NP-complete optimization problem [10,11]. Our proposed methodology is named capacitated multicommodity flow for advanced metering infrastructure (CMCF-AMI).
Previous works employed optimization problems to model a system that guarantees the coverage of smart meters in a heterogeneous network or to reduce the energy consumption and latency of end-to-end communications [12][13][14][15][16]. However, such works did not take into consideration the capacity of the wireless links, especially when restricted short range technologies were in use. Moreover, the existent works are rigid in terms of the location of the data aggregation points, even for the different solutions that algorithms like K-means or K-medoids deliver. Therefore, the contributions of this paper are the following:
• We provide a scalable solution that considers a growing population of smart meters, in a way that the network may be flexible regarding the future locations of the nodes selected to be UDAPs;
• We provide a model that takes into account both the wireless links capacities and the capacity demands from the active AMI traffic flows; and
• We demonstrate that it is possible to reduce infrastructure costs in the heterogeneous smart metering network with the proposed CMCF-AMI methodology for a neighborhood area network (NAN).
The remainder of this article is organized as follows: Section 2 presents the related work. In Section 3, we formalize the problem and present our proposed CMCF-AMI heuristic solution. Section 4 presents the results and provides the performance analysis. Finally, Section 5 provides concluding remarks.

Related Work
The study of communication networks for the AMI has taken great importance in the scientific field due to the need to reduce deployment and operational costs. Along that line, wireless networks have been proposed to provide communications for electric power smart metering devices [17]. Moreover, WHNs are a good candidate due to their rapid deployment and extended coverage. However, when different wireless technologies are combined within the same AMI, finding the optimal combination of a reduced-cost deployment may form a combinatorial problem that is NP-complete.
Given small AMI networks, with a small number of smart meters (SMs) and concentrators (UDAPs) (e.g., 10 to 30 nodes), the optimization problem remains convex and can be solved using linear programming. In [18], the authors proposed solutions that required reduced computational and time resources, aiming to achieve an optimal routing considering a set of feasible links placed in a small mesh network; such a scenario does not require a column generation algorithm or heuristic. Similar situations were shown in [19,20], which could also consider multi-hop routing and resource allocation problems. Other optimization techniques have been explored, including scheduling, routing, and matching problems [7].
However, when the AMI network requires scalability, in particular with respect to the gradual increase in the number of SMs and UDAPs, the optimization problem becomes non-convex. In this situation, the column generation method provides appropriate solutions to the optimization problem, which can be further enhanced if the scenario uses georeferenced locations with actual distances, i.e., employing the haversine distance.
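The haversine distance mentioned above can be illustrated with a minimal sketch. The function name, the Earth radius constant, and the sample coordinates are assumptions made for illustration and are not taken from the original implementation.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters (assumed constant)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Two hypothetical smart meter positions a short distance apart (illustrative coordinates).
d = haversine_m(6.2450, -75.5950, 6.2452, -75.5950)
```

A distance computed this way can then be compared against the coverage radius of each technology to decide whether a link between two georeferenced nodes is feasible.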
Works addressing the optimization of WHN in AMI networks were presented in [12,13], which described a clustering scheme and the subsequent routing to guarantee the coverage of the smart meters; however, only the capacity of the concentrator was considered in the optimization problem. Other theoretical models have been proposed to consider additional restrictions on the deployment of a wireless network and have verified the behavior of the model in terms of performance, but have not been tested in realistic scenarios using the georeferenced information of smart meters [21,22].
In the general field of routing in wireless networks, some researchers have proposed the use of the column generation method. This method creates columns in a first stage to solve the sub-problem of finding the shortest route and looks for alternative routes with a certain cost; then, it proceeds to verify the viability of the proposed routes in terms of traffic load and other restrictions that could be introduced in the model for the deployment of wireless heterogeneous networks. Previous works along that line were presented in [23][24][25][26][27]. The literature points out that such a technique surpasses the Tabu search method in terms of the reliability of the solution and computational cost [28]. Accordingly, the column generation method has been employed in a network of flows [29,30], where each link had a defined capacity and the resources assigned were defined from a source node to a destination node, according to a specific demand. A fraction of the information flow could be routed through another link existing in the original topology of the wireless network; therefore, there may be multiple paths that could be used to carry the total or a fraction of the information flows. Some of the restrictions employed in the literature are link capacity, conservation of the flow through the feasible nodes of transit, and conservation of the flow of the input node equal to the flow of the exit node [31][32][33][34].
In the smart metering context supported by a WHN, the optimal routing of the information depends on the capabilities of each wireless link, where the link capacities are determined by the allocation of resources depending on the communications technology. Optimal network performance can only be achieved through simultaneous optimization of routing and resource allocation [35][36][37][38]. The literature establishes that column generation applied to the capacitated multicommodity flow problem achieves the optimal coordination of data routing in the network layer and resource allocation in the physical layer by pricing the link capacities, thus providing a cross-layer solution [39][40][41].
An initial solution that addressed the scalability feature of a multi-hop AMI network was presented in [6], which considered a growing population of SMs, a restricted capacity and chargeability in the UDAPs, as well as the target coverage of the network. However, the authors did not consider the wireless link capacity in the problem, which made it difficult to assess the applicability of restricted wireless technologies to achieve a reduced deployment cost. Furthermore, with a large population of smart meters, it becomes necessary to provide the re-routing of information any time a link becomes congested or when it fails.
Different from previous contributions [42][43][44], the current work focuses on the planning of a WHN in a real scenario, including restrictions related to the wireless technology capacity and the flexibility to incorporate different wireless technologies. This proposal also finds a set of routes that could link each smart meter, based on the chargeability, the flows within the routing tree, and the alternatives to minimize the total cost of information transported by the paths that exist in the feasible set of solutions; all based on a real georeferenced scenario. Table 1 summarizes the comparison of the proposed CMCF-AMI solution with the previous works. The novel contribution of CMCF-AMI is to establish a multihop WHN that uses cross-layer information to determine the network layer routing, involving the decision aspects such as wireless link capacity, traffic flow demands, and energy consumption of the wireless nodes.

Problem Definition and Proposed Solution
In previous works, the disadvantages of using a single type of technology to provide communications to the AMI network were established [6,46]; therefore, it is important to find a hybrid solution that allows improving the connectivity reliability of the smart meters (SMs) through the deployment of a WHN. In this section, we work with a WHN for the collection of energy consumption measurements and propose a heuristic model to optimize the number and locations of concentrators of traffic (i.e., the UDAPs) in an attempt to provide a planning mechanism that is flexible, scalable, and feasible to be employed in a real deployment.
The present work is based on the column generation algorithm, which is commonly used in mixed integer linear programming (MILP) problems. The algorithm first defines a main problem with the corresponding constraints; this problem contains a reduced number of variables and is solved with a simplex method, and new variables enter the basis as they are generated. Then, the algorithm finds a column with a negative reduced cost to be added to the main problem. The process iterates until there are no new columns to be added to the main problem.
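The step that looks for a column with a negative cost can be sketched for a path-based flow problem. This is a minimal illustration under stated assumptions, not the authors' implementation: each column is a candidate path, `link_price` holds non-negative prices for the capacity constraints, `demand_dual` is the dual value of the demand constraint of one flow, and all numeric values are made up.

```python
def path_reduced_cost(path, link_cost, link_price, demand_dual):
    """Reduced cost of a candidate path: link costs plus capacity prices, minus the demand dual."""
    return sum(link_cost[e] + link_price.get(e, 0.0) for e in path) - demand_dual


def price_out(candidates, link_cost, link_price, demand_dual, tol=1e-9):
    """Return (path, reduced cost) for the most negative candidate, or None if none improves."""
    best = min(candidates, key=lambda p: path_reduced_cost(p, link_cost, link_price, demand_dual))
    rc = path_reduced_cost(best, link_cost, link_price, demand_dual)
    return (best, rc) if rc < -tol else None


# Hypothetical instance: meter "a" reaches concentrator "u" directly or through "b".
link_cost = {("a", "u"): 1.0, ("a", "b"): 1.0, ("b", "u"): 1.0}
link_price = {("a", "u"): 3.0}   # the direct link is saturated, so its capacity has a price
demand_dual = 4.0                # current marginal cost of serving the flow from "a"
candidates = [(("a", "u"),), (("a", "b"), ("b", "u"))]

new_column = price_out(candidates, link_cost, link_price, demand_dual)
```

In this toy instance the two-hop path has a negative reduced cost and would be added to the main problem; when `price_out` returns `None`, the iterations stop.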
Following the aforementioned methodology, we first specify the notation of the variables and the parameters employed in the formulation of the problem. Each parameter is detailed in Table 2. The objective function, which seeks to minimize the total cost of the capacitated multicommodity flow problem (CMCF-AMI), is composed of three global summations that involve link capacities, traffic flows, and paths, as defined by Equation (1). The elements taken into account are the link costs C_e, the flow i through the path l, and the path of the flow i traversing the link e.
The objective function is subject to:
• The sum of flows i through the path l must be equal to ϕ_i, which is described by Equation (2).
• The sum of all flows routed over a link should not exceed the link's capacity, which is described by Equation (3).

Table 2. Notations of the problem formulation.
Variable    Definition
ϕ_i         The sum of flows i in the path
Cap_e       Link capacity

The heuristic based on the column generation algorithm allows us to find the use of the link and its occupation as a percentage. To find the balance between deployment costs and the coverage of smart meters, it is important to take into account the information flows that will travel in each link of the network; consequently, the resulting non-linearity makes the WHN design non-trivial.
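The objective and the two constraints described in words above can be written as a path-based formulation. The following is a sketch consistent with that description rather than a reproduction of Equations (1) to (3): the symbols C_e, ϕ_i, and Cap_e come from the text, whereas the flow variable x_i^l, the link-path indicator δ, and the sets E, I, and L_i are notation assumed here.

```latex
\begin{align}
\min \; & \sum_{e \in E} \sum_{i \in I} \sum_{l \in L_i} C_e \, \delta_{e}^{i,l} \, x_i^{l} \\
\text{s.t.} \; & \sum_{l \in L_i} x_i^{l} = \phi_i, \qquad \forall i \in I \\
& \sum_{i \in I} \sum_{l \in L_i} \delta_{e}^{i,l} \, x_i^{l} \le Cap_e, \qquad \forall e \in E \\
& x_i^{l} \ge 0, \qquad \forall i \in I, \; l \in L_i
\end{align}
```

Here x_i^l is the amount of flow i sent over path l, and δ equals 1 when path l of flow i traverses link e and 0 otherwise, so the triple summation charges the cost C_e for every unit of flow that crosses link e.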
We must define the set of combinations of N smart meters connected with a wireless technology, taking into account that the number of available links increases as the population of smart meters grows. To cope with this problem, the column generation method divides the search: it first finds the smart meters that can contribute to the solution by being UDAPs, that is, smart meters that are directly connected to the cellular network and may concentrate the traffic of other smart meters arriving over multi-hop connections using a short range technology. To achieve this task, we designed an algorithm that performs the role of the main observer of the primary problem. This algorithm looks for the smart meters that contribute the most to the solution. After that, the best value found in the primary problem enters a secondary problem, defined as finding the link with the greatest congestion and eliminating it from the possible routing options. Consequently, the original routing is modified, and the primary problem is executed again until the method finds the option with the best flow balance over the proposed routing.
The novelty of this proposal lies in the integration of link capacities and information flows in the planning of the deployment of heterogeneous wireless networks for smart metering. In the proposed heuristic model, the main observer can optimize the balance of flows of each link by testing almost all possible combinations. The model is presented as a multigraph and is illustrated in Figure 1. The secondary problem is no less important because it contributes both to finding the congested nodes and to evaluating whether there is a better route with a more equitable balance of flows. The division strategy, with a main observer and primary and secondary problems, allowed large reductions in the calculation times of the optimization scenario. In fact, the column generation method performs best when the number of variables is very large. In our case, one variable was assigned to each possible route; thus, the number of variables grew factorially with the number of places to connect. The main observer is only concerned with presenting the variables to the primary problem, which is responsible for selecting a subset of variables. Algebraically, this means that the variables are divided into two groups: the generated columns, which are the elements of the primal problem, and the non-generated columns, which form the second group associated with the dual problem. The primal problem of the primary problem only knows the formulation without the non-generated columns; hence, the sub-problem proposes a new variable to add to the formulation, and in this way, the primal problem is modified with the addition of a column to the matrix.
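The growth in the number of route variables can be illustrated by counting the simple paths between two nodes. The sketch below assumes a complete graph, which is a worst case chosen for illustration; a real deployment limited by radio range has fewer feasible routes.

```python
def count_simple_paths(n):
    """Count the simple paths between two fixed nodes of a complete graph with n nodes."""
    def dfs(node, target, visited):
        if node == target:
            return 1
        total = 0
        for nxt in range(n):
            if nxt not in visited:
                total += dfs(nxt, target, visited | {nxt})
        return total

    return dfs(0, n - 1, {0})


# Number of candidate routes for small node counts (one variable per route).
counts = {n: count_simple_paths(n) for n in range(2, 9)}
```

Even at eight nodes the count is already in the thousands for a single source and destination pair, which is why the columns are generated on demand instead of being enumerated up front.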
In addition, a sensitivity analysis is incorporated in the method; in this way, the linear program identifies where the optimization is congested. To implement the analysis, the concept of duality is used: given the optimal dual variables, the primal problem can assess the potential improvement offered by a non-generated variable by observing the cost of the variable and the column of the matrix corresponding to that variable. Then, it is possible to define the variation of the objective value, where x is the primal optimal solution and y is the dual optimal solution of the simplex method.
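The relation described above can be stated with standard linear programming duality for a minimization problem in the form min c^T x subject to Ax = b, x ≥ 0. The symbols c, A, b, A_j, and θ are generic notation assumed here; only x and y are named in the text.

```latex
\begin{align}
\bar{c}_j &= c_j - y^{\top} A_j, \\
\Delta z &= \bar{c}_j \, \theta, \qquad \theta \ge 0, \\
c^{\top} x &= b^{\top} y .
\end{align}
```

The first line is the reduced cost of a non-generated column A_j with cost c_j; the second is the change in the objective value when that column enters the basis at level θ, so only columns with a negative reduced cost can improve the solution; the third is the equality of the primal and dual objective values at optimality.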
The concept of duality indicates that the basic solution is more important in relation to the extreme values. In addition, the multigraph indicates that a scenario is too big to be implemented in its entirety, but a good result can be obtained with a linear program that takes into account the restrictions defined for the problem. In this way, although the simplex method has no knowledge of the entire graph, through the revised simplex function, one can enumerate the neighboring SMs to find the next best SM to be visited. In other words, when the model has visited all the routes and there is no other route to analyze, the solution is close to the optimum.
The variables employed in the heuristic process are presented in Table 3. The CMCF-AMI heuristic process is illustrated in Figure 2 and explained as follows.
The main observer verifies the objective values of the primary problem ϕ (see Algorithm 1). Algorithm 1 employs the revised simplex method to perform the column generation. The secondary problem Ψ (see Algorithm 2) is responsible for testing several topologies to find the congested links and defining other options that are feasible. In addition, the secondary problem requires two additional processes: one responsible for generating the matrices and weights for capacities, costs, distances, and links, presented in Algorithm 3, and the other in charge of delivering the new topologies to the secondary problem, presented in Algorithm 4. The main observer keeps verifying until the objective function meets an acceptable minimum. The details of the operation of each algorithm are the following:
In Algorithm 1, the variable topo is the input. In Step 1, the flows are broken down, identifying the paths of each cluster. In Step 2, in the variable p, the SMs of each group are stored and the link capacities are put into the matrix A. Next, in Step 3, an array of matrix A is initialized with an infinite value in those positions that comply with the constraint of link capacity. Later in Step 4, we generate the necessary matrices that will be input parameters in the revised simplex, which results in the initial feasible solution. Finally, in Step 5, we obtain the output variables.
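The bookkeeping behind the link capacity constraint, and the occupation percentage reported for each link, can be sketched as follows. This is an illustrative sketch and not Algorithm 1 itself: the dictionary layout, the node names, and the capacity value of 10 units are assumptions.

```python
def link_loads(path_flows):
    """Sum the flow carried by every link, given {path (tuple of links): flow}."""
    loads = {}
    for path, flow in path_flows.items():
        for link in path:
            loads[link] = loads.get(link, 0.0) + flow
    return loads


def occupation_percent(loads, capacity):
    """Occupation of each loaded link as a percentage of its capacity."""
    return {link: 100.0 * load / capacity[link] for link, load in loads.items()}


# Hypothetical cluster: meters "a" and "b" send one unit each toward UDAP "u".
path_flows = {
    (("a", "b"), ("b", "u")): 1.0,   # "a" relays through "b"
    (("b", "u"),): 1.0,              # "b" is one hop away from the UDAP
}
capacity = {("a", "b"): 10.0, ("b", "u"): 10.0}

loads = link_loads(path_flows)
occupation = occupation_percent(loads, capacity)
feasible = all(loads[e] <= capacity[e] for e in loads)
```

The link closest to the concentrator accumulates the traffic of every meter routed through it, which is where the capacity restriction of a short range technology becomes binding first.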
In Algorithm 2, the data of the topology obtained in the primary problem is entered. As its output, the algorithm provides a new topology incorporating new SMs. Steps 1 and 2 execute the verification of the costs that are subject to the restrictions of the problem. If the SMs that are incorporated to the topology have costs within the allowed values, the algorithm leaves them, modifying the topology. The new data are delivered to the primary problem to solve and create new values of the objective function, to be later compared within the main observer.
Step 3 returns the output values.

Algorithm 1, Step 5: Return obj, x, y, Base, topo

Algorithm 2 Secondary problem algorithm: Ψ.
Input: y, topo
Output: Aggregate, topo
Data: N = topo.N; E = topo.i; Aggregate = 0; e_cong = max(y) − N; [topo.i(e_cong), topo.j(e_cong)]; costB = topo.costlink;
Step 1:
  for i = 1 : e_cong
    costB(topo.i(e_cong(i)), topo.j(e_cong(i))) = inf
    costB(topo.j(e_cong(i)), topo.i(e_cong(i))) = inf
  endfor

In Algorithm 3, Step 1, the calculation of the distances is stored in a super matrix. Subsequently, the connectivity binary matrix G is constructed as follows: 1 is placed when a link exists; 0 is placed otherwise. The conditions for a link to exist are given by the radio coverage restrictions, for both cellular and WiFi technologies. In addition, the cost matrices for each link are generated. Finally, in Step 2, the output variables are returned.
In Algorithm 4, Step 1, we proceed to generate the multigraph that is derived from Algorithm 2, as well as the information contained in the topo variable. In Step 2, using the Dijkstra algorithm, we obtain the paths that integrate the routes to be physically plotted in the georeferenced map. Finally, in Step 3, the resulting data of the multigraph are delivered using the variables topo and conf.

Figure 2. Flowchart for the CMCF-AMI model or main observer CMCF-AMI.
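The interplay between the secondary problem, which sets the cost of a congested link to infinity, and the Dijkstra step of Algorithm 4 can be sketched as follows. The dictionary-based graph, the helper names, and the three-node topology are assumptions for illustration and do not reproduce the original Matlab code.

```python
import heapq
import math


def dijkstra(cost, source, target):
    """Cheapest path over a {(u, v): cost} dict; links with infinite cost are skipped."""
    adj = {}
    for (u, v), c in cost.items():
        if math.isfinite(c):
            adj.setdefault(u, []).append((v, c))
    dist, prev = {source: 0.0}, {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, c in adj.get(u, []):
            nd = d + c
            if nd < dist.get(v, math.inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path, node = [target], target
    while node != source:
        node = prev[node]
        path.append(node)
    return path[::-1]


def drop_congested(cost, congested):
    """Copy the cost dict with both directions of each congested link set to infinity."""
    blocked = dict(cost)
    for u, v in congested:
        for link in ((u, v), (v, u)):
            if link in blocked:
                blocked[link] = math.inf
    return blocked


# Hypothetical topology: meter "a", relay "b", UDAP "u" (symmetric link costs).
cost = {}
for u, v, c in [("a", "u", 1.0), ("a", "b", 1.0), ("b", "u", 1.0)]:
    cost[(u, v)] = cost[(v, u)] = c

before = dijkstra(cost, "a", "u")
after = dijkstra(drop_congested(cost, [("a", "u")]), "a", "u")
```

Once the direct link is marked as congested, the route from "a" is recomputed through the relay "b", which mirrors how the modified topology is handed back to the primary problem.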

Algorithm 3 Matrix generator: Υ.
Input: topo, conf, radio_WiFi, radio_Cellular
Output: dist, G, costDist, costlink, Cap
Data: n = topo.N;
Step 1:
  for i = 1 : n
    for j = 1 : n
      dist(i, j) = haversine;
      if dist(i, j) ≤ (radio_WiFi || radio_Cellular)
        G(i, j) = 1;
        costDist(i, j) = dist(i, j);
        costlink(i, j) = conf.factor(topo.tipo(i), topo.tipo(j));
      endif
    endfor
  endfor
Step 2: Return dist, G, costDist, costlink, Cap

The CMCF-AMI can be considered a model that incorporates multiple flows and can redistribute the traffic according to the different topologies that have been delivered from the topology generation algorithm; also, the secondary problem eliminates congested links considering the capacity restrictions of the wireless link. In summary, the proposed model had the novelty of being a flexible scheme, which allowed incorporating new topologies or a partial mesh (when new smart meters were activated), together with the generation of additional traffic flows. In the next section, the solution is evaluated in real georeferenced environments and with a scalable population growth.
The use of the column generation algorithm was justified when the number of variables to evaluate was considerable; besides, the heuristic achieved a better performance when the scenario was larger, a condition that previous works tended to relax by using small instances (i.e., fewer than 50 nodes) during the evaluation. In such cases, the global optimum could be reached from a linear programming problem and did not require the use of a heuristic. In this work, the use of the column generation heuristic facilitated the analysis with large populations, while employing acceptable computational times and providing flexibility in the addition of new restrictions that were relevant for the deployment of heterogeneous wireless networks.
The population of smart meters depends on the utilities' need for smart metering. In this sense, the present work incorporated a growth rate in the analyzed scenarios, which could be modified according to the amount of gradual investment and the time for the deployment that a utility could have. The scenario had an area of 0.319 km², with a population of 1024 homes, where the smart meters would be located; this scenario included approximately 34 homes for each neighborhood. Therefore, the experimental model had an incremental growth of 32 smart meters, which could be modified according to the requirements of the utilities, and the population was determined by N = [32, 64, 96, 128, 160, 192]. The scenario included a radial and non-sectorized deployment to achieve greater coverage for each wireless device. The capacities of the wireless devices were determined by C = [10, 20, 100], which were equivalent to the capacity of a one-way smart meter, a smart meter that could behave like a UDAP, and the base station or UDAP. The radio of a cellular base station or UDAP had a 100 m range, and the WiFi radio had a 40 m range. The OpenStreetMap site was used to obtain the OSM file of the scenario, the Laureles neighborhood, Medellín, Colombia, which was linked to Matlab R2019b. A PC with 64 GB of RAM and an Intel Xeon processor at 2.90 GHz was used.
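The construction of the binary connectivity matrix G of Algorithm 3 with the coverage radii stated above can be sketched as follows. The rule used to pick the radius is an assumption, since the listing only states that the distance is compared against the WiFi or cellular radius; the node type labels "sm" and "bs" and the sample distances are also hypothetical.

```python
def connectivity(dist, node_type, radio_wifi=40.0, radio_cellular=100.0):
    """Binary matrix G: G[i][j] = 1 when nodes i and j are within radio range."""
    n = len(dist)
    G = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Assumption: a link touching a base station uses the cellular radius,
            # any other link uses the WiFi radius.
            cellular = "bs" in (node_type[i], node_type[j])
            reach = radio_cellular if cellular else radio_wifi
            if dist[i][j] <= reach:
                G[i][j] = 1
    return G


# Hypothetical layout: two smart meters and one base station (distances in meters).
node_type = ["sm", "sm", "bs"]
dist = [
    [0.0, 35.0, 90.0],
    [35.0, 0.0, 120.0],
    [90.0, 120.0, 0.0],
]
G = connectivity(dist, node_type)
```

In this layout the second meter is out of cellular range but within WiFi range of the first meter, so it can only reach the base station through a multi-hop route.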

Analysis of Results
In this section, we present the results generated by the CMCF-AMI heuristic model. Besides solving the problem of routing in a heterogeneous wireless network, we identify the paths taken by flows of information from the smart meters to the cellular base station. The scenario in the evaluation corresponded to a real map with conglomerates of N installed smart meters (see Figure 3 for an example of N = 192). The N value may change depending on the needs of the electric distribution company. The scenario considered a radial deployment in order to take advantage of the coverage radius of a typical cellular base station. We considered an initial topology with the deployment of both long range (e.g., cellular technology) and short range (e.g., WiFi) networks. As both coverage radii increased, the partial mesh topology and the possible routes through which the information was routed also increased. Figure 4 illustrates the evolution of the topologies. The initial topology for deployment is observed in gray, considering the coverage radius in meters for long and short range technologies, respectively. The figure also shows the resultant links from the feasible topology found for an optimal flow of information after the CMCF-AMI model was completed.
In Figure 5, one can see the occupation of each link with the flows of each of the routes that they took as optimal in the process of transmitting information from the smart meters to the cellular base station. In Figure 6, we illustrate the percentage representing the use of the links.
Table 4. The reference values of consumption in each wireless technology were taken from [47]. Additionally, Table 5 shows the growth of each wireless technology (WiFi) and the chargeability. It should be noted that the cost of the cellular technology has an important difference in relation to the WiFi technology. At first, the model may consider all SMs as candidates to connect directly through cellular technology, but this would represent a high cost. To improve such cost, the model increased the SMs connecting through WiFi technology while decreasing the number of SMs connecting directly through cellular technology; in this way, the cost for the use of the resources, in particular in terms of energy consumption, was reduced. Figure 8 shows the average number of SMs that were connected to another SM taking the role of UDAP. Similarly, it shows the number of UDAPs implemented in each population of SMs. In Table 6, we show the variation in % and the amount of the flow of the link in terms of both usage and occupation. This way, we could verify the changes between each SM population and the average of the flows on the links.
Finally, we provide an evaluation of the performance of the proposed CMCF-AMI heuristic. Table 7 lists the number of iterations required by the model to reach the desired solution for different populations of smart meters. The number of iterations of the first stage corresponded to the search by exploration according to the restriction of the maximum distance allowed as the radius of the cellular base station.
The number of iterations of the first stage was the same for all population sizes since this process was done for each population and according to the population scale. The number of iterations of the second stage varied; but it was possible to appreciate that it did not have an exponential behavior. In fact, when the population of SMs became high (i.e., N > 128), the number of iterations tended to decrease. This was due to the nature of the column generation method, because the higher the population, the fewer iterations the method required, while at the same time, the efficiency improved (in relation to small populations).
The efficiency and performance of the model are presented in Figure 11. We illustrate the growth in the CPU time required for each SM population. The model formed by a main observer enclosed the global time of the model, while the primary problem for the generation of columns had a different time; additionally, the time of the secondary problem in charge of modifying the topology had a constant time due to its minimum incidence in the total time used for each population of SMs. One can observe the time of the primary problem and the secondary problem in relation to the main observer, which entailed a longer time, since it contained the intelligence of the heuristic process. The calculation time of the CMCF-AMI based on the column generation algorithm was reduced by 10% as the population increased, and this was the expected return when the population was considerable. The result was evident when the population grew from 160 to 192 smart meters.

Conclusions
In this paper, we proposed a novel strategy for the deployment of smart metering infrastructure based on wireless heterogeneous networks. We considered the use of multi-hop links with short range wireless technologies in combination with wide area wireless networks to improve the connectivity of smart meters (SMs). The proposed strategy studied the capacitated multicommodity flow problem to formulate an optimized routing solution restricted by the link capacities.
The problem belongs to the NP-hard category. Hence, it was decomposed into sub-problems that the heuristic could handle in polynomial time, even though the overall problem remained NP-complete. We introduced the capacitated multi-commodity flow for advanced metering infrastructure (CMCF-AMI) heuristic, which found a suboptimal solution in polynomial time. It was demonstrated that the model was scalable since it incorporated different population sizes of SMs and allowed the use of multiple wireless technologies. Furthermore, once the population size increased, the computation time of the heuristic decreased due to the particular characteristics of the column generation model. Our results showed that our strategy achieved an efficient solution. It minimized the cost of resources while guaranteeing a reliable network by eliminating the congested links for the final routing in the wireless networks. In addition, it minimized the cost of using a cellular connection by incorporating a low cost short range technology.
Although previous works also reduced the complexity of this problem in wireless heterogeneous networks, most of them only studied the use of clustering methods to form unbalanced conglomerates. In this work, we considered the reliability of the network in the problem design, in a way that the final topology and routing achieved a balance in the distribution of traffic flows over the less congested links. Therefore, we provided a planning and deployment strategy, based on a cross-layer solution, that exploited the advantages of the wireless heterogeneous network for advanced metering infrastructure.
Author Contributions: Conceptualization, methodology, software, resources, validation, and formal analysis, E.I. and R.H.; investigation and writing, original draft preparation, E.I. and S.C.; writing, review and editing, and supervision, R.H. All authors have read and agreed to the published version of the manuscript.