The goal of this study is to establish an overlay network that supports all-to-all hop-by-hop multipath transmission. In this network, the underlay network performs shortest-path routing, while the overlay network performs hop-by-hop multipath routing. To ensure the performance of overlay multipath transmission, two key principles are employed in the design.
3.1. Overlay Node Placement
The underlay network is represented as a graph G = (V, E), where V is the set of underlay routers and E is the set of underlay edges. Assume that each link carries a cost given by its latency and capacity and that the graph is connected. The overlay network is layered on top of the existing underlay network. Compared with traditional IP nodes, overlay nodes are capable of dynamically switching the next-hop overlay node, detecting local congestion, and functioning as access points for users and service providers. The goal of overlay node placement is to select a subset of the underlay node set V and replace it with overlay nodes. Overlay nodes communicate through tunnels. Based on the adjacency relationships between nodes, an overlay topology can be constructed. The cost of constructing the overlay network is influenced by the number of deployed overlay nodes. Consequently, previous studies search for the minimum number of overlay nodes that meets the transmission requirements. In this study, the overlay network performs flexible hop-by-hop routing to meet all-to-all (rather than specific) transmission demands. Therefore, we assume that m overlay nodes are deployed and place them to maximize the performance of multipath transmission on the overlay network. Theoretically, as the number of nodes increases, the benefit of multipath routing improves until it reaches an optimal point.
The placement of the overlay nodes influences the performance and efficiency of the overlay network. According to the max-flow min-cut theorem, the maximum end-to-end bandwidth equals the capacity of the transmission bottleneck (the minimum cut), which grows with the number of disjoint paths available between the endpoints. Overlay routing can utilize multiple disjoint underlying paths, thus improving the transmission bandwidth and reliability. To reduce path overlap in overlay multipath routing and increase the utilization of underlying paths, we analyze the characteristics of the hop-by-hop multipath.
As the underlay network employs shortest-path routing, the underlying links used by an overlay network form the shortest path trees between the overlay nodes. Overlay nodes can alter the next hop in hop-by-hop multipath routing, utilizing underlay links that are not in the shortest path tree rooted at the destination. By bypassing congested or failed paths, multipath routing enhances the overall transmission efficiency. As shown in Figure 2, on the shortest path tree rooted at the destination node, each node can follow the shortest path to reach the destination. Node A, as an overlay node, can dynamically switch among the next-hop overlay nodes B, C, D, and E, with node B on the shortest path and C, D, and E as backups. However, if tunnel (A, C), (A, D), or (A, E) overlaps with tunnel (A, B) (as shown in Figure 2b), the congestion in tunnel (A, B) cannot be completely bypassed, resulting in a transmission bottleneck. When overlay node A is replaced by node F, the overlap between tunnels (F, C), (F, D), (F, E) and tunnel (F, B) is reduced (as shown in Figure 2c), making F a better choice for node placement. To improve the overall ability to bypass congested or failed paths, the overlap among the next-hop tunnels of each overlay node is therefore considered.
Another factor is the occupation of the underlay links. Based on the research in [9], we observe that the underlying links occupied by the overlay network are formed by a concatenation of shortest path trees with the overlay nodes as roots and their neighboring nodes as leaves. Gaining access to a larger number of underlay paths through a small number of overlay nodes can enhance efficiency and potentially diversify the paths of overlay multipath routing. Figure 3 illustrates an example of underlay link utilization. Figure 3a shows the complete topology, while Figure 3b–d show the underlay links occupied by overlay nodes B, C, and D to their neighboring overlay nodes, respectively.
However, we find that these two metrics are not consistent. To reduce path overlap, nodes with high centrality are preferred, since they are more likely to appear on transmission paths (as shown in Figure 2). On the other hand, to increase the diversity of underlay paths, it is preferable to choose nodes whose degree in the shortest path tree differs from their degree in the graph (as shown in Figure 3). To describe the ability of overlay nodes to bypass congested or failed paths, we study the hop-by-hop path diversity of each overlay node n. We first define the set of underlay links utilized by the shortest path of the overlay tunnel between nodes x and y. Given the set of neighboring overlay nodes of node n, the set of underlay paths occupied by node n and its neighbors is the union of the link sets of the tunnels between n and each of its neighbors. The overlap of a tunnel from n to one neighbor is then defined with respect to the tunnels between node n and its other neighbor nodes, that is, through the underlay links they share. The hop-by-hop path diversity of overlay node n is defined as the sum of independent paths, which can be calculated by Algorithm 1. The ultimate goal of enhancing the hop-by-hop path diversity of overlay nodes is to increase the end-to-end path diversity for all-to-all multipath transmission. The end-to-end path diversity is defined as the total number of independent paths obtained from hop-by-hop multipath routing. We believe that the local path diversity of overlay nodes determines the end-to-end transmission path diversity in the network.
Algorithm 1: Path Diversity Calculation
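The body of Algorithm 1 is not reproduced here, so the following is only a minimal sketch of one way to count independent paths at an overlay node. It assumes that a tunnel follows one underlay shortest path, that tunnels are examined from shortest to longest, and that a tunnel counts as independent only if it shares no underlay link with the tunnels already counted. The names `shortest_path_links`, `hop_by_hop_diversity`, and the toy graph `underlay` are illustrative and do not come from the paper.

```python
import heapq


def shortest_path_links(graph, src, dst):
    """Return the undirected underlay links on one shortest path from src to dst."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    # Walk back from dst to src (the graph is assumed to be connected).
    links = set()
    node = dst
    while node != src:
        links.add(frozenset((prev[node], node)))
        node = prev[node]
    return links


def hop_by_hop_diversity(graph, n, neighbors):
    """Greedily count tunnels from n that share no underlay link (assumed rule)."""
    tunnels = {x: shortest_path_links(graph, n, x) for x in neighbors}
    used = set()
    independent = 0
    # Assumed ordering: tunnels with fewer underlay links are considered first.
    for x in sorted(tunnels, key=lambda x: (len(tunnels[x]), x)):
        if tunnels[x].isdisjoint(used):
            independent += 1
            used |= tunnels[x]
    return independent


# Toy underlay: adjacency dict with link latencies as weights.
underlay = {
    "A": {"X": 1, "Y": 1},
    "X": {"A": 1, "B": 1, "C": 1},
    "Y": {"A": 1, "D": 1},
    "B": {"X": 1},
    "C": {"X": 1},
    "D": {"Y": 1},
}
diversity_at_a = hop_by_hop_diversity(underlay, "A", ["B", "C", "D"])
diversity_at_x = hop_by_hop_diversity(underlay, "X", ["B", "C", "D"])
```

In this toy graph the tunnels from A to B and from A to C share the link (A, X), so placing the overlay node at X instead of A yields more independent next-hop tunnels, which mirrors the comparison between nodes A and F in Figure 2.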
Theorem 1. Let the number of overlay nodes be m. If the hop-by-hop path diversity of the overlay nodes reaches its maximum, then the path diversity of all-to-all end-to-end transmissions also reaches its maximum.
Proof of Theorem 1. Assume the contrary, that the hop-by-hop path diversity does not reach its maximum, implying that the end-to-end path diversity is not maximized. If the hop-by-hop path diversity is not maximized for some node n, there are redundant or insufficient independent paths between node n and its neighbors. Thus, the set of underlay paths occupied by node n and its neighbors is also suboptimal. This decreases the total path diversity of the network, leaving fewer independent paths, which in turn reduces the end-to-end path diversity.
Therefore, if the hop-by-hop path diversity does not reach its maximum, the end-to-end path diversity cannot be maximized, leading to a contradiction. Hence, if the hop-by-hop path diversity reaches its maximum, the end-to-end path diversity of all overlay transmissions must also reach its maximum. □
In addition to the path diversity between overlay nodes, user access should also be considered. Clustering of overlay nodes leads to inefficiency in serving a large number of underlay users. This study measures the scale of the overlay multipath service by the number of utilized underlay links, with fewer links indicating a higher concentration of overlay nodes. The multipath overlay node placement (MOP) problem is formulated as an optimization over the set of selected overlay nodes, with a two-term objective and a constraint on the size of the set.
In this optimization problem, the first term maximizes the overall path diversity in the overlay network. The second term introduces the number of underlying links to promote sparsity in the selected nodes. A weighting coefficient on the second term determines the relative weight of link utilization and path diversity in the objective function. Constraint (4) limits the number of overlay nodes. In practice, the locations of some service-providing nodes, such as data centers, are predetermined. In such cases, the positions of the remaining nodes are calculated based on the fixed locations of the predetermined nodes.
The problem can be reduced to the Maximum Coverage Problem [29], which has been proven to be NP-hard. To obtain a near-optimal solution in a reasonable time frame, this study adopts a genetic algorithm [30] to solve the above problem. In each generation, the next population is obtained through crossover and mutation, ultimately yielding a near-optimal combination of overlay nodes. The time and space complexity of this algorithm depend primarily on the population size P, the overlay node number L, and the number of generations G.
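The genetic search described above can be sketched as follows. This is an illustrative outline, not the implementation used in this study: the selection scheme (keeping the top fifth of the population as parents), the elitism step, the swap mutation, and every name in the block (`genetic_placement`, `toy_fitness`, `alpha`, the per-node scores) are assumptions. The toy fitness only mimics the two-term shape of the MOP objective, a path-diversity term plus a weighted count of utilized underlay links.

```python
import random


def genetic_placement(candidates, m, fitness, pop_size=30, generations=40,
                      mutation_rate=0.2, seed=0):
    """Search for m overlay nodes among candidates that score well under fitness."""
    rng = random.Random(seed)
    candidates = list(candidates)

    def random_individual():
        return frozenset(rng.sample(candidates, m))

    def crossover(a, b):
        # The child draws m nodes from the union of both parents.
        return frozenset(rng.sample(sorted(a | b), m))

    def mutate(ind):
        if rng.random() >= mutation_rate:
            return ind
        outside = [c for c in candidates if c not in ind]
        if not outside:
            return ind
        # Swap one selected node for one unselected node.
        drop = rng.choice(sorted(ind))
        return frozenset((ind - {drop}) | {rng.choice(outside)})

    population = [random_individual() for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: max(2, pop_size // 5)]
        children = [ranked[0]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b)))
        population = children
    return max(population, key=fitness), history


# Toy inputs (hypothetical): per-node diversity and the underlay links each node uses.
diversity = {"a": 3, "b": 2, "c": 2, "d": 1, "e": 1}
links = {"a": {1, 2, 3}, "b": {2, 3}, "c": {4, 5}, "d": {6}, "e": {1}}
alpha = 0.5  # assumed weight of the link-utilization term


def toy_fitness(selected):
    covered = set().union(*(links[n] for n in selected))
    return sum(diversity[n] for n in selected) + alpha * len(covered)


best, history = genetic_placement(sorted(diversity), 3, toy_fitness)
```

Because the best individual is copied unchanged into the next generation, the best fitness recorded in `history` never decreases; the result is still a heuristic and carries no optimality guarantee.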
The overlay node placement problem discussed above aims to maximize overall benefit with a fixed number of nodes. In practice, however, there are scenarios in which the number of overlay nodes is chosen based on the average benefit of the overlay nodes or another specific optimization objective [9]. In the context of this study, the objective is to maximize the path diversity of overlay hop-by-hop multipath routing. Define the objective function F as the value of the MOP objective obtained with m overlay nodes.
In the MOP problem described above, once the number of overlay nodes m exceeds a certain threshold, additional nodes contribute only marginal improvements to multipath transmission performance. For both underlay link utilization and overall hop-by-hop path diversity, the marginal gains diminish as m increases, suggesting that each metric is concave in the number of overlay nodes. However, when underlay link utilization and hop-by-hop path diversity are combined into the single objective function F, differences in their respective growth behaviors may lead to the presence of local optima. To assess the effectiveness of increasing the number of overlay nodes, we define the optimal number as the smallest value of m for which the marginal improvement of F falls below a predefined threshold.
Because the marginal improvement is topology-dependent and cannot be expressed analytically, the optimal number of overlay nodes is determined via a search-based method. Let i denote the step size for incrementing the number of overlay nodes. The MOP algorithm evaluates F at each interval, and the derivative is approximated by a finite difference: the change in F between consecutive evaluations divided by i.
When this finite difference falls below the threshold, the corresponding value x is selected as the optimal number of overlay nodes. If two candidate values of x satisfy the condition, the larger (right-hand side) value is chosen to ensure sufficient coverage of performance gains.
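A minimal sketch of this search is given below. It assumes a backward finite difference, so that the value returned is the right-hand end of the first interval whose slope falls below the threshold; the function name `choose_node_count`, its parameters, and the toy objective are illustrative only. In practice, each evaluation of `F` would correspond to one run of the MOP algorithm.

```python
def choose_node_count(F, m_min, m_max, step, threshold):
    """Return the first node count whose marginal gain drops below threshold.

    F         -- callable mapping a node count to the objective value
    step      -- increment i between evaluated node counts
    threshold -- predefined bound on the finite-difference slope
    Returns None if no evaluated count satisfies the condition.
    """
    previous = F(m_min)
    x = m_min + step
    while x <= m_max:
        current = F(x)
        slope = (current - previous) / step  # backward finite difference at x
        if slope < threshold:
            return x
        previous = current
        x += step
    return None


def toy_objective(m):
    # Hypothetical saturating objective: gains halve with every added node.
    return 10.0 * (1.0 - 0.5 ** m)


selected = choose_node_count(toy_objective, m_min=1, m_max=8, step=1, threshold=1.0)
not_found = choose_node_count(toy_objective, m_min=1, m_max=5, step=1, threshold=0.01)
```

With the toy objective, the gains from one step to the next are 2.5, 1.25, and 0.625, so the search stops at four nodes for a threshold of 1.0.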
3.2. Overlay Hop-by-Hop Routing
To address the congestion issues caused by large-scale data transfers and increase the efficiency of the network layer, we propose a best-effort approach in which both load and cost are taken into account. In the hop-by-hop multipath routing algorithm, the cost of a tunnel is determined by its latency, which is assumed to be relatively stable. Intermediate nodes can flexibly adjust paths by switching to an alternative next hop. During path selection, tunnels with lower costs are prioritized to reduce transmission costs. When congestion or failures occur, intermediate overlay nodes switch to paths with higher costs but a lower degree of congestion. The existing hop-by-hop multipath routing schemes ensure transmission reachability through LFI or DAG. In overlay networks, however, tunnels with different endpoints may share underlay links, in which case switching between them fails to bypass transmission bottlenecks.
In the previous section, the overlay node placement method partially reduces path overlap by maximizing the path diversity of overlay nodes. However, path overlap cannot be eliminated with a limited number of overlay nodes, and path overlap in dynamic multipath routing can easily decrease transmission efficiency. Examples are illustrated in Figure 4. A flow from source node A to destination node F has two paths: (A, B, C, F) and (A, B, D, E, C, F). Node A has node F and node D as its next hops. In the next-hop selection, tunnel (A, F) and tunnel (A, D) overlap at the underlay link (A, B). The tunnel with the lower cost is (A, F). If link (A, B) is congested, switching to tunnel (A, D) cannot alleviate congestion. However, if link (B, C) becomes the bottleneck, tunnel (A, D) can bypass the congestion. Another scenario involves multi-hop path selection. In this case, the link (C, F) is congested. Although tunnel (A, D) is not congested, switching paths cannot alleviate the congestion and only degrades performance.
We address the issue by measuring the load of the adjacent tunnels at every overlay node. The Discounting Rate Estimator (DRE) [31] is a low-overhead measurement method for high-speed traffic, which is widely applied in data center networks and wide-area networks. It maintains a register that increases with the arrival of each packet and is decreased periodically (every T) by a multiplicative factor α. The value of the register is proportional to the transmission rate, with a coefficient of T/α. In the following discussion, it is assumed that the tunnel load is obtained through DRE and represented in a normalized form.
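The register update of a DRE can be sketched as follows, assuming the usual form in which the register grows by the packet size on every arrival and is multiplied by one minus the decay factor at every period. The class name, the parameter names `period` and `alpha`, and the `normalized_load` helper with its `capacity` argument are illustrative choices, not part of the cited design.

```python
class DiscountingRateEstimator:
    """Register-based rate estimator for one tunnel (illustrative sketch)."""

    def __init__(self, period, alpha):
        if not 0.0 < alpha < 1.0:
            raise ValueError("alpha must lie strictly between 0 and 1")
        self.period = period  # decay interval T, in seconds
        self.alpha = alpha    # fraction of the register removed every period
        self.register = 0.0

    def on_packet(self, size_bytes):
        # The register grows with every packet sent on the tunnel.
        self.register += size_bytes

    def on_timer(self):
        # Called once per period: multiplicative decay.
        self.register *= 1.0 - self.alpha

    def rate(self):
        # In steady state the register is about rate * period / alpha.
        return self.register * self.alpha / self.period

    def normalized_load(self, capacity):
        # Load as a fraction of an assumed tunnel capacity, capped at 1.
        return min(1.0, self.rate() / capacity)


# Feed a constant 1000 bytes/s (100 bytes per 0.1 s period) and read the estimate.
dre = DiscountingRateEstimator(period=0.1, alpha=0.1)
for _ in range(500):
    dre.on_packet(100)
    dre.on_timer()
estimated = dre.rate()
load = dre.normalized_load(capacity=2000.0)
```

Read just after a decay step, the estimate sits slightly below the true rate (by a factor of one minus the decay factor), which is the price of keeping a single register per tunnel.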
During routing, if the load of path (A, F) is heavier than the load of path (A, D, E, F), the two paths do not share the bottleneck and path (A, D, E, F) is a valid alternative. If both tunnel (A, F) and tunnel (A, D) are overloaded, congestion occurs at the overlapping link (A, B) and switching to tunnel (A, D) is ineffective. When tunnel (A, D) is not overloaded, it remains unclear whether link (B, C) or link (C, F) is congested. Algorithm 2 presents the single-hop path selection algorithm. The algorithm makes decisions at each node based on local load information, selecting the non-congested next hop with the lowest cost from the next-hop set.
Algorithm 2: Single-hop path selection
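Since the body of Algorithm 2 is not reproduced here, the following is a hedged sketch of the selection rule described in the text: among the next hops whose tunnel load is below a congestion threshold, pick the one with the lowest cost. The threshold value, the fallback to the least-loaded tunnel when every next hop is congested, and the names `NextHop`, `select_next_hop`, and `excluded` are assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class NextHop(NamedTuple):
    node: str
    cost: float  # tunnel cost (latency)
    load: float  # normalized tunnel load in [0, 1]


def select_next_hop(next_hops, congestion_threshold=0.8, excluded=frozenset()) -> Optional[str]:
    """Pick the cheapest non-congested next hop using local information only.

    excluded holds next hops marked as congested by an external decision
    (for example, a controller); they are treated like locally congested ones.
    """
    if not next_hops:
        return None
    usable = [h for h in next_hops
              if h.load < congestion_threshold and h.node not in excluded]
    if usable:
        return min(usable, key=lambda h: h.cost).node
    # Assumed fallback: everything is congested, so take the least-loaded tunnel.
    return min(next_hops, key=lambda h: (h.load, h.cost)).node


# Next hops of node A toward destination F, loosely following the Figure 4 example.
idle = select_next_hop([NextHop("F", 10.0, 0.2), NextHop("D", 25.0, 0.3)])
direct_congested = select_next_hop([NextHop("F", 10.0, 0.9), NextHop("D", 25.0, 0.3)])
all_congested = select_next_hop([NextHop("F", 10.0, 0.95), NextHop("D", 25.0, 0.9)])
marked = select_next_hop([NextHop("F", 10.0, 0.2), NextHop("D", 25.0, 0.3)], excluded={"F"})
```

The decision needs only the costs and loads of the local tunnels, which is what allows each overlay node to react without waiting for global state.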
While single-hop congestion is easy to identify, multi-hop path selection needs global coordination. The goal of multi-hop path selection is to avoid transmitting data into congested areas and to filter out paths that cannot bypass congestion. During this process, network resources are allocated to transmissions that are unlikely to cause congestion. This method determines congestion based on the load of the links and removes the next hops that lead to unavoidable bottlenecks. The tunnel load information is periodically uploaded to the controller, and the controller updates the next-hop selection based on the results calculated by Algorithm 3. The excluded next hops are marked as congested. According to the approach in Algorithm 2, congested next hops have lower priority, which steers data towards non-congested paths and helps to allocate network resources effectively. The complexity of the algorithm, for each destination as well as for all-to-all transmission, depends on m, the number of overlay nodes, and k, the number of overlay tunnels.
We place the centralized controller at the location closest on average to all nodes. This controller periodically gathers load information of the tunnel from the network. The control delay introduced by this periodic data collection is minimal. Firstly, the periodic approach allows the system to distribute control operations over time, reducing the need for constant adjustments and thereby minimizing latency. Secondly, by strategically positioning the controller near the average center of all nodes, communication latency is further reduced. Furthermore, as observed from the experiments, the impact of Algorithm 3 on the number of paths is significantly smaller than that of Algorithm 2. Therefore, we believe that implementing Algorithm 3 on a centralized controller using a periodic method is appropriate.
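One way to read "closest on average to all nodes" is to pick the node with the smallest mean shortest-path latency to every node. The sketch below follows that reading; the function names and the toy topology are hypothetical, and ties are broken by node name purely for determinism.

```python
import heapq


def latencies_from(graph, src):
    """Shortest-path latency from src to every reachable node (Dijkstra)."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def place_controller(graph):
    """Return the node with the lowest mean latency to all nodes.

    Assumes a connected graph given as an adjacency dict of link latencies.
    """
    def mean_latency(node):
        dist = latencies_from(graph, node)
        return sum(dist[v] for v in graph) / len(graph)

    return min(sorted(graph), key=mean_latency)


# Toy topology: a chain A-B-C-D with an extra node E attached to B.
topology = {
    "A": {"B": 1},
    "B": {"A": 1, "C": 1, "E": 1},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1},
    "E": {"B": 1},
}
controller = place_controller(topology)
```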
Algorithm 3: Multi-hop path selection
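The body of Algorithm 3 is not reproduced here. As a rough sketch of the idea described in the text, the controller can compute, for one destination, which overlay nodes still reach it through non-congested tunnels only, and mark every next hop outside that set as congested. The function name, the congestion threshold, and the input layout (a loop-free next-hop table and a table of normalized tunnel loads) are assumptions for illustration.

```python
def excluded_next_hops(next_hops, load, dest, threshold=0.8):
    """Return, per node, the next hops that cannot bypass congestion toward dest.

    next_hops -- dict: node -> list of next-hop overlay nodes toward dest (loop-free)
    load      -- dict: (node, next_hop) -> normalized tunnel load
    """
    # Nodes known to reach dest over non-congested tunnels only.
    clear = {dest}
    changed = True
    while changed:
        changed = False
        for u, hops in next_hops.items():
            if u in clear:
                continue
            if any(v in clear and load[(u, v)] < threshold for v in hops):
                clear.add(u)
                changed = True
    # A next hop is excluded when every onward path from it meets congestion.
    return {u: {v for v in hops if v not in clear} for u, hops in next_hops.items()}


# Overlay view of the Figure 4 scenario: the last tunnel into F is congested.
table = {"A": ["F", "D"], "D": ["E"], "E": ["F"]}
congested = excluded_next_hops(
    table,
    {("A", "F"): 0.9, ("A", "D"): 0.2, ("D", "E"): 0.1, ("E", "F"): 0.95},
    dest="F",
)
relieved = excluded_next_hops(
    table,
    {("A", "F"): 0.9, ("A", "D"): 0.2, ("D", "E"): 0.1, ("E", "F"): 0.3},
    dest="F",
)
```

In the first call, tunnel (A, D) is lightly loaded but every path beyond D crosses a congested tunnel, so D is excluded at A; once the tunnel into F recovers, nothing is excluded and node A can use D as an alternative.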