A Deadlock-Free Deterministic–Adaptive Hybrid Routing Algorithm for Efficient Network-on-Chip Communication

Ji, Ning; Yang, Yintang

doi:10.3390/electronics14050845

Ning Ji and Yintang Yang *

School of Microelectronics, Xidian University, Xi’an 710071, China

* Author to whom correspondence should be addressed.

Electronics 2025, 14(5), 845; https://doi.org/10.3390/electronics14050845

Submission received: 13 January 2025 / Revised: 31 January 2025 / Accepted: 19 February 2025 / Published: 21 February 2025

Abstract

In the era of multi-core technology, efficient communication among numerous IP cores has become a critical challenge. Network-on-chip (NoC) technology provides a scalable and effective solution, attracting significant attention in academia and industry. This paper introduces a novel deterministic–adaptive hybrid routing (DAHR) algorithm designed to enhance performance while ensuring deadlock-free operation. The DAHR algorithm leverages pre-fetched deterministic information and real-time congestion feedback from neighboring nodes to make dynamic routing decisions. Before packet injection, the source–destination positional relationship and required hops are pre-calculated and encoded into the packet’s head flit. Routing decisions are then based on the availability of free virtual channels in the determined directions, eliminating the need for a complex routing calculation unit. Simulation results demonstrate that DAHR reduces average packet delay by at least 5.8% and improves saturation throughput by at least 9.0% compared to conventional routing schemes without introducing additional hardware overhead.

Keywords:

network-on-chip (NoC); hybrid routing; routing direction (RD); performance; hardware overhead

1. Introduction

As more transistors are integrated into a single chip, the performance of single-core processors has encountered bottlenecks [1,2]. Consequently, multicore chips have emerged as the predominant solution [3]. Processors designed for high-performance applications now integrate various intellectual property (IP) cores onto a single chip, known as a multiprocessor system-on-chip (MPSoC) [4]. As more and more IP cores are integrated into MPSoCs [5], traditional bus architectures can no longer meet the performance demands of modern systems. Network-on-chip (NoC) has gradually become the preferred interconnection architecture for on-chip multi-core communication and is receiving growing research attention due to its scalability and efficient communication [6,7].

The performance of NoC is influenced by various factors, including its topology, microarchitecture, routing algorithms, and switching strategies. Among these elements, the routing algorithm represents a critical aspect of NoC design that significantly affects the network’s performance and power consumption [8]. Generally, routing algorithms can be categorized into three main types: deterministic, oblivious, and adaptive [9,10]. Deterministic algorithms always select the same path for a given source–destination pair and do not consider the status of the network. This characteristic provides advantages in terms of reduced complexity and lower overhead. However, because the path between a given source–destination pair is fixed, this method performs poorly under fluctuating or complex traffic conditions and cannot balance network load appropriately [11]. In oblivious routing algorithms, multiple paths can be selected between the source and destination nodes, but network congestion information is still not considered. Oblivious algorithms are simple to implement and easy to analyze, but they must trade off locality against load balancing. Adaptive routing algorithms achieve better performance by monitoring network status and making routing decisions informed by congestion data, which unavoidably increases design complexity and hardware overhead [12].

A well-designed routing algorithm must effectively prevent deadlocks. Deadlocks typically arise from circular dependencies among data packets within a network. The presence of a complete cycle can result in a deadlock [13]; therefore, it is essential to break this loop. The well-known turn model [14] eliminates potential resource dependency loops by prohibiting certain specific turns, ensuring deadlock-free data transmission in the network. It can improve the stability and reliability of the network to some extent and is easy to implement and maintain. However, when the network topology changes, the turn model may need to be readjusted or optimized to accommodate the new network environment, which increases the complexity and cost of network design. Alternatively, the input buffer can be divided into multiple independent queues, each corresponding to a virtual channel and managed by an arbiter [15]. When one virtual channel is blocked, the other virtual channels can continue to work and request their corresponding output channels. This method can effectively avoid deadlocks. It also mitigates head flit blocking and enhances both network performance and transmission efficiency. However, the hardware implementation is relatively complex.

To leverage the advantages of adaptive and deterministic routing, we propose a deterministic–adaptive hybrid routing (DAHR) algorithm that records deterministic information in advance. During packet forwarding, the output port is determined by considering congestion information from at most two adjacent nodes. The contributions of this work are as follows:

At the packet injection stage, all deterministic information is calculated, including the positional relationship of the source–destination nodes and hops in both horizontal and vertical directions. The hops and encoded positional relationship (defined as routing direction) are recorded in the head flit of the packet. This step is required only once.
The output port does not need to be repeatedly calculated during packet forwarding, so the routing calculation unit can be removed. A unique output port is identified at each hop by decoding the recorded information and comparing the free virtual channels (VCs) of the alternative directions. The corresponding hop count is updated as the packet is forwarded. The DAHR algorithm thus combines deterministic information (directions and hops) with adaptive information (available cache resources). The routing scheme improves average packet latency and saturation throughput without increasing hardware overhead.

The rest of this paper is organized as follows. Related works are reviewed in Section 2. Section 3 describes the proposed DAHR scheme. Section 4 presents DAHR’s performance evaluation. Lastly, Section 5 presents the conclusion.

2. Related Works

Deterministic routing offers a single path for each source–destination pair and a simple algorithm with low overhead. The XY routing algorithm [16] is recognized as the most straightforward deterministic routing algorithm. The path determined by this algorithm relies only on the source and destination nodes’ addresses, classifying it as a static routing algorithm. Pure zigzag routing [17] is also a typical deterministic routing algorithm. It alternates between the horizontal and vertical routing directions at each node. In addition, there is another type of zigzag routing [18], in which packets are transmitted along the dimension with the farther distance until the two distances are equal and then alternate between the X and Y directions until they arrive at their final destination. RDRA (Recursive Deterministic Routing Algorithm) [19] is a recursive deterministic routing algorithm for 2D mesh networks. It is enhanced by incorporating surrounding links (Torus), thereby achieving optimal performance in both time and spatial domains. The SCRA (Source-based Configuration Router Algorithm) [20] is a hybrid deterministic routing algorithm that combines the complementary properties of XY and YX routing algorithms to achieve optimal uniformity of network communication through a traffic distribution model. The DPRA (Deterministic-Path Routing Algorithm) [21] is a deterministic path routing algorithm designed to tolerate multiple faults on wafer-level network chips; it avoids faulty nodes by generating and using routing tables and dynamically reconfiguring routes. It achieves deadlock-free communication while increasing the number of available nodes and overall network performance.

Adaptive routing algorithms can be further categorized into partially adaptive and fully adaptive ones. In the case of partially adaptive routing, the local traffic conditions of neighboring routers are taken into account and analyzed to determine an appropriate output channel [22]. The turn model algorithm [14] is a typical adaptive routing algorithm that effectively prevents deadlocks by restricting specific types of turns. It performs better than deterministic routing. The OE (odd–even) turn model [23] is a widely recognized adaptive routing algorithm. It avoids deadlocks by applying distinct turn restrictions in odd and even columns. This model enhances the robustness of routing against non-uniform factors, such as hotspot traffic. Consequently, the network performance exhibits reduced fluctuations across varying traffic patterns compared to the previous turn model. OBL (Output Buffer Length) [24] utilizes the concept of Neighbors-on-path to select the optimal path from multiple feasible output channels to reduce congestion. This selection strategy is implemented in odd–even routing algorithms, resulting in improvements to both average latency and saturation point. Modified XY [25] is an adaptive routing algorithm coupled with an on-demand buffer allocation concept. It exhibits improved latency and throughput compared to XY routing. RTBAR (Region-based Traffic Balancing Adaptive Routing Algorithm) [26] is another partially adaptive routing algorithm. It estimates the congestion of all possible low-latency paths and effectively allocates network load, thereby reducing packet congestion and improving network performance compared to similar approaches.

To further enhance the performance of NoC, fully adaptive routing methods utilize non-local congestion information from neighboring routers to determine routing paths. RCA (Regional Congestion Awareness) [27] enhances adaptive routing algorithms by integrating congestion information from non-adjacent routers into the routing strategies rather than relying exclusively on local congestion data. It facilitates improved global load balancing in comparison to traditional adaptive routing methods. DAR (Destination-Based Adaptive Routing) [28] makes routing decisions by maintaining a delay estimate from each node to all other nodes, thereby avoiding estimation distortion caused by non-local link congestion. DBSS (Destination-Based Selection Strategy) [12] thoroughly evaluates adaptability, path selection strategies, and virtual channel allocation. It demonstrates considerable advantages in enhancing network performance while simultaneously reducing latency and congestion. GCA (Global Congestion Awareness) [29] calculates the path to the destination based on the global link state and congestion information by using simple low-complexity routing calculation units. The RCAR (Regional-based Congestion-aware Routing Algorithm) [30] is a fully adaptive routing algorithm that leverages congestion information bits generated by each node. It propagates this information only when a node experiences congestion, thereby reducing informational overhead. Additionally, it employs path diversity parameters to identify the optimal route from the source to the destination.

In addition to using a single routing algorithm, some researchers have also employed hybrid strategies. The best-known model is DYAD [31], which blends the benefits of adaptive and deterministic routing mechanisms and alternates between XY and OE algorithms based on network congestion conditions. FNADR (Fuzzy And Neural-based Adaptive And Deterministic Routing) [32] provides a better platform for designing the routing algorithm by using well-defined rules to eliminate ambiguities in output port selection. SCRN (Scored Regional Congestion-aware and Neighbors-on-path) [33] is a hybrid selection technique that uses traffic analysis to choose a superior output channel and thus enhance NoC performance. PAAD (Partially Adaptive And Deterministic Routing) [10] divides the grid network into different diagonal regions and adopts different routing strategies based on the congestion situation in each region. When there is no congestion, deterministic routing is used, and when congestion occurs, partially adaptive routing is employed.

3. Proposed Deterministic–Adaptive Hybrid Routing Algorithm

This work proposes a deterministic–adaptive hybrid routing (DAHR) algorithm that computes critical routing information in advance. Because of its scalability and flexibility, we adopt a 2D mesh as the topology for implementing the algorithm.

3.1. Working Mechanism of DAHR

Critical information can be extracted in a 2D mesh network once a packet’s source and destination nodes are identified. Figure 1 provides an example that illustrates these deterministic parameters. Packet P₁ needs to be transmitted from node (3,2) to node (1,0). Because shortest-path routing has relatively simple logic and is easy to implement in hardware, only shortest paths are considered. There are six such paths to choose from. Whichever path is chosen, P₁ is sent only in the west or south direction during routing, and the hop counts along the X and Y directions are the same for every path. Packet P₂ needs to be transmitted from node (0,0) to node (0,2), both of which lie in Column 0. Only one path is available, with zero hops in the X direction and two hops in the Y direction. P₂ is routed northward at each hop. Packet P₃ needs to be transmitted from node (0,3) to node (3,3), both of which lie in Row 3. There is again only one path to choose from, with three hops in the X direction and zero hops in the Y direction. P₃ is routed eastward at each hop. P₁, P₂, and P₃ represent three typical packets. The analysis indicates that once the source and destination nodes are established, the directional relationship between them and the corresponding hops in each direction can be determined at the packet injection stage. This deterministic information is crucial for our strategy.
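The hop counts and path counts quoted for P₁, P₂, and P₃ can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the path count uses the standard binomial formula for minimal paths in a 2D mesh, which the text does not state explicitly.

```python
from math import comb

def hops(src, dst):
    """Return (x_hops, y_hops) between two (x, y) mesh coordinates."""
    return abs(dst[0] - src[0]), abs(dst[1] - src[1])

def minimal_path_count(src, dst):
    """Number of distinct shortest paths in a 2D mesh: C(dx + dy, dx)."""
    dx, dy = hops(src, dst)
    return comb(dx + dy, dx)

# The three example packets from the text.
packets = {
    "P1": ((3, 2), (1, 0)),
    "P2": ((0, 0), (0, 2)),
    "P3": ((0, 3), (3, 3)),
}

for name, (src, dst) in packets.items():
    print(name, hops(src, dst), minimal_path_count(src, dst))
```

For P₁ this gives two hops in each dimension and six shortest paths; P₂ and P₃ each have a single path because one of their hop counts is zero.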

After the deterministic information is extracted, a routing decision must be made. When two routing directions are available, the direction with more available cache resources is selected. DAHR can be implemented in both VC and non-VC network architectures. In a router that employs VCs, the alternative direction with more free VCs is selected; in a router without VCs, the alternative direction with more free buffers is preferred. This policy helps balance the network load. Assuming that the router illustrated in Figure 1 uses VCs, consider P₁ as an example. First, after the source and destination nodes are identified, the routing direction (west and south) is recorded in the head flit, together with the two hops required in each of the X and Y directions. P₁ is then injected into the network. Assuming that the number of free input VCs to the west of node (3,2) exceeds the number to the south, P₁ is first routed to node (2,2), which decreases the X hop count by one. This process continues until both the X and Y hop counts reach zero, which signifies that the packet has arrived at its destination node (1,0) and can be output from the local port of that node. Since the proposed algorithm uses both deterministic information (the positional relationship and the hops in the corresponding direction) and the neighboring nodes’ congestion information (available cache resources), it can be defined as a deterministic–adaptive hybrid routing algorithm. Given that VCs can mitigate head flit blocking and enhance both network performance and transmission efficiency, this paper employs a router microarchitecture that incorporates virtual channels.

3.2. Packet Injection Scheme

At the packet injection stage, it is necessary to calculate the determined positional relationship and the corresponding hops. To document the positional relationship of the source–destination pairs in the network, we use an encoding approach. The necessary definitions are as follows.

Definition 1.

The routing direction (RD) is a binary code used to indicate the positional relationship of the source–destination nodes. The detailed coding method is illustrated in Figure 2. The east–west direction is represented by the X-axis and the north–south direction by the Y-axis. Whereas “1” indicates that the destination node is in the negative direction of the source node, “0” indicates that the destination node is in the positive direction of the source node.

Definition 2.

X_hop denotes the remaining hops in the horizontal direction of the packet, whereas Y_hop indicates the remaining hops in the vertical direction of the packet. During the process of packet forwarding, the values of X_hop and Y_hop are updated according to the prevailing forwarding situations.

According to Definition 1 and Definition 2, the positional relationship of the source–destination nodes can be clearly expressed using a maximum of 2 bits. In a 2D mesh, the information determined by (X_s, Y_s) (the source node) and (X_d, Y_d) (the destination node) is presented in Table 1. For P₁ in Figure 1, node (1,0) is southwest of node (3,2), so the RD of P₁ is encoded as “11”. At the packet injection stage, both X_hop and Y_hop are equal to two.
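As a minimal sketch of the injection-time calculation, the function below derives RD, X_hop, and Y_hop from the source and destination coordinates. Table 1 is not reproduced here, so two points are assumptions: east and north are taken as the positive X and Y directions (consistent with P₂ and P₃ in Figure 1), and an RD bit is set to 0 when the two coordinates are equal, where its value is irrelevant because the matching hop count is zero. The function name is hypothetical.

```python
def injection_info(src, dst):
    """Compute (rd, x_hop, y_hop) once, before the packet is injected.

    rd is the two-character string "{RD_X}{RD_Y}": a bit is "1" when the
    destination lies in the negative direction (west or south) and "0"
    when it lies in the positive direction (east or north).
    Assumption: the bit is "0" when the coordinates are equal.
    """
    xs, ys = src
    xd, yd = dst
    rd_x = 1 if xd < xs else 0
    rd_y = 1 if yd < ys else 0
    return f"{rd_x}{rd_y}", abs(xd - xs), abs(yd - ys)

# P1 from Figure 1: node (3,2) to node (1,0), i.e. towards the southwest.
print(injection_info((3, 2), (1, 0)))
```

For P₁ the result is RD "11" with two hops in each dimension, matching the values given in the text.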

To record the necessary information, the conventional packet structure must be revised. Typically, a packet consists of a head flit, one or more body flits, and a tail flit. Taking a 64-bit NoC as an example, the packet structure is illustrated in Figure 3. With 2 bits set aside for the flit type, 6 bits for the source node coordinates, 6 bits for the destination node coordinates, and 3 bits for VC selection information, the head flit is assumed to carry 17 bits of required information. Consequently, 47 free bits are reserved in the head flit. To capture the deterministic information, we use 8 of these reserved bits: 2 bits for RD, 3 bits for X_hop (horizontal hops), and 3 bits for Y_hop (vertical hops). Furthermore, the widths of these fields can be adjusted flexibly to accommodate networks of varying scales.
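The bit budget above can be illustrated with a small pack/unpack sketch. Only the field widths come from the text; the field order, the bit positions, the coordinate packing (3 bits per coordinate), and the flit-type code are assumptions made for illustration and may differ from Figure 3.

```python
# Field widths from the text; order and bit positions are assumed.
FIELDS = [
    ("flit_type", 2), ("src", 6), ("dst", 6), ("vc", 3),
    ("rd", 2), ("x_hop", 3), ("y_hop", 3),
]
FLIT_WIDTH = 64
USED_BITS = sum(width for _, width in FIELDS)  # 17 original + 8 added

def pack_head_flit(values):
    """Pack the named fields into one integer, lowest field first."""
    word, shift = 0, 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word

def unpack_head_flit(word):
    """Recover the field dictionary from a packed head flit."""
    values, shift = {}, 0
    for name, width in FIELDS:
        values[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return values

# Head flit of P1: (3,2) -> (1,0), RD = 0b11, two hops per dimension.
p1 = {"flit_type": 0, "src": (3 << 3) | 2, "dst": (1 << 3) | 0,
      "vc": 0, "rd": 0b11, "x_hop": 2, "y_hop": 2}
word = pack_head_flit(p1)
print(USED_BITS, FLIT_WIDTH - USED_BITS, unpack_head_flit(word) == p1)
```

With 3-bit hop fields the largest representable hop count is 7, which is enough for an 8 × 8 mesh; a larger mesh would need wider fields, as the text notes.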

Once the destination of the injected packet is given, the relationship between the source node and the destination node can be determined. If the source node and the destination node are the same, both X_hop and Y_hop are zero. Otherwise, we need to calculate the determined information based on Table 1 and record it in the packet’s head flit. Then, the modified packet is injected into the network.

The DAHR algorithm can be extended to a 3D mesh. Compared to a 2D mesh, the positional relationships in the 3D mesh are more intricate. RD_Z and Z_hop need to be introduced. Therefore, the alternative directions for each packet are expanded to three. The packet structure needs to be modified based on the additional deterministic information. Moreover, more branches are required when implementing DAHR in a 3D mesh. Due to the consistent routing strategy employed in both 2D mesh and 3D mesh networks, the results in 2D mesh networks are sufficiently representative. Consequently, this paper focuses exclusively on the simulation and analysis of a 2D mesh.
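A possible shape of the 3D extension is sketched below. The paper gives no pseudo-code for the 3D case, so everything here is an assumption: the port names U and D for the Z dimension, the function name, and the tie-breaking order (X before Y before Z when free-VC counts are equal).

```python
def dahr_route_3d(rd, hops, free_vcs):
    """Pick an output port in a 3D mesh from up to three candidates.

    rd       -- (RD_X, RD_Y, RD_Z) bits, 0 = positive, 1 = negative
    hops     -- (X_hop, Y_hop, Z_hop) remaining hops
    free_vcs -- dict mapping port name to free downstream VCs
    Returns (port, updated_hops).
    """
    ports = (("E", "W"), ("N", "S"), ("U", "D"))
    # Only dimensions with remaining hops offer a candidate direction.
    candidates = [(ports[axis][rd[axis]], axis)
                  for axis in range(3) if hops[axis] > 0]
    if not candidates:
        return "L", tuple(hops)  # all hops consumed: deliver locally
    # max() keeps the first candidate on ties (assumed tie-break).
    port, axis = max(candidates, key=lambda c: free_vcs[c[0]])
    updated = list(hops)
    updated[axis] -= 1
    return port, tuple(updated)

# Hypothetical free-VC counts for one router.
free = {"E": 4, "W": 1, "N": 4, "S": 2, "U": 3, "D": 0}
print(dahr_route_3d((1, 1, 0), (2, 2, 1), free))
```

With these illustrative counts the candidates are W, S, and U, and U is chosen because it has the most free VCs.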

3.3. DAHR Algorithm

The DAHR router’s microarchitecture includes input ports, output ports, virtual channels, virtual channel allocator, switch allocator, and crossbar, which is shown in Figure 4. Out of the five pairs of input/output ports, one is a local port and the other four are routing ports. Determining possible output directions at the packet injection stage is the main way that the DAHR differs from traditional routers, eliminating the need for repeated calculations at intermediate nodes. Thus, the routing calculation unit can be removed.

Once a packet is written to the input port of the DAHR router from the upstream router, the RD and hops recorded in the head flit are decoded. The unique output port is then determined according to the downstream VC occupation information carried by the credit. VA is implemented in parallel with SA in one cycle, and subsequently, the packet is sent to the next level of routing.

The DAHR algorithm’s pseudo-code is shown in Algorithm 1. The RD and the two-dimensional hops recorded in the head flit are retrieved when a packet (P) from an upstream router reaches the present router. Based on this information, packets are classified into four categories. In the first category, both X_hop and Y_hop are zero, signifying that the packet has arrived at its destination node and is transmitted to the local port. In the second category, X_hop is zero while Y_hop is not zero, which indicates that the current node and the destination node are in the same column. The output port is determined by RD_Y. If RD_Y is zero, the packet is output to the north port; otherwise, it is output to the south port. In the third category, Y_hop is zero while X_hop is not zero, which indicates that the current node and the destination node are in the same row. The output port is determined by RD_X. If RD_X is zero, the packet is routed to the east port; otherwise, it is routed to the west port. The fourth category represents the common scenario where neither X_hop nor Y_hop is zero. In this case, both bits of RD are used, and the final routing choice is made by comparing the number of free VCs in the two candidate output directions. Since X_hop and Y_hop are checked first in this algorithm, there is no need to update RD even if a packet reaches the same row or column as its destination during the forwarding process.

Algorithm 1 DAHR Algorithm

/* P denotes the packet from the upstream router;
   N, S, W, E and L denote the output ports;
   OP denotes the output packet */
Input: P carrying RD, X_hop and Y_hop;
Output: OP carrying RD and the updated X_hop or Y_hop;
if (X_hop = 0 & Y_hop = 0)         // the current node is the destination node
    L ← P;
else if (X_hop = 0 & Y_hop ≠ 0)    // the current node and the destination node are in the same column
{
    if (RD_Y = 0)
        N ← P;
    else
        S ← P;
}
else if (X_hop ≠ 0 & Y_hop = 0)    // the current node and the destination node are in the same row
{
    if (RD_X = 0)
        E ← P;
    else
        W ← P;
}
else                               // X_hop ≠ 0 & Y_hop ≠ 0: neither the same row nor the same column
{
    if ({RD_X, RD_Y} = 00)
    {
        if (E_free_VCs > N_free_VCs)
            E ← P;
        else
            N ← P;
    }
    else if ({RD_X, RD_Y} = 10)
    {
        if (N_free_VCs > W_free_VCs)
            N ← P;
        else
            W ← P;
    }
    else if ({RD_X, RD_Y} = 11)
    {
        if (W_free_VCs > S_free_VCs)
            W ← P;
        else
            S ← P;
    }
    else                           // {RD_X, RD_Y} = 01
    {
        if (S_free_VCs > E_free_VCs)
            S ← P;
        else
            E ← P;
    }
}
update X_hop or Y_hop of P;        // output along X: X_hop = X_hop - 1; output along Y: Y_hop = Y_hop - 1
OP ← P;
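The decision logic of Algorithm 1 can be exercised in software with a short Python sketch. The function name and the dictionary-based interface are hypothetical; the port pairs and the strict "greater than" comparisons follow the pseudo-code, so a tie goes to the second port of each pair. The free-VC counts in the usage example are made-up values held constant along the path, which a real network would not guarantee.

```python
# For each {RD_X, RD_Y}, the (first, second) ports compared in Algorithm 1.
COMPARE = {
    (0, 0): ("E", "N"),
    (1, 0): ("N", "W"),
    (1, 1): ("W", "S"),
    (0, 1): ("S", "E"),
}

def dahr_route(rd_x, rd_y, x_hop, y_hop, free_vcs):
    """Return (output_port, new_x_hop, new_y_hop) for one router hop."""
    if x_hop == 0 and y_hop == 0:
        return "L", 0, 0                      # arrived at the destination
    if x_hop == 0:
        port = "N" if rd_y == 0 else "S"      # same column: move along Y
    elif y_hop == 0:
        port = "E" if rd_x == 0 else "W"      # same row: move along X
    else:
        first, second = COMPARE[(rd_x, rd_y)]
        port = first if free_vcs[first] > free_vcs[second] else second
    if port in ("E", "W"):
        x_hop -= 1
    else:
        y_hop -= 1
    return port, x_hop, y_hop

# Trace P1 (RD = 11, two hops per dimension) with hypothetical VC counts.
free = {"N": 4, "S": 2, "E": 4, "W": 3}
rd_x, rd_y, x_hop, y_hop = 1, 1, 2, 2
trace = []
while True:
    port, x_hop, y_hop = dahr_route(rd_x, rd_y, x_hop, y_hop, free)
    trace.append(port)
    if port == "L":
        break
print(trace)
```

With these counts P₁ travels west twice, then south twice, and is finally delivered to the local port, i.e. (3,2) → (2,2) → (1,2) → (1,1) → (1,0).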

Eliminating cyclic dependencies in the channel dependency graph is considered the key to deadlock-free routing [13]. According to turn model theory [14], there are two categories of loops that may lead to a deadlock in a 2D mesh, as illustrated in Figure 5. Deadlocks can be avoided by prohibiting just one turn in each loop. The proposed DAHR follows a shortest-path routing approach. Once the source and destination nodes are established, each packet is assigned a single RD value that remains constant throughout the entire routing process. For each RD, only two turns are allowed, as shown in Figure 6, so each packet is restricted to two kinds of turns along its path. Additionally, a rotating priority strategy is employed to resolve output port contention that may arise when packets are injected simultaneously by different routers in the NoC. Consequently, the DAHR algorithm ensures freedom from circular channel dependency when selecting an output port, thus preventing deadlocks.
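The per-RD turn restriction can be enumerated in a few lines. The sketch below assumes a turn is written as (direction travelled before, direction travelled after) and that the two loops of Figure 5 are the clockwise and counter-clockwise cycles of the turn model; the names are hypothetical. It illustrates only the per-packet restriction and does not model interactions between packets with different RD values.

```python
# Directions a packet may travel for each RD value {RD_X, RD_Y}.
RD_DIRECTIONS = {
    "00": ("E", "N"),
    "10": ("W", "N"),
    "11": ("W", "S"),
    "01": ("E", "S"),
}

# The four turns that make up each abstract cycle in a 2D mesh.
CLOCKWISE = {("E", "S"), ("S", "W"), ("W", "N"), ("N", "E")}
COUNTERCLOCKWISE = {("E", "N"), ("N", "W"), ("W", "S"), ("S", "E")}

def allowed_turns(rd):
    """Turns available to a packet that only moves in its two RD directions."""
    a, b = RD_DIRECTIONS[rd]
    return {(a, b), (b, a)}

for rd in RD_DIRECTIONS:
    turns = allowed_turns(rd)
    print(rd, sorted(turns),
          len(turns & CLOCKWISE), len(turns & COUNTERCLOCKWISE))
```

Each RD value contributes exactly one turn to each cycle, so a single packet can never complete either loop on its own, since a full loop requires four distinct turns.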

4. Simulation Results and Discussion

4.1. Simulation Settings

We utilized Noxim [34] to assess the performance of DAHR. Noxim is an open, configurable, scalable, and cycle-accurate NoC simulator developed in SystemC. Considering that the proposed DAHR uses both deterministic information and congestion information of adjacent nodes, we chose the well-known deterministic XY routing and the OE partially adaptive routing for comparison. Table 2 lists all of the simulation’s configuration parameters. Experimental simulations were conducted on 4 × 4 and 8 × 8 2D meshes with wormhole switching. The packet length was set to 3~5 flits, and each flit’s width was set to 64 bits. The buffer of each input port had four VCs, each with a depth of five flits.

Four synthetic traffic patterns were utilized to assess routing performance: bit reversal, Transpose 1, Transpose 2, and hotspot. The bit reversal pattern performs a bit reversal operation on the source node’s ID to obtain the destination node’s ID. The ID is the unique designation used within the NoC to identify each node. IDs are typically assigned during the system initialization phase and remain constant throughout the network’s operation. For example, in a 4 × 4 network, if the ID of the source node is one, its binary representation is “4′b0001”. After the bit reversal operation, we obtain “4′b1000”, which corresponds to the ID of the destination node, that is, Node 8. The transpose pattern represents a “diagonal symmetry” rule, where the destination and source nodes are symmetric about a diagonal of the network matrix. The matrix has two diagonals, so the transpose operation is governed by two mapping rules. Assuming the network size is n × n, the spatial distribution of source and destination nodes under Transpose 1 and Transpose 2 can be expressed in Equations (1) and (2):

\begin{cases} X_{d} = n - 1 - Y_{s} \\ Y_{d} = n - 1 - X_{s} \end{cases} \qquad (1)

\begin{cases} X_{d} = Y_{s} \\ Y_{d} = X_{s} \end{cases} \qquad (2)

where (X_s, Y_s) denotes the coordinate of the source node and (X_d, Y_d) denotes the coordinate of the destination node. In the hotspot traffic pattern, one or more nodes are set as hotspots, increasing the communication load on these nodes. Under these synthetic traffic patterns, the NoC performance metrics of saturation throughput and average packet latency are assessed. Packet latency is defined as the number of clock cycles from a packet’s injection at the source node to its arrival at the destination node. However, analyzing a single packet cannot fully reflect the overall performance of the network. Therefore, the average packet latency is used to measure network performance [35]. The average packet latency is the mean latency over all packets and is measured in cycles. It can be expressed in Equation (3):

L_{ave} = \frac{\sum_{i = 1}^{N} L_{i}}{N} \qquad (3)

where L_ave denotes the average packet latency, L_i denotes the delay of the ith packet, and N denotes the total number of packets transmitted in the network.
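The destination mappings of the bit reversal and transpose patterns and the averaging in Equation (3) are simple enough to sketch directly. The function names are hypothetical, and the latency values in the usage example are made-up numbers used only to show the calculation.

```python
def bit_reversal(node_id, bits):
    """Destination ID obtained by reversing the bits of the source ID."""
    return int(format(node_id, f"0{bits}b")[::-1], 2)

def transpose1(src, n):
    """Equation (1): X_d = n - 1 - Y_s, Y_d = n - 1 - X_s."""
    xs, ys = src
    return n - 1 - ys, n - 1 - xs

def transpose2(src, n):
    """Equation (2): X_d = Y_s, Y_d = X_s (n is unused, kept for symmetry)."""
    xs, ys = src
    return ys, xs

def average_latency(latencies):
    """Equation (3): mean packet latency in cycles."""
    return sum(latencies) / len(latencies)

print(bit_reversal(1, 4))                 # node 1 in a 4 x 4 mesh
print(transpose1((0, 0), 4), transpose2((1, 3), 4))
print(average_latency([10, 20, 30]))      # illustrative latencies only
```

For a 4 × 4 mesh, node 1 maps to node 8 under bit reversal, as in the example in the text, while Transpose 1 sends (0,0) to (3,3) and Transpose 2 sends (1,3) to (3,1).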

Saturation throughput denotes the maximum data transmission rate that a NoC can sustain when the network load reaches its peak. This metric reflects the data transmission capability of the NoC under extreme operating conditions, and it has been defined in various forms. In this paper, the saturation throughput is taken as the injection rate at which the average packet latency rises to more than double the zero-load latency [36].
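The saturation criterion can be applied to a latency curve as in the following sketch. The function name is hypothetical, the zero-load latency is passed in explicitly, and the sample curve is made-up data for illustration, not a result from the paper.

```python
def saturation_throughput(samples, zero_load_latency):
    """Return the lowest injection rate whose average latency exceeds
    twice the zero-load latency, or None if the curve never saturates.

    samples -- iterable of (injection_rate, average_latency) pairs
    """
    for rate, latency in sorted(samples):
        if latency > 2 * zero_load_latency:
            return rate
    return None

# Hypothetical latency curve (injection rate, average latency in cycles).
curve = [(0.01, 10.0), (0.05, 12.0), (0.10, 18.0), (0.15, 25.0), (0.20, 60.0)]
print(saturation_throughput(curve, zero_load_latency=10.0))
```

With a zero-load latency of 10 cycles the threshold is 20 cycles, and the first sampled rate above it is 0.15. The resolution of the result is limited by the spacing of the sampled injection rates.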

4.2. Simulation Results

The average packet latency of the three routing strategies in a 4 × 4 2D mesh network under four synthetic traffic patterns is shown in Figure 7. For the hotspot traffic pattern, the four nodes (1,1), (2,1), (1,2) and (2,2) in the center of the network are taken as hotspots, and the hotspot percentage is 10%. The results indicate that when the injection rate is below 0.03, the average packet latencies of the three algorithms do not differ significantly. As the injection rate increases, the network load increases. XY routing performs poorly because it does not consider congestion information, while DAHR outperforms the other two schemes by optimizing the pipeline and balancing network load through its routing policy. The improvement in average packet latency achieved by DAHR under the four traffic patterns at an injection rate of 0.1 is listed in Table 3. Specifically, DAHR reduces average packet latency by at least 13.9% compared to XY and by at least 8.6% compared to OE.

Expanding the network scale to 8 × 8, Figure 8 displays the average packet latency of the three routing strategies under four traffic patterns. For the hotspot traffic pattern, nodes (2,2), (2,5), (5,2) and (5,5) are taken as hotspots, and the hotspot percentage is still 10%. As the network size increases, routing paths become longer, potentially leading to increased congestion. Packets using traditional routing selection strategies may encounter blockages at upstream routers. DAHR is suited to larger networks because it effectively distributes the load among input VCs. Table 4 lists the improvement in average packet latency achieved by DAHR under the four traffic patterns at an injection rate of 0.1. The DAHR router reduces average packet latency by at least 9.9% compared to XY and by at least 5.8% compared to OE across the four traffic patterns.

The saturation throughput of the three routing strategies at two network scales is shown in Figure 9. The results show that the proposed DAHR attains the highest saturation throughput, owing to its simple algorithm and its use of congestion information. Table 5 presents the saturation throughput of the three routing schemes. In a 4 × 4 mesh, DAHR’s saturation throughput is at least 18.0% and 9.0% higher than that of the XY and OE schemes, respectively. DAHR also performs better in larger networks: in an 8 × 8 mesh, its saturation throughput is at least 19.7% and 10.2% higher than that of the XY and OE schemes, respectively.

To assess the hardware overhead in terms of layout area and power consumption, the RTL (register transfer level) code of the XY, OE, and proposed DAHR routers was written in Verilog. The three router designs were synthesized with Synopsys Design Compiler using the TSMC 130 nm process library at 500 MHz and 1.08 V. Figure 10 shows the hardware overhead of the routers. Because of its simplicity, the XY router has the lowest hardware overhead. The layout area and power consumption of DAHR are comparable to those of OE, which indicates that the DAHR algorithm does not add further complexity to the circuit. Taken together with the simulation results, DAHR improves performance without increasing hardware overhead relative to OE.

5. Conclusions

In network-on-chip (NoC) architectures, routing algorithms play a crucial role in network performance. In this paper, we propose a deadlock-free deterministic–adaptive hybrid routing (DAHR) algorithm that combines deterministic and adaptive information. The positional relationship between the source and destination nodes and the numbers of horizontal and vertical hops are calculated before packet injection and recorded in the head flit. When a packet is routed, the numbers of free virtual channels (VCs) at the downstream nodes are compared, and the output port is selected among the directions given by the recorded directional information. Simulation results demonstrate that, across both network scales, the proposed DAHR reduces average packet latency by at least 9.9% and 5.8% and improves saturation throughput by at least 18.0% and 9.0% compared to the XY and OE routing schemes, respectively. The algorithm achieves these gains without increasing hardware overhead.
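The selection step summarized above can be illustrated with a minimal Python sketch. It is not the authors' RTL: the port labels (`X+`, `X-`, `Y+`, `Y-`, `LOCAL`), the function name, the use of remaining hop counts, and the tie-break in favor of the X dimension are all assumptions made for illustration. The sketch also omits the turn restrictions of Figure 6 that the paper relies on for deadlock freedom, so it shows only the free-VC comparison among productive directions.

```python
from typing import Dict, Optional


def select_output_port(x_hop: int, y_hop: int,
                       rd_x: Optional[int], rd_y: Optional[int],
                       free_vcs: Dict[str, int]) -> str:
    """Pick an output port from the head-flit fields and downstream free-VC counts.

    x_hop / y_hop: remaining hops in each dimension (assumed to be decremented per hop).
    rd_x / rd_y:   routing-direction bits as in Table 1 (0 = increasing coordinate,
                   1 = decreasing coordinate, None = invalid).
    free_vcs:      hypothetical map from port label to free VCs at the downstream node.
    NOTE: the deadlock-avoidance turn restrictions are deliberately not modeled here.
    """
    if x_hop == 0 and y_hop == 0:
        return "LOCAL"  # packet has arrived; eject to the local core

    # Productive port in each dimension, if any hops remain there.
    x_port = None if x_hop == 0 else ("X+" if rd_x == 0 else "X-")
    y_port = None if y_hop == 0 else ("Y+" if rd_y == 0 else "Y-")

    if x_port is None:
        return y_port
    if y_port is None:
        return x_port

    # Both dimensions are productive: prefer the neighbor with more free VCs.
    # Ties go to the X dimension (an assumption, not taken from the paper).
    if free_vcs.get(x_port, 0) >= free_vcs.get(y_port, 0):
        return x_port
    return y_port
```

For example, `select_output_port(2, 1, 0, 1, {"X+": 1, "Y-": 3})` returns `"Y-"`, because the downstream node in the Y direction reports more free VCs than the one in the X direction.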

Author Contributions

Conceptualization, N.J. and Y.Y.; methodology, N.J. and Y.Y.; software, N.J.; validation, N.J.; formal analysis, N.J.; investigation, N.J.; resources, Y.Y.; data curation, N.J.; writing—original draft preparation, N.J.; writing—review and editing, Y.Y.; visualization, N.J.; supervision, Y.Y.; project administration, Y.Y.; funding acquisition, Y.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Possible paths between source–destination nodes in 2D mesh.

Figure 2. Routing direction (RD).

Figure 3. The structure of the NoC packet.

Figure 4. Microarchitecture of the DAHR router.

Figure 5. Two categories of loops that may lead to a deadlock in a 2D mesh.

Figure 6. Turns allowed for different RDs. (a) RD = 00; (b) RD = 10; (c) RD = 11; (d) RD = 01.

Figure 7. Average packet latency in 4 × 4 2D mesh. (a) Bit reversal; (b) Transpose 1; (c) Transpose 2; (d) Hotspot.

Figure 8. Average packet latency in 8 × 8 2D mesh. (a) Bit reversal; (b) Transpose 1; (c) Transpose 2; (d) Hotspot.

Figure 9. Saturation throughput in two sizes of networks. (a) Saturation throughput in 4 × 4 mesh; (b) Saturation throughput in 8 × 8 mesh.

Figure 10. Hardware overhead of three router schemes. (a) Layout area of routers; (b) Power consumption of routers.

Table 1. Information determined by source and destination nodes.

No.	Coordinate Relationship	X_Hop	Y_Hop	RD_X	RD_Y
1	X_d = X_s, Y_d = Y_s	0	0	invalid	invalid
2	X_d > X_s, Y_d > Y_s	X_d-X_s	Y_d-Y_s	0	0
3	X_d = X_s, Y_d > Y_s	0	Y_d-Y_s	invalid	0
4	X_d < X_s, Y_d > Y_s	X_s-X_d	Y_d-Y_s	1	0
5	X_d < X_s, Y_d = Y_s	X_s-X_d	0	1	invalid
6	X_d < X_s, Y_d < Y_s	X_s-X_d	Y_s-Y_d	1	1
7	X_d = X_s, Y_d < Y_s	0	Y_s-Y_d	invalid	1
8	X_d > X_s, Y_d < Y_s	X_d-X_s	Y_s-Y_d	0	1
9	X_d > X_s, Y_d = Y_s	X_d-X_s	0	0	invalid
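The mapping in Table 1 can be expressed compactly in code. The following Python sketch computes the hop counts and routing-direction bits from the source and destination coordinates. The names `RouteInfo` and `encode_route_info` are hypothetical, and `None` is used here to stand for the "invalid" entries; how these fields are packed into the head flit is not modeled.

```python
from typing import NamedTuple, Optional, Tuple


class RouteInfo(NamedTuple):
    x_hop: int            # number of horizontal hops
    y_hop: int            # number of vertical hops
    rd_x: Optional[int]   # 0 if X_d > X_s, 1 if X_d < X_s, None if "invalid"
    rd_y: Optional[int]   # 0 if Y_d > Y_s, 1 if Y_d < Y_s, None if "invalid"


def encode_route_info(src: Tuple[int, int], dst: Tuple[int, int]) -> RouteInfo:
    """Derive the Table 1 fields for a (source, destination) pair."""
    x_s, y_s = src
    x_d, y_d = dst
    dx, dy = x_d - x_s, y_d - y_s

    # A direction bit is only meaningful when there is distance to cover.
    rd_x = None if dx == 0 else (0 if dx > 0 else 1)
    rd_y = None if dy == 0 else (0 if dy > 0 else 1)
    return RouteInfo(abs(dx), abs(dy), rd_x, rd_y)


if __name__ == "__main__":
    # Row 8 of Table 1: X_d > X_s and Y_d < Y_s.
    print(encode_route_info((1, 2), (3, 0)))
```

Running the example prints `RouteInfo(x_hop=2, y_hop=2, rd_x=0, rd_y=1)`, which matches row 8 of Table 1.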

Table 2. Simulation parameters.

Simulation Settings	Configuration
Network scale	4 × 4 and 8 × 8 (2D mesh)
Routing algorithms	XY, OE, DAHR
Flit width	64 bits
Packet length	3–5 flits
Buffer size	5 (flits) × 4 (VCs)
Traffic patterns	Bit reversal, Transpose 1, Transpose 2, Hotspot

Table 3. Reduction in average packet latency achieved by DAHR relative to XY and OE in a 4 × 4 2D mesh under four traffic patterns.

Comparison	Bit Reversal	Transpose 1	Transpose 2	Hotspot
DAHR vs. XY	17.5%	13.9%	18.8%	14.8%
DAHR vs. OE	10.4%	8.6%	15.1%	8.7%

Table 4. Reduction in average packet latency achieved by DAHR relative to XY and OE in an 8 × 8 2D mesh under four traffic patterns (injection rate of 0.1).

Comparison	Bit Reversal	Transpose 1	Transpose 2	Hotspot
DAHR vs. XY	19.0%	11.9%	17.7%	9.9%
DAHR vs. OE	12.0%	7.6%	14.8%	5.8%

Table 5. Improvement in saturation throughput achieved by DAHR relative to XY and OE in 4 × 4 and 8 × 8 meshes under four traffic patterns.

Network scale	Comparison	Bit Reversal	Transpose 1	Transpose 2	Hotspot
4 × 4 mesh	DAHR vs. XY	30.6%	36.3%	41.5%	18.0%
4 × 4 mesh	DAHR vs. OE	14.3%	9.0%	21.0%	9.0%
8 × 8 mesh	DAHR vs. XY	39.5%	35.3%	33.3%	19.7%
8 × 8 mesh	DAHR vs. OE	12.5%	16.9%	16.7%	10.2%
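The "at least" figures quoted in the text appear to be the per-row minima of Tables 3 to 5. As a quick consistency check, the short Python sketch below transcribes the table values (in percent, in the column order Bit Reversal, Transpose 1, Transpose 2, Hotspot) and takes the minimum over the four traffic patterns. The dictionary layout and the helper name `worst_case` are choices made for this illustration only.

```python
# Latency reductions (%) from Tables 3 and 4.
latency_gain = {
    "4x4": {"XY": [17.5, 13.9, 18.8, 14.8], "OE": [10.4, 8.6, 15.1, 8.7]},
    "8x8": {"XY": [19.0, 11.9, 17.7, 9.9], "OE": [12.0, 7.6, 14.8, 5.8]},
}

# Saturation throughput improvements (%) from Table 5.
throughput_gain = {
    "4x4": {"XY": [30.6, 36.3, 41.5, 18.0], "OE": [14.3, 9.0, 21.0, 9.0]},
    "8x8": {"XY": [39.5, 35.3, 33.3, 19.7], "OE": [12.5, 16.9, 16.7, 10.2]},
}


def worst_case(table):
    """Return the smallest gain over the four traffic patterns for each row."""
    return {
        mesh: {baseline: min(values) for baseline, values in rows.items()}
        for mesh, rows in table.items()
    }


if __name__ == "__main__":
    print("latency:", worst_case(latency_gain))
    print("throughput:", worst_case(throughput_gain))
```

The minima for the 8 × 8 latency rows are 9.9% (XY) and 5.8% (OE), and those for the throughput rows are 18.0% and 9.0% in the 4 × 4 mesh and 19.7% and 10.2% in the 8 × 8 mesh, in agreement with the values stated in the text.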

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
