Design of Identical Strictly and Rearrangeably Nonblocking Folded Clos Networks with Equally Sized Square Crossbars

Yamin Li

doi:10.3390/computers14070293

Computer Architecture Laboratory, Department of Computer Science, Faculty of Computer and Information Sciences, Hosei University, Tokyo 184-8584, Japan

Computers 2025, 14(7), 293; https://doi.org/10.3390/computers14070293

Abstract

Clos networks and their folded versions, fat trees, are widely adopted in interconnection network designs for data centers and supercomputers. There are two main types of Clos networks: strictly nonblocking Clos networks and rearrangeably nonblocking Clos networks. Strictly nonblocking Clos networks can connect an idle input to an idle output without interfering with existing connections. Rearrangeably nonblocking Clos networks can connect an idle input to an idle output with rearrangements of existing connections. Traditional strictly nonblocking Clos networks have two drawbacks. One drawback is the use of crossbars with different numbers of input and output ports, whereas the currently available switches are square crossbars with the same number of input and output ports. Another drawback is that every connection goes through a fixed number of stages, increasing the length of the communication path. A drawback of traditional fat trees is that the root stage uses differently sized crossbar switches than the other stages. To solve these problems, this paper proposes an Identical Strictly NonBlocking folded Clos (ISNBC) network that uses equally sized square crossbars for all switches. Correspondingly, this paper also proposes an Identical Rearrangeably NonBlocking folded Clos (IRNBC) network. Both ISNBC and IRNBC networks can have any number of stages, can use equally sized square crossbars with no unused switch ports, and can utilize shortcut connections to reduce communication path lengths. Moreover, both ISNBC and IRNBC networks have a lower switch crosspoint cost ratio relative to a single crossbar than their corresponding traditional Clos networks. Specifically, ISNBC networks use 46.43% to 87.71% of the crosspoints of traditional strictly nonblocking folded Clos networks, and IRNBC networks use 53.85% to 60.00% of the crosspoints of traditional rearrangeably nonblocking folded Clos networks.

Keywords:

multistage interconnection networks; strictly nonblocking; rearrangeably nonblocking; Clos network; folded Clos network; fat tree; k-ary n-tree; square crossbar switch; crosspoint ratio

1. Introduction

Clos networks [1] and fat trees [2] have been widely used in interconnection network designs for modern data centers and supercomputers such as Google Jupiter [3], Tianhe [4], TaihuLight [5], Frontera [6], Summit and Sierra [7], and the Multi-Plane Fat Tree for DeepSeek-V3 [8].

The traditional unidirectional nonblocking Clos network was originally designed for telecommunications. It is a type of multistage circuit-switching network that replaces a single large crossbar to reduce hardware costs in terms of crosspoints.

A three-stage traditional unidirectional Clos network topology [1] is parameterized by two integers (n and m, where n is the number of sources connecting to an ingress-stage crossbar switch or the number of destinations connecting to an egress-stage crossbar switch and m is the number of crossbar switches in the middle stage). The ingress stage has n crossbar switches, and the number of crossbar switches in the egress stage is also n. Therefore, the total number of sources is

n^{2}

, and the total number of destinations is also

n^{2}

. A switch in the ingress stage is an

n \times m

(n inputs and m outputs) crossbar. A switch in the egress stage is an

m \times n

crossbar. A switch in the middle stage is an

n \times n

crossbar. There is exactly one connection between each ingress-stage switch and each middle-stage switch, and there is exactly one connection between each middle-stage switch and each egress-stage switch.

m \geq 2 n - 1

indicates a strictly nonblocking network, meaning that the network can connect a free source to a free destination without interfering with existing connections.

m \geq n

indicates a rearrangeably nonblocking network, meaning that the network can connect a free source to a free destination with rearrangements of existing connections.
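The two conditions above can be expressed as a small check. The following is a minimal Python sketch; the function name `classify_clos` is a hypothetical helper introduced only for illustration, and the "blocking" label for the remaining case (m < n) is this sketch's own wording.

```python
def classify_clos(n: int, m: int) -> str:
    """Classify a three-stage Clos network.

    n: sources per ingress switch (destinations per egress switch)
    m: number of middle-stage switches
    """
    if n < 1 or m < 1:
        raise ValueError("n and m must be positive integers")
    if m >= 2 * n - 1:
        return "strictly nonblocking"
    if m >= n:
        return "rearrangeably nonblocking"
    return "blocking"


if __name__ == "__main__":
    for n, m in [(4, 7), (4, 4), (4, 3)]:
        print(f"n={n}, m={m}: {classify_clos(n, m)}")
```

Note that every strictly nonblocking network also satisfies m ≥ n, so the order of the two tests matters.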

Traditional unidirectional strictly nonblocking Clos networks have two drawbacks. One drawback is that all connections from sources to destinations pass through a fixed number of stages. The source is connected to the switch in the ingress stage, and the destination is connected to the switch in the egress stage. Let s be the number of stages of a Clos network. Then, the path length from the ingress stage to the egress stage is

s - 1

. Counting the path from the source to the ingress-stage switch and the path from the egress-stage switch to the destination, the path length is

1 + (s - 1) + 1 = s + 1

. For example, the path length in a three-stage unidirectional Clos network is always 4. It is not possible to exploit the locality of source–destination pairs. Another drawback is that it uses

n \times m

and

m \times n

crossbars with different numbers of input and output ports. However, nowadays, the available switches are square crossbars with the same number of input and output ports. If the same square crossbars are used for all switches, there will be many unused ports.

A fat tree is a folded version of a Clos network [9]. It merges the corresponding ingress and egress switches. Then, the merged stage is called a leaf stage, and the middle stage is called a root stage. A fat tree can utilize shortcut connections, that is, a connection does not have to go through all stages. For example, if the source and destination are connected to the same leaf switch, the connection does not need to go to the root stage. Thus, the path length is 2 instead of 4. A drawback of fat-tree networks is that the root stage uses differently sized crossbar switches than the other stages.

A k-ary n-tree [10] is a parametric fat tree in which k is the arity, that is, the number of links of a switch that connect to the previous or next stage, and n is the number of stages; the switch radix is therefore

2 k

. A k-ary n-tree Clos network can be constructed with two back-to-back k-ary n-fly butterflies [11]. A 2-ary n-tree Clos network is also called a Beneš network [12]. A k-ary fat tree [13] is a bidirectional (

k / 2

)-ary n-tree Clos network where k is even.

A packet can be routed to an arbitrary middle switch in a rearrangeably nonblocking Clos network or an arbitrary root switch in a rearrangeably nonblocking folded Clos network, then to its ultimate destination. This increases hardware cost and packet latency. Mirrored k-ary n-tree networks [14] and peer k-ary n-tree networks [15] focus on increasing network capacity and reducing hardware costs and packet latency.

A strictly nonblocking folded Clos network using same-sized crossbar switches was proposed in [16]. The proposed network has a two-stage structure and uses multiple links between a leaf switch and a root switch. It may have unused switch ports. The number of unused switch ports is reduced by adjusting the number of leaf switches, but the existence of unused switch ports increases the hardware cost. A flexible folded Clos network was proposed in [17]. To reduce the blocking probability, a second group of switches is added to the root stage. Ref. [18] extended the number of groups from two to a general number (S). All these networks have only two stages, making it difficult to scale the network.

The existing issues are summarized below. Traditional unidirectional strictly nonblocking Clos networks use switches with different numbers of input and output ports and route packets through a fixed number of stages. Fat trees, the folded version of Clos networks, use differently sized switches at the root and other stages. Recently proposed Clos networks have only two stages and introduce unused ports on the switches. This paper attempts to solve these problems with a low crosspoint ratio relative to a single crossbar.

The contributions of this paper are summarized as follows. An Identical Strictly NonBlocking folded Clos (ISNBC) network and an Identical Rearrangeably NonBlocking folded Clos (IRNBC) network are proposed. Both ISNBC and IRNBC networks can have any number of stages to increase the system’s scalability, can use equally sized square crossbars with no unused switch ports to accommodate currently available switches at low costs, and can utilize shortcut connections to reduce communication path lengths. Moreover, both ISNBC and IRNBC networks have a lower crosspoint ratio relative to a single crossbar than their corresponding traditional nonblocking Clos networks.

The rest of this paper is organized as follows. Section 2 reviews some related multistage interconnection networks. Section 3 proposes identical strictly and rearrangeably nonblocking folded Clos networks consisting of equally sized square crossbars. Section 4 evaluates the hardware cost from the crosspoint perspective and shows that the costs of the proposed networks are lower than those of their corresponding traditional networks. Finally, Section 5 concludes the paper and suggests some future research topics.

3. Proposed Identical Nonblocking Folded Clos Networks

This section first presents Unidirectional Strictly NonBlocking Clos (USNBC) networks and Unidirectional Rearrangeably NonBlocking Clos (URNBC) networks. Based on these unidirectional networks, the construction methods of identical strictly nonblocking folded Clos (ISNBC) networks and identical rearrangeably nonblocking folded Clos (IRNBC) networks, as listed in Table 1, are presented. Note that the number of stages in unidirectional networks is odd, e.g.,

2 s - 1 = 3, 5, 7

for

s = 2, 3, 4

, where s is the number of stages for corresponding identical folded networks. Here, only the cases of

s = 2

, 3, and 4 are shown, but for

s > 4

, USNBC, URNBC, ISNBC, and IRNBC networks can be constructed similarly.

Table 1. Proposed nonblocking Clos networks.

3.1. Proposed Identical Strictly NonBlocking Folded Clos (ISNBC) Networks

As mentioned before, there are two drawbacks of traditional strictly nonblocking Clos networks. One is the use of

n \times m

and

m \times n

crossbars with unequal numbers of input and output ports. Another is that all connections pass through a fixed number of stages. A drawback of traditional nonblocking folded Clos networks is that the root stage uses differently sized crossbar switches than the other stages. These problems can be solved by using an ISNBC network consisting of equally sized square crossbars. This subsection presents USNBC networks and the corresponding ISNBC networks.

3.1.1. Two-Stage ISNBC Networks

To construct a two-stage ISNBC network, a three-stage USNBC network is constructed first. In a three-stage USNBC network, let n be the number of inputs per switch in the ingress stage, m be the number of switches in the middle stage, and r be the number of switches in the ingress and egress stages. Let

m = 2 n

to ensure the strictly nonblocking property. Let

r = n + m = 3 n

to ensure that the ISNBC network uses equally sized square crossbars.

In the ingress stage, there are r switches, and each switch is an

n \times m

crossbar (n inputs and m outputs); in the middle stage, there are m switches, and each switch is an

r \times r

crossbar; and in the egress stage, there are r switches, and each switch is an

m \times n

crossbar. Each output of a switch in the ingress stage is connected to an input of a different switch in the middle stage. Since there are r switches in the ingress stage, a switch in the middle stage must have r inputs, one for each of the r ingress switches. Each output of a switch in the middle stage is connected to an input of a different switch in the egress stage. Because a switch in the middle stage also has r outputs, its outputs reach all r switches in the egress stage, one output per switch. Then, the number of compute nodes is

N = n r = n \times 3 n = 3 n^{2}

.

In summary, to construct a three-stage USNBC network, m and r are determined as shown in Formula (1), where N is the number of compute nodes.

\left\{ \begin{aligned} m &= 2 n \\ r &= n + m = 3 n \\ N &= n r = 3 n^{2} \end{aligned} \right. \qquad \text{(3-stage USNBC)}

(1)
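As a quick check of Formula (1), the following Python sketch derives m, r, and N from n and lists the switch shapes of each stage. The function name `three_stage_usnbc` and the dictionary layout are assumptions made for illustration; only the relations m = 2n, r = n + m, and N = nr come from the text.

```python
def three_stage_usnbc(n: int) -> dict:
    """Parameters of a three-stage USNBC network for a given n (Formula (1))."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    m = 2 * n      # middle switches; satisfies m >= 2n - 1
    r = n + m      # ingress (and egress) switches
    return {
        "m": m,
        "r": r,
        "N": n * r,
        # each entry is (inputs, outputs, number of switches)
        "ingress": (n, m, r),
        "middle": (r, r, m),
        "egress": (m, n, r),
        # merged and expanded ingress/egress pair of the folded version
        "folded_leaf": (n + m, m + n, r),
    }


if __name__ == "__main__":
    for n in (2, 3):
        p = three_stage_usnbc(n)
        print(f"n={n}: m={p['m']}, r={p['r']}, N={p['N']}, "
              f"middle={p['middle'][:2]}, folded leaf={p['folded_leaf'][:2]}")
```

Because r = n + m, the middle switch and the merged leaf switch have the same square shape for every n, which is the property the folded construction relies on.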

To construct a folded version of a three-stage USNBC network, the corresponding position switches in the ingress and egress stages are merged and expanded so that there are

n + m

inputs and

m + n

outputs in the combined switch. Then, these switches have the same number of inputs and outputs, which is

n + m

. The switch in the middle stage has r inputs and r outputs with

r = n + m

. Therefore, all switches in the folded Clos network use equally sized

(n + m) \times (m + n) = 3 n \times 3 n

crossbars.

Figure 1a shows a three-stage USNBC network with

n = 2

,

m = 2 n = 4

, and

r = n + m = 6

. It has

N = n r = 3 n^{2} = 12

compute nodes. A two-stage ISNBC network, the folded version of the three-stage USNBC network with

n = 2

,

m = 2 n = 4

, and

r = n + m = 6

, is shown in Figure 1b. It uses equally sized

(n + m) \times (m + n) = 6 \times 6

square crossbars for all switches. It can utilize shortcut connections to reduce communication path lengths. For example, if the source and destination nodes are connected to the same leaf switch, the communication does not need to go through the root switch.

Figure 1. Proposed strictly nonblocking Clos networks (

m = 2 n

and

r = n + m

). (a) A 3-stage USNBC network with

n = 2

,

m = 2 n = 4

, and

r = n + m = 6

. (b) A 2-stage ISNBC network composed of equally sized

(n + m) \times (m + n) = 6 \times 6

square crossbars.

The root switch in Figure 1b and the middle switch in Figure 1a are the same; both are

r \times r

crossbar switches. However, the leaf switch in Figure 1b is not simply a combination of the

n \times m

ingress switch and the

m \times n

egress switch in Figure 1a. A leaf switch is a square

(n + m) \times (m + n)

crossbar.

Referring to Figure 2, for

n = 2

and

m = 2 n = 4

, Figure 2a shows an

n \times m = 2 \times 4

crossbar and an

m \times n = 4 \times 2

crossbar. The number of crosspoints is

n \times m + m \times n = 4 n^{2} = 16

. There are no paths from inputs

x_{0}

and

x_{1}

to outputs

y_{4}

and

y_{5}

. Similarly, there are no paths from inputs

x_{2}

,

x_{3}

,

x_{4}

, and

x_{5}

to outputs

y_{0}

,

y_{1}

,

y_{2}

, and

y_{3}

. Figure 2b shows an

(n + m) \times (m + n) = 6 \times 6

crossbar. The number of crosspoints is

(n + m) \times (m + n) = 9 n^{2} = 36

. Any input (

x_{i}

) can be routed to any output (

y_{j}

) for

i, j = 0, 1, \dots, 5

. A crosspoint can be implemented using two two-to-one multiplexers, as shown in Figure 2c.

Figure 2. Merging and expanding an

n \times m

crossbar switch and an

m \times n

crossbar switch to a big square

(n + m) \times (m + n)

crossbar switch for

n = 2

and

m = 2 n = 4

. (a) An

n \times m

crossbar switch and an

m \times n

crossbar switch, each with 8 crosspoints. (b) A square

(n + m) \times (m + n)

crossbar switch with 36 crosspoints. The red line shows the path of

x_{0} \to y_{1}

. (c) Crosspoint states and implementation using two 2-to-1 multiplexers.
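The difference between the two separate crossbars and the merged square crossbar can be made concrete by enumerating the reachable (input, output) pairs. This is a minimal Python sketch; the function names are hypothetical, and the port numbering simply follows the description of Figure 2a in the text (inputs x_0 to x_{n-1} reach outputs y_0 to y_{m-1}, and the remaining inputs reach the remaining outputs).

```python
def separate_pairs(n: int, m: int) -> set:
    """Reachable (input, output) index pairs with an n x m and an m x n crossbar."""
    ingress = {(i, j) for i in range(n) for j in range(m)}
    egress = {(i, j) for i in range(n, n + m) for j in range(m, m + n)}
    return ingress | egress


def merged_pairs(n: int, m: int) -> set:
    """Reachable pairs in the square (n + m) x (m + n) crossbar."""
    size = n + m
    return {(i, j) for i in range(size) for j in range(size)}


if __name__ == "__main__":
    n, m = 2, 4
    sep, merged = separate_pairs(n, m), merged_pairs(n, m)
    # One crosspoint per reachable pair, so the set sizes are the crosspoint counts.
    print(len(sep), len(merged))
    print(sorted(merged - sep)[:4])   # a few pairs that only the merged switch offers
```

For m = 2n, the two set sizes are 4n² and 9n², matching the crosspoint counts given above.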

Figure 2b is a

6 \times 6

crossbar switch. The input port (

x_{i}

) and output port (

y_{i}

) form a bidirectional port (

p_{i}

, where

i = 0, 1, \dots, 5

). An example of a shortcut is shown in Figure 3, where

p_{0}

and

p_{1}

connect to compute nodes 1 and 2, respectively. Then, node 1 can send packets to node 2 only through leaf switch 1, thereby reducing the communication path length.

Figure 3. A shortcut in an ISNBC network. When a packet is sent from a source (node 1) to a destination (node 2), it does not need to go through the root switch.
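The effect of a shortcut on the path length in a two-stage folded network can be sketched as follows. The function name and the node numbering (nodes counted from 0, with node k attached to leaf switch k // n) are assumptions of this sketch; the text numbers nodes from 1.

```python
def path_length_two_stage(src: int, dst: int, n: int) -> int:
    """Number of links between two compute nodes in a two-stage folded network.

    Assumes nodes are numbered from 0 and node k attaches to leaf switch k // n.
    """
    if src == dst:
        raise ValueError("source and destination must differ")
    if src // n == dst // n:
        return 2   # node -> leaf -> node (shortcut, root stage not used)
    return 4       # node -> leaf -> root -> leaf -> node


if __name__ == "__main__":
    n = 2
    print(path_length_two_stage(0, 1, n))   # same leaf switch
    print(path_length_two_stage(0, 2, n))   # different leaf switches
```

In the unidirectional three-stage network, both cases would have path length 4, since every connection passes through all stages.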

Figure 4 shows another three-stage USNBC network with

n = 3

,

m = 2 n = 6

, and

r = n + m = 9

. There are

r = 9

ingress switches,

r = 9

egress switches, and

m = 6

middle switches. It has

N = n r = 27

compute nodes.

Figure 4. A 3-stage USNBC network with

n = 3

,

m = 2 n = 6

, and

r = n + m = 9

.

A two-stage ISNBC network, the folded version of the three-stage USNBC network with

n = 3

,

m = 2 n = 6

, and

r = n + m = 9

, is shown in Figure 5. It uses equally sized

3 n \times 3 n = 9 \times 9

square crossbars for all switches. It can utilize shortcut connections to reduce communication path lengths.

Figure 5. A 2-stage ISNBC network with

n = 3

,

m = 2 n = 6

, and

r = n + m = 9

composed of equally sized

(n + m) \times (m + n) = 9 \times 9

square crossbar switches.

Table 2 and Table 3 show how the proposed networks differ from the traditional ones. Table 2 compares the proposed three-stage USNBC network with a traditional three-stage unidirectional strictly nonblocking Clos network. Table 3 compares the proposed two-stage ISNBC network with a traditional two-stage strictly nonblocking folded Clos network. The

[x \times y] \times z

expression in the switch column indicates that there are z switches, and each switch is an

x \times y

crossbar. N is the total number of compute nodes in the network. The ISNBC network uses equally sized square crossbar switches in the leaf and root stages.

Table 2. Comparison of three-stage unidirectional strictly nonblocking Clos networks.

Table 3. Comparison of two-stage strictly nonblocking folded Clos networks.

3.1.2. Three-Stage ISNBC Networks

To construct a three-stage ISNBC network, a five-stage USNBC network is constructed first. By using three-stage USNBC networks as building blocks, a five-stage USNBC network can be constructed. For

m = 2 n

, a three-stage USNBC network has

N = (n + m) n = 3 n^{2}

compute nodes. When it is used as a building block, its compute nodes are removed so that the three-stage USNBC network has

(n + m) n = 3 n^{2}

inputs and

(n + m) n = 3 n^{2}

outputs.

Each building block can be thought of as a virtual

3 n^{2} \times 3 n^{2}

crossbar. A total of m such building blocks are arranged in the middle stage. Then, in total, there are

3 n^{2} \times m = 6 n^{3}

inputs and

3 n^{2} \times m = 6 n^{3}

outputs in the middle stage. Correspondingly, the same number of outputs in the ingress stage and the same number of inputs in the egress stage can be arranged. Let r be the number of switches in the ingress and egress stages; then,

r m

must be equal to

6 n^{3}

. Therefore,

r = 6 n^{3} / m = 3 n^{2} \times m / m = 3 n^{2} = (n + m) n

.

In summary, to construct a five-stage USNBC network, given n as the number of inputs per switch in the ingress stage, there are

r = (n + m) n = 3 n^{2}

switches in the ingress stage, and each switch is an

n \times m

crossbar with

m = 2 n

. The middle stage has m building blocks, and each building block is a three-stage USNBC network with compute nodes removed. The egress stage has

r = (n + m) n = 3 n^{2}

switches, and each switch is an

m \times n

crossbar with

m = 2 n

, as shown in Formula (2). The linking method is similar to that of the three-stage USNBC network: each output of a switch in the ingress stage is connected to an input of a different building block in the middle stage, and each output of a building block in the middle stage is connected to an input of a different switch in the egress stage. Because

r = 3 n^{2}

, the number of compute nodes is

N = n r = 3 n^{3}

.

\left\{ \begin{aligned} m &= 2 n \\ r &= (n + m) n = 3 n^{2} \\ N &= n r = 3 n^{3} \end{aligned} \right. \qquad \text{(5-stage USNBC)}

(2)
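The derivation of r for the five-stage network (the middle stage offers m building blocks of ports, and each ingress switch supplies m outputs) can be followed step by step in code. The sketch below is a hedged illustration: the function name `usnbc_parameters` and the recursive formulation are assumptions of this sketch, written so that the same steps also cover networks with more stages.

```python
def usnbc_parameters(stages: int, n: int) -> dict:
    """Return m, r, and N for a USNBC network with 3, 5, 7, ... stages."""
    if stages < 3 or stages % 2 == 0:
        raise ValueError("stages must be an odd integer >= 3")
    if n < 1:
        raise ValueError("n must be a positive integer")
    m = 2 * n
    if stages == 3:
        r = n + m
    else:
        # A building block is the (stages - 2)-stage network with its compute
        # nodes removed, so it exposes as many inputs as it had compute nodes.
        block_inputs = usnbc_parameters(stages - 2, n)["N"]
        middle_inputs = block_inputs * m   # m building blocks in the middle stage
        r = middle_inputs // m             # each ingress switch has m outputs
    return {"m": m, "r": r, "N": n * r}


if __name__ == "__main__":
    for stages in (3, 5, 7):
        print(stages, usnbc_parameters(stages, 2))
```

For n = 2, the five-stage case gives r = 12 and N = 24, in agreement with Formula (2).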

Figure 6 shows a five-stage USNBC network with

n = 2

,

m = 2 n = 4

, and

r = 3 n^{2} = 12

. It has

N = n r = 3 n^{3} = 24

compute nodes. There are

m = 2 n = 4

building blocks in the middle stage, and each building block is a three-stage USNBC network with compute nodes removed. The detailed network of a building block is shown at the bottom of the figure. It can be seen from the figure how the switches are linked together.

Figure 6. A five-stage USNBC network with

n = 2

,

m = 2 n = 4

, and

r = 3 n^{2} = 12

. There are

m = 2 n = 4

building blocks (3-stage USNBC, Figure 1a) in the middle stage.

A three-stage ISNBC network, the folded version of the five-stage USNBC network with

n = 2

,

m = 2 n = 4

, and

r = 3 n^{2} = 12

, is shown in Figure 7. It uses equally sized

(n + m) \times (m + n) = 6 \times 6

square crossbars for all switches. Four building blocks are depicted in the two switch columns on the right, and each building block is a two-stage ISNBC network with compute nodes removed, as shown in Figure 1b. It can utilize shortcut connections to reduce communication path lengths.

Figure 7. A three-stage ISNBC network with

n = 2

composed of equally sized

(n + m) \times (m + n) = 6 \times 6

square crossbars (folded version of Figure 6).

3.1.3. Four-Stage ISNBC Networks

To construct a four-stage ISNBC network, a seven-stage USNBC network is constructed first. Similarly, by using five-stage USNBC networks as building blocks, a seven-stage USNBC network can be constructed. For

m = 2 n

, a five-stage building block has

3 n^{3}

inputs and

3 n^{3}

outputs. m such building blocks are arranged in the middle stage. Then, in total, there are

3 n^{3} \times m = 6 n^{4}

inputs and

3 n^{3} \times m = 6 n^{4}

outputs in the middle stage. Correspondingly,

r = 3 n^{3}

switches in the ingress stage and

r = 3 n^{3}

switches in the egress stage are arranged.

In summary, to construct a seven-stage USNBC network, given n, which is the number of inputs per switch in the ingress stage, there are

r = 3 n^{3}

switches in the ingress stage, and each switch is an

n \times m

crossbar with

m = 2 n

. There are m building blocks in the middle stage, and each building block is a five-stage USNBC network with compute nodes removed. There are

r = 3 n^{3}

switches in the egress stage, and each switch is an

m \times n

crossbar with

m = 2 n

, as shown in Formula (3). The linking method is similar to that in the five-stage USNBC network.

\left\{ \begin{aligned} m &= 2 n \\ r &= (n + m) n^{2} = 3 n^{3} \\ N &= n r = 3 n^{4} \end{aligned} \right. \qquad \text{(7-stage USNBC)}

(3)

Figure 8 shows a seven-stage USNBC network with

n = 2

,

m = 2 n = 4

, and

r = 3 n^{3} = 24

. It has

N = n r = 3 n^{4} = 48

compute nodes. There are

m = 2 n = 4

building blocks in the middle stage, and each building block is a five-stage USNBC network with compute nodes removed. The detailed networks of building blocks are shown at the bottom of the figure.

Figure 8. A 7-stage USNBC network with

n = 2

,

m = 2 n = 4

, and

r = 3 n^{3} = 24

. There are

m = 2 n = 4

building blocks (5-stage USNBC, Figure 6) in the middle stage.

A four-stage ISNBC network, the folded version of the seven-stage USNBC network with

n = 2

,

m = 2 n = 4

, and

r = 3 n^{3} = 24

, is shown in Figure 9. It uses equally sized

(n + m) \times (m + n) = 6 \times 6

square crossbars for all switches. Four three-stage ISNBC networks are shown in the three switch columns on the right. It can utilize shortcut connections to reduce communication path lengths.

Figure 9. A four-stage ISNBC network with

n = 2

composed of equally sized

(n + m) \times (m + n) = 6 \times 6

square crossbars (folded version of Figure 8).

Table 4 lists the numbers of compute nodes and switches of two-, three-, and four-stage ISNBC networks. Let s be the number of stages. Then, the number of compute nodes is

N = 3 n^{s}

, and the number of switches is

(2^{s + 1} - 3) n^{s - 1}

. The derivation of the formula is given in Theorem A1 in Appendix B. The crossbar size is listed in the right column.

Table 4. The numbers of compute nodes and switches in ISNBC networks.
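The switch-count formula can be cross-checked against the recursive construction: an s-stage ISNBC network consists of 3n^{s-1} leaf switches plus m = 2n building blocks, each of which is an (s − 1)-stage ISNBC network. The following Python sketch performs this check; both function names are hypothetical, and the recursion is this sketch's reading of the construction rather than the proof given in Appendix B.

```python
def isnbc_switches_recursive(s: int, n: int) -> int:
    """Count switches of an s-stage ISNBC network by following the construction."""
    if s < 2 or n < 1:
        raise ValueError("need s >= 2 and n >= 1")
    m = 2 * n
    if s == 2:
        return (n + m) + m             # r = 3n leaf switches plus m root switches
    leaves = 3 * n ** (s - 1)          # r of the corresponding (2s - 1)-stage USNBC
    return leaves + m * isnbc_switches_recursive(s - 1, n)


def isnbc_switches_closed(s: int, n: int) -> int:
    """Closed form stated in the text: (2^(s+1) - 3) * n^(s-1)."""
    return (2 ** (s + 1) - 3) * n ** (s - 1)


if __name__ == "__main__":
    for s in (2, 3, 4):
        print(s, isnbc_switches_recursive(s, 2), isnbc_switches_closed(s, 2))
```

For n = 2, both functions give 10, 52, and 232 switches for s = 2, 3, and 4.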

3.2. Proposed Identical Rearrangeably NonBlocking Folded Clos (IRNBC) Networks

This subsection presents a URNBC network and the corresponding IRNBC network composed of square crossbars of the same size.

3.2.1. Two-Stage IRNBC Networks

To construct a two-stage IRNBC network, a three-stage URNBC network is constructed first. In a three-stage URNBC network, let n be the number of inputs per switch in the ingress stage, m be the number of switches in the middle stage, and r be the number of switches in the ingress and egress stages. Let

m = n

to ensure the rearrangeably nonblocking property. Let

r = n + m = 2 n

to ensure that the IRNBC network uses equally sized square crossbars.

In the ingress stage, there are r switches, and each switch is an

n \times m

crossbar (n inputs and m outputs); in the middle stage, there are m switches, and each switch is an

r \times r

crossbar; and in the egress stage, there are r switches, and each switch is an

m \times n

crossbar. Each output of a switch in the ingress stage is connected to an input of a different switch in the middle stage. Since there are r switches in the ingress stage, a switch in the middle stage must have r inputs, one for each of the r ingress switches. Each output of a switch in the middle stage is connected to an input of a different switch in the egress stage. Because a switch in the middle stage also has r outputs, its outputs reach all r switches in the egress stage, one output per switch. Then, the number of compute nodes is

N = n r = 2 n^{2}

. In summary, to construct a three-stage URNBC network, m and r are determined as shown in Formula (4).

\left\{ \begin{aligned} m &= n \\ r &= n + m = 2 n \\ N &= n r = 2 n^{2} \end{aligned} \right. \qquad \text{(3-stage URNBC)}

(4)
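The wiring rule of the three-stage URNBC network can be written out as explicit link lists. This Python sketch is illustrative only: the function name and the specific port assignment (output j of ingress switch i goes to input i of middle switch j, and output k of middle switch j goes to input j of egress switch k) are assumptions; the text only requires that each output reach a different switch of the next stage.

```python
from collections import Counter


def urnbc_three_stage_links(n: int):
    """Build the inter-stage links of a three-stage URNBC network (m = n, r = 2n)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    m = n
    r = n + m
    # ((ingress switch, output port), (middle switch, input port))
    up = [((i, j), (j, i)) for i in range(r) for j in range(m)]
    # ((middle switch, output port), (egress switch, input port))
    down = [((j, k), (k, j)) for j in range(m) for k in range(r)]
    return m, r, up, down


if __name__ == "__main__":
    m, r, up, down = urnbc_three_stage_links(2)
    inputs_per_middle = Counter(dst[0] for _, dst in up)
    inputs_per_egress = Counter(dst[0] for _, dst in down)
    print(f"m={m}, r={r}, N={2 * r}")
    print(dict(inputs_per_middle))   # each middle switch receives r links
    print(dict(inputs_per_egress))   # each egress switch receives m links
```

Counting the links per switch confirms that a middle switch needs r inputs and an egress switch needs m inputs.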

To construct a folded version of a three-stage URNBC network, the corresponding position switches in the ingress and egress stages are merged and expanded so that there are

n + m

inputs and

m + n

outputs in the combined switch. Then, these switches have the same number of inputs and outputs, which is

n + m

. The switch in the middle stage has r inputs and r outputs with

r = n + m

. Therefore, all switches in the folded Clos network use equally sized

(n + m) \times (m + n) = 2 n \times 2 n

crossbars.

Figure 10a shows a three-stage URNBC network with

n = 2

,

m = n = 2

, and

r = n + m = 4

. It has

N = n r = 8

compute nodes. A two-stage IRNBC network, the folded version of the three-stage URNBC network with

n = 2

,

m = n = 2

, and

r = n + m = 4

, is shown in Figure 10b. It uses equally sized

2 n \times 2 n = 4 \times 4

square crossbars for all switches. It can utilize shortcut connections to reduce communication path lengths. For example, if the source and destination nodes are connected to the same leaf switch, the communication does not need to go through the root switch.

Figure 10. Proposed rearrangeably nonblocking Clos networks (

m = n

and

r = n + m

). (a) A 3-stage URNBC network with

n = 2

,

m = n = 2

, and

r = n + m = 4

. (b) A 2-stage IRNBC network composed of equally sized

(n + m) \times (m + n) = 4 \times 4

square crossbars.

Figure 11 shows the blocking and rearrangements in a rearrangeably nonblocking Clos network. Referring to Figure 11a, connection

1 \to 4

cannot be built because source node 2 (connected to the same switch as node 1) and destination node 3 (connected to the same switch as node 4) use different middle switches for their connections (

2 \to 5

and

7 \to 3

). Figure 11b shows the case of the folded version, where a bidirectional link consists of two oppositely oriented unidirectional links. Figure 11c–f show that the

1 \to 4

connection can be constructed after the rearrangement of existing connections.

Figure 11. Blocking and rearrangements in a rearrangeably nonblocking Clos network (

m = n = 2

and

r = n + m = 4

). (a) Two connections (

2 \to 5

and

7 \to 3

) were built. The

1 \to 4

connection cannot be built. (b) The case of the folded version of (a). Note that a bidirectional link consists of two oppositely oriented unidirectional links. (c) Two connections (

2 \to 5

and

7 \to 3

) were built. The

1 \to 4

connection can also be built by the rearrangements of (a). (d) The case of the folded version of (c). (e) Two connections (

2 \to 5

and

7 \to 3

) were built. The

1 \to 4

connection can also be built by the rearrangements of (a). (f) The case of the folded version of (e).
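The blocking situation and its resolution can be reproduced with a small brute-force search over middle-switch assignments. The sketch below is illustrative: all function names are hypothetical, nodes are numbered from 1 with n = 2 nodes per switch, and the specific middle switches assumed for the two existing connections (2 → 5 on middle switch 0 and 7 → 3 on middle switch 1) are assumptions, since the text only states that they use different middle switches.

```python
from itertools import product

N_PER_SWITCH = 2   # n: nodes per ingress/egress switch
NUM_MIDDLE = 2     # m = n middle switches


def switch_of(node: int) -> int:
    """Index of the ingress/egress switch of a node (nodes numbered from 1)."""
    return (node - 1) // N_PER_SWITCH


def is_valid(assignment: dict) -> bool:
    """True if no ingress-to-middle or middle-to-egress link carries two connections."""
    up_links, down_links = set(), set()
    for (src, dst), mid in assignment.items():
        up = (switch_of(src), mid)
        down = (mid, switch_of(dst))
        if up in up_links or down in down_links:
            return False
        up_links.add(up)
        down_links.add(down)
    return True


def free_middles(assignment: dict, src: int, dst: int) -> list:
    """Middle switches that can carry src -> dst without moving existing connections."""
    return [mid for mid in range(NUM_MIDDLE)
            if is_valid({**assignment, (src, dst): mid})]


def valid_assignments(connections: list) -> list:
    """Brute-force every conflict-free middle-switch assignment."""
    found = []
    for mids in product(range(NUM_MIDDLE), repeat=len(connections)):
        candidate = dict(zip(connections, mids))
        if is_valid(candidate):
            found.append(candidate)
    return found


if __name__ == "__main__":
    existing = {(2, 5): 0, (7, 3): 1}      # assumed middle-switch choices
    print(free_middles(existing, 1, 4))    # empty list: 1 -> 4 is blocked
    for option in valid_assignments([(2, 5), (7, 3), (1, 4)]):
        print(option)
```

Under these assumptions, the search finds exactly two conflict-free assignments, in both of which 2 → 5 and 7 → 3 share one middle switch and 1 → 4 uses the other; this appears consistent with the two rearranged cases in the figure. Brute force is used here only for clarity and does not scale to large networks.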

Figure 12a shows a three-stage URNBC network with

n = 3

,

m = n = 3

, and

r = n + m = 6

. It has

N = n r = 18

compute nodes. A two-stage IRNBC network, the folded version of the three-stage URNBC network with

n = 3

,

m = n = 3

, and

r = n + m = 6

, is shown in Figure 12b. It uses equally sized

2 n \times 2 n = 6 \times 6

square crossbars.

Figure 12. Proposed rearrangeably nonblocking Clos networks (

m = n

and

r = n + m

). (a) A 3-stage URNBC network with

n = 3

,

m = n = 3

, and

r = n + m = 6

. (b) A 2-stage IRNBC network composed of equally sized

2 n \times 2 n = 6 \times 6

square crossbars.

Table 5 and Table 6 show how the proposed networks differ from the traditional ones. Table 5 compares the proposed three-stage URNBC network with a traditional three-stage unidirectional rearrangeably nonblocking Clos network. Table 6 compares the proposed two-stage IRNBC network with a traditional two-stage rearrangeably nonblocking folded Clos network. The

[x \times y] \times z

expression in the switch column indicates that there are z switches, and each switch is an

x \times y

crossbar. N is the total number of compute nodes in the network. The IRNBC network uses equally sized square crossbar switches in the leaf and root stages.

Table 5. Comparison of 3-stage unidirectional rearrangeably nonblocking Clos networks.

Table 6. Comparison of 2-stage rearrangeably nonblocking folded Clos networks.

3.2.2. Three-Stage IRNBC Networks

To construct a three-stage IRNBC network, a five-stage URNBC network is constructed first. By using three-stage URNBC networks as building blocks, a five-stage URNBC network can be constructed. For

m = n

, a three-stage URNBC network has

N = (n + m) n = 2 n^{2}

compute nodes. When it is used as a building block, its compute nodes are removed so that the three-stage URNBC network has

(n + m) n = 2 n^{2}

inputs and

(n + m) n = 2 n^{2}

outputs. m such building blocks are arranged in the middle stage. Then, in total, there are

2 n^{2} \times m = 2 n^{3}

inputs and

2 n^{2} \times m = 2 n^{3}

outputs in the middle stage. Correspondingly, the same number of outputs in the ingress stage and the same number of inputs in the egress stage can be arranged. Let r be the number of switches in the ingress and egress stages; then,

r m

must be equal to

2 n^{3}

. Therefore,

r = 2 n^{3} / m = 2 n^{2} \times m / m = 2 n^{2} = (n + m) n

.

In summary, to construct a five-stage URNBC network, given n, which is the number of inputs per switch in the ingress stage, there are

r = 2 n^{2}

switches in the ingress stage, and each switch is an

n \times m

crossbar with

m = n

. There are m building blocks in the middle stage, and each building block is a three-stage URNBC network with compute nodes removed. There are

r = 2 n^{2}

switches in the egress stage, and each switch is an

m \times n

crossbar with

m = n

, as shown in Formula (5). The linking method is similar to that of the three-stage URNBC network: each output of a switch in the ingress stage is connected to an input of a different building block in the middle stage, and each output of a building block in the middle stage is connected to an input of a different switch in the egress stage.

\left\{ \begin{aligned} m &= n \\ r &= (n + m) n = 2 n^{2} \\ N &= n r = 2 n^{3} \end{aligned} \right. \qquad \text{(5-stage URNBC)}

(5)
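The sizing rule behind Formula (5) can be sketched in a few lines of Python. This is a minimal illustration under the assumptions stated in the text ($m = n$, and $r m$ equal to the total number of building-block inputs); the function name `urnbc_parameters` and its return layout are hypothetical and not part of the paper.

```python
def urnbc_parameters(n: int, stages: int) -> dict:
    """Return m, r and N for an odd-stage URNBC network (illustrative sketch)."""
    if n < 1 or stages < 3 or stages % 2 == 0:
        raise ValueError("need n >= 1 and an odd number of stages >= 3")
    m = n              # rearrangeably nonblocking choice used in the text
    r = n + m          # three-stage base case: r = n + m = 2n
    nodes = n * r      # N = n r
    for _ in range((stages - 3) // 2):
        # m building blocks expose nodes * m inputs in total; r * m must
        # match that number, so r equals the node count of one block.
        r = (nodes * m) // m
        nodes = n * r
    return {"m": m, "r": r, "N": nodes}


if __name__ == "__main__":
    print(urnbc_parameters(2, 5))
```

With $n = 2$ and five stages, the sketch returns $r = 8$ and $N = 16$, the configuration described for Figure 13.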

Figure 13 shows a five-stage URNBC network with $n = 2$, $m = n = 2$, and $r = 2 n^{2} = 8$. It has $N = n r = 2 n^{3} = 16$ compute nodes. There are $m = n = 2$ building blocks in the middle stage, and each building block is a three-stage URNBC network with compute nodes removed. The detailed network of a building block is shown at the bottom of the figure.

Figure 13. A 5-stage URNBC network with $n = 2$, $m = n = 2$, and $r = 2 n^{2} = 8$. There are $m = n = 2$ building blocks (3-stage URNBC, Figure 10a) in the middle stage.

A three-stage IRNBC network, the folded version of the five-stage URNBC network with $n = 2$, $m = n = 2$, and $r = 2 n^{2} = 8$, is shown in Figure 14. It uses equally sized $2 n \times 2 n = 4 \times 4$ square crossbars for all switches. Two two-stage IRNBC networks (Figure 10b) are shown in the two switch columns on the right.

Figure 14. A 3-stage IRNBC network with $n = 2$ composed of equally sized $2 n \times 2 n = 4 \times 4$ square crossbars (folded version of Figure 13).

3.2.3. Four-Stage IRNBC Networks

To construct a four-stage IRNBC network, a seven-stage URNBC network is constructed first. Similarly, by using five-stage URNBC networks as building blocks, a seven-stage URNBC network can be constructed. In summary, to construct a seven-stage URNBC network, m and r are determined as shown in Formula (6). The linking method is similar to the five-stage URNBC network.

$$\left\{ \begin{aligned} m &= n \\ r &= (n + m) n^{2} = 2 n^{3} \\ N &= n r = 2 n^{4} \end{aligned} \right. \qquad \text{(7-stage URNBC)} \tag{6}$$

Figure 15 shows a seven-stage URNBC network with $n = 2$, $m = n = 2$, and $r = 2 n^{3} = 16$. It has $N = n r = 2 n^{4} = 32$ compute nodes. There are $m = n = 2$ building blocks in the middle stage, and each building block is a five-stage URNBC network with compute nodes removed. The detailed networks of the building blocks are shown at the bottom of the figure.

Figure 15. A 7-stage URNBC network with $n = 2$, $m = n = 2$, and $r = 2 n^{3} = 16$. There are $m = n = 2$ building blocks (5-stage URNBC, Figure 13) in the middle stage.

A four-stage IRNBC network, the folded version of the seven-stage URNBC network with $n = 2$, $m = n = 2$, and $r = 2 n^{3} = 16$, is shown in Figure 16. It uses equally sized $2 n \times 2 n = 4 \times 4$ square crossbars for all switches. Two three-stage IRNBC networks (Figure 14) are shown in the three switch columns on the right.

Figure 16. A 4-stage IRNBC network with $n = 2$ composed of equally sized $2 n \times 2 n = 4 \times 4$ square crossbars (folded version of Figure 15).

Table 7 lists the numbers of compute nodes and switches of two-, three-, and four-stage IRNBC networks. Let $s$ be the number of stages. Then, the number of compute nodes is $N = 2 n^{s}$, and the number of switches is $(2 s - 1) n^{s - 1}$. The derivation of the formula is given in Theorem A2 in Appendix B. The crossbar size is listed in the right column.

Table 7. The numbers of compute nodes and switches in the IRNBC networks.
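The closed forms for Table 7 can be cross-checked against the recursive construction (leaf stage plus $m$ smaller IRNBC building blocks). The following Python sketch is only an illustration; the function name `irnbc_counts` is hypothetical, and the recursion is read off from the two-, three-, and four-stage constructions described in the text.

```python
def irnbc_counts(n: int, s: int) -> tuple:
    """Return (compute nodes, switches) of an s-stage IRNBC network (sketch)."""
    if n < 1 or s < 2:
        raise ValueError("need n >= 1 and s >= 2")
    m = n                    # rearrangeably nonblocking choice, m = n
    leaves = n + m           # two-stage base case: r = 2n leaf switches
    switches = leaves + m    # plus m root switches
    for _ in range(s - 2):
        leaves *= n                        # leaf stage grows by a factor of n
        switches = leaves + m * switches   # leaf stage + m building blocks
    return n * leaves, switches            # N = n r


if __name__ == "__main__":
    for s in (2, 3, 4):
        nodes, switches = irnbc_counts(2, s)
        # Compare with the closed forms N = 2 n^s and (2s - 1) n^(s - 1).
        print(s, nodes == 2 * 2 ** s, switches == (2 * s - 1) * 2 ** (s - 1))
```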

4. Cost Evaluations

This section evaluates the hardware cost for USNBC, ISNBC, URNBC, and IRNBC networks from the perspective of switch crosspoints and compares them to the corresponding traditional Clos networks. Here, we evaluate the costs of the 24 networks listed in Table 8.

Table 8. Nonblocking Clos networks for cost evaluations.

4.1. Cost Evaluations of Strictly Nonblocking Clos Networks

This subsection investigates the crosspoint ratios relative to a single crossbar for unidirectional strictly nonblocking Clos (USNBC) networks and identical strictly nonblocking folded Clos (ISNBC) networks. These ratios are compared to those of the corresponding traditional Clos networks.

4.1.1. Cost Evaluations of USNBC Networks

An $n \times m$ crossbar ($n$ inputs and $m$ outputs) has $n m$ crosspoints. In the proposed strictly nonblocking Clos networks, $m = 2 n$. Referring to Figure 1a, in a three-stage USNBC network, there are $r = n + m = 3 n$ switches in the ingress stage, and each switch is an $n \times m = n \times 2 n$ crossbar. There are $m$ switches in the middle stage, and each switch is an $r \times r = 3 n \times 3 n$ crossbar. There are $r = m + n = 3 n$ switches in the egress stage, and each switch is an $m \times n = 2 n \times n$ crossbar. The total number of crosspoints is $n m \times r + r^{2} \times m + m n \times r = 2 n^{2} \times 3 n + 9 n^{2} \times 2 n + 2 n^{2} \times 3 n = 30 n^{3}$. There are $N = n r = 3 n^{2}$ inputs and $N = 3 n^{2}$ outputs. A single $N \times N$ crossbar requires $N \times N = 9 n^{4}$ crosspoints. The crosspoint ratio of the three-stage USNBC network relative to a single crossbar is $30 n^{3} / (9 n^{4}) = 10 / (3 n)$, which is less than 1 if $n \geq 4$. For example, when $n = 4$, the three-stage USNBC network requires $30 n^{3} = 1920$ crosspoints, fewer than the $9 n^{4} = 2304$ crosspoints of the single-crossbar implementation. In contrast, a traditional strictly nonblocking Clos network requires $n \geq 6$, as mentioned in Section 2.

Referring to Figure 6, in the five-stage case, there are $2 n^{2} \times 3 n^{2}$ crosspoints in the ingress stage, $30 n^{3} \times 2 n$ crosspoints in the middle stage, and $2 n^{2} \times 3 n^{2}$ crosspoints in the egress stage, where $30 n^{3}$ is the number of crosspoints in a three-stage USNBC network, as derived above. The total number of crosspoints is $6 n^{4} + 60 n^{4} + 6 n^{4} = 72 n^{4}$. There are $N = n r = 3 n^{3}$ inputs and $3 n^{3}$ outputs. A single $N \times N$ crossbar requires $9 n^{6}$ crosspoints. The crosspoint ratio of the five-stage USNBC network relative to a single crossbar is $72 n^{4} / (9 n^{6}) = 24 / (3 n^{2}) = 8 / n^{2}$, which is less than 1 if $n \geq 3$.

Referring to Figure 8, in the seven-stage case, there are $2 n^{2} \times 3 n^{3}$ crosspoints in the ingress stage, $72 n^{4} \times 2 n$ crosspoints in the middle stage, and $2 n^{2} \times 3 n^{3}$ crosspoints in the egress stage, where $72 n^{4}$ is the number of crosspoints in a five-stage USNBC network, as derived above. The total number of crosspoints is $6 n^{5} + 144 n^{5} + 6 n^{5} = 156 n^{5}$. There are $N = n r = 3 n^{4}$ inputs and $N = 3 n^{4}$ outputs. A single $N \times N$ crossbar requires $9 n^{8}$ crosspoints. The crosspoint ratio of the seven-stage USNBC network relative to a single crossbar is $156 n^{5} / (9 n^{8}) = 52 / (3 n^{3})$, which is less than 1 if $n \geq 3$.
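The three USNBC totals above follow one recursion: a larger network wraps $m = 2n$ copies of the smaller one between new ingress and egress stages. The Python sketch below reproduces that bookkeeping with exact fractions; the function names are hypothetical and the code is only an illustration of the arithmetic in this subsection.

```python
from fractions import Fraction


def usnbc_crosspoints(n: int, stages: int) -> tuple:
    """Return (crosspoints, compute nodes) of an odd-stage USNBC network."""
    if n < 1 or stages < 3 or stages % 2 == 0:
        raise ValueError("need n >= 1 and an odd number of stages >= 3")
    m = 2 * n                                  # strictly nonblocking choice
    r = n + m                                  # three-stage case: r = 3n
    total = n * m * r + r * r * m + m * n * r  # 30 n^3
    nodes = n * r
    for _ in range((stages - 3) // 2):
        r = nodes                              # one ingress switch per block input
        total = n * m * r + total * m + m * n * r
        nodes = n * r
    return total, nodes


def usnbc_ratio(n: int, stages: int) -> Fraction:
    """Crosspoint ratio relative to a single N x N crossbar."""
    total, nodes = usnbc_crosspoints(n, stages)
    return Fraction(total, nodes * nodes)


if __name__ == "__main__":
    print(usnbc_crosspoints(4, 3), usnbc_ratio(4, 3))
```

For $n = 4$ and three stages, the sketch reports 1920 crosspoints for 48 compute nodes, matching the example given earlier in this subsection.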

The number of crosspoints for a traditional unidirectional strictly nonblocking Clos network [1] is examined below. A three-stage traditional unidirectional strictly nonblocking Clos network has $n$ switches in the ingress stage, $m = 2 n - 1$ switches in the middle stage, and $n$ switches in the egress stage. An ingress-stage switch is an $n \times m$ crossbar, a middle-stage switch is an $n \times n$ crossbar, and an egress-stage switch is an $m \times n$ crossbar. Then, the total number of crosspoints is $n m \times n + n^{2} \times m + m n \times n = 3 n^{2} m = 3 n^{2} (2 n - 1)$. The total number of compute nodes is $N = n^{2}$. A single $N \times N$ crossbar requires $n^{4}$ crosspoints. The crosspoint ratio of a three-stage traditional unidirectional strictly nonblocking Clos network relative to a single crossbar is $3 n^{2} (2 n - 1) / n^{4} = 3 (2 n - 1) / n^{2}$. To guarantee that the ratio is less than 1, $n \geq 6$ is needed.

Consider the five-stage case. There are $n m \times n^{2}$ crosspoints in the ingress stage, $3 n^{2} (2 n - 1) \times m$ crosspoints in the middle stage, and $m n \times n^{2}$ crosspoints in the egress stage, where $3 n^{2} (2 n - 1)$ is the number of crosspoints in a three-stage traditional unidirectional strictly nonblocking Clos network, as derived above. The total number of crosspoints is $n m \times n^{2} + 3 n^{2} (2 n - 1) \times m + m n \times n^{2} = n^{3} (2 n - 1) + 3 n^{2} (2 n - 1) (2 n - 1) + n^{3} (2 n - 1) = n^{2} (8 n - 3) (2 n - 1)$. The total number of compute nodes is $N = n^{3}$. A single $N \times N$ crossbar requires $n^{6}$ crosspoints. The crosspoint ratio of a five-stage traditional unidirectional strictly nonblocking Clos network relative to a single crossbar is $n^{2} (8 n - 3) (2 n - 1) / n^{6} = (8 n - 3) (2 n - 1) / n^{4}$. To guarantee that the ratio is less than 1, $n \geq 4$ is needed.

Consider the seven-stage case. There are $n m \times n^{3}$ crosspoints in the ingress stage, $n^{2} (8 n - 3) (2 n - 1) \times m$ crosspoints in the middle stage, and $m n \times n^{3}$ crosspoints in the egress stage, where $n^{2} (8 n - 3) (2 n - 1)$ is the number of crosspoints in a five-stage traditional unidirectional strictly nonblocking Clos network, as derived above. The total number of crosspoints is $n m \times n^{3} + n^{2} (8 n - 3) (2 n - 1) \times m + m n \times n^{3} = n^{4} (2 n - 1) + n^{2} (8 n - 3) (2 n - 1) (2 n - 1) + n^{4} (2 n - 1) = n^{2} (18 n^{2} - 14 n + 3) (2 n - 1)$. The total number of compute nodes is $N = n^{4}$. A single $N \times N$ crossbar requires $n^{8}$ crosspoints. The crosspoint ratio of a seven-stage traditional unidirectional strictly nonblocking Clos network relative to a single crossbar is $n^{2} (18 n^{2} - 14 n + 3) (2 n - 1) / n^{8} = (18 n^{2} - 14 n + 3) (2 n - 1) / n^{6}$. To guarantee that the ratio is less than 1, $n \geq 3$ is needed (for $n = 3$ the ratio is $615 / 729 \approx 0.84$, whereas for $n = 2$ it is $141 / 64 > 1$).
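The thresholds on $n$ for the traditional networks can be checked numerically rather than by solving the polynomial inequalities by hand. The sketch below is a hypothetical helper, not code from the paper; it assumes the same recursive construction used in the derivation above, with $m = 2n - 1$ and $n$ ingress switches in the three-stage base case.

```python
from fractions import Fraction


def traditional_strict_crosspoints(n: int, stages: int) -> tuple:
    """Return (crosspoints, compute nodes) of a traditional strictly
    nonblocking unidirectional Clos network with an odd number of stages."""
    if n < 1 or stages < 3 or stages % 2 == 0:
        raise ValueError("need n >= 1 and an odd number of stages >= 3")
    m = 2 * n - 1                              # Clos condition m = 2n - 1
    r = n                                      # n ingress switches in the base case
    total = n * m * r + r * r * m + m * n * r  # 3 n^2 (2n - 1)
    nodes = n * r
    for _ in range((stages - 3) // 2):
        r = nodes                              # ingress switches of the larger network
        total = n * m * r + total * m + m * n * r
        nodes = n * r
    return total, nodes


def smallest_n_below_one(stages: int, limit: int = 64):
    """Smallest n in 2..limit whose crosspoint ratio is below 1, else None."""
    for n in range(2, limit + 1):
        total, nodes = traditional_strict_crosspoints(n, stages)
        if Fraction(total, nodes * nodes) < 1:
            return n
    return None


if __name__ == "__main__":
    for stages in (3, 5, 7):
        print(stages, smallest_n_below_one(stages))
```

Calling `smallest_n_below_one` for 3, 5, and 7 stages gives the smallest $n$ at which each traditional network needs fewer crosspoints than a single crossbar.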

Table 9 summarizes the crosspoint ratio relative to a single crossbar for traditional unidirectional strictly nonblocking Clos networks and USNBC networks.

Table 9. Crosspoint ratio for unidirectional strictly nonblocking Clos networks.

Figure 17 plots the crosspoint ratio relative to a single crossbar for unidirectional strictly nonblocking Clos networks, showing that USNBC networks have a lower crosspoint cost than traditional strictly nonblocking Clos networks.

Figure 17. Crosspoint ratio relative to a single crossbar in unidirectional strictly nonblocking Clos networks.

4.1.2. Cost Evaluations of ISNBC Networks

The number of crosspoints for an ISNBC network that uses equally sized square crossbars of $(n + m) \times (m + n) = 3 n \times 3 n$ for $m = 2 n$ is examined below. Referring to Figure 1b, an ISNBC network based on the three-stage USNBC network has two stages. There are $r = n + m = 3 n$ leaf switches and $m$ root switches. The total number of switches is $r + m = 5 n$, and each switch is a square $(n + m) \times (m + n) = 3 n \times 3 n$ crossbar. Then, the total number of crosspoints is $3 n \times 3 n \times 5 n = 45 n^{3}$. The total number of compute nodes is $N = n r = 3 n^{2}$.

A single $N \times N$ crossbar requires $9 n^{4}$ crosspoints. The crosspoint ratio of a two-stage ISNBC network relative to a single crossbar is $45 n^{3} / (9 n^{4}) = 5 / n$. To guarantee that the ratio is less than 1, $n \geq 6$ is needed.

Consider a three-stage ISNBC network. Referring to Figure 7, there are $r = (n + m) n = 3 n^{2}$ switches in the leaf stage; there are $m$ building blocks, and each building block is a two-stage ISNBC network whose number of switches is $5 n$, as derived above. The total number of switches is $3 n^{2} + 5 n \times 2 n = 13 n^{2}$, and each switch is a square $(n + m) \times (m + n) = 3 n \times 3 n$ crossbar. Then, the total number of crosspoints is $3 n \times 3 n \times 13 n^{2} = 117 n^{4}$. The total number of compute nodes is $N = n r = 3 n^{3}$.

A single $N \times N$ crossbar requires $9 n^{6}$ crosspoints. The crosspoint ratio of a three-stage ISNBC network relative to a single crossbar is $117 n^{4} / (9 n^{6}) = 13 / n^{2}$. To guarantee that the ratio is less than 1, $n \geq 4$ is needed.

Consider a four-stage ISNBC network. Referring to Figure 9, there are $r = (n + m) n^{2} = 3 n^{3}$ switches in the leaf stage; there are $m$ building blocks, and each building block is a three-stage ISNBC network whose number of switches is $13 n^{2}$, as derived above. The total number of switches is $3 n^{3} + 13 n^{2} \times 2 n = 29 n^{3}$, and each switch is a square $(n + m) \times (m + n) = 3 n \times 3 n$ crossbar. Then, the total number of crosspoints is $3 n \times 3 n \times 29 n^{3} = 261 n^{5}$. The total number of compute nodes is $N = n r = 3 n^{4}$.

A single $N \times N$ crossbar requires $9 n^{8}$ crosspoints. The crosspoint ratio of a four-stage ISNBC network relative to a single crossbar is $261 n^{5} / (9 n^{8}) = 29 / n^{3}$. To guarantee that the ratio is less than 1, $n \geq 4$ is needed.

Table 10 lists the crosspoints of ISNBC networks. The “Crossbar” column shows the number of crosspoints in a single crossbar. The “ISNBC” column shows the number of crosspoints in an ISNBC network. The ISNBC network requires fewer crosspoints than a single crossbar when $n \geq 6$ for the two-stage ISNBC network and $n \geq 4$ for the three- and four-stage ISNBC networks.

Table 10. The numbers of crosspoints in ISNBC networks.

The number of crosspoints for a traditional strictly nonblocking folded Clos network that uses crossbars of different sizes is examined below. A two-stage traditional strictly nonblocking folded Clos network has $n$ switches in the leaf stage and $m = 2 n - 1$ switches in the root stage. A leaf switch is an $(n + m) \times (m + n) = (3 n - 1) \times (3 n - 1)$ crossbar. A root switch is an $n \times n$ crossbar. Then, the total number of crosspoints is $(3 n - 1)^{2} \times n + n^{2} \times m = n (11 n^{2} - 7 n + 1)$. The total number of compute nodes is $N = n^{2}$.

A single $N \times N$ crossbar requires $n^{4}$ crosspoints. The crosspoint ratio of a two-stage traditional strictly nonblocking folded Clos network relative to a single crossbar is $n (11 n^{2} - 7 n + 1) / n^{4} = (11 n^{2} - 7 n + 1) / n^{3}$. To guarantee that the ratio is less than 1, $n \geq 11$ is needed.

Consider the three-stage case. There are $n^{2}$ switches in the leaf stage, and each switch is an $(n + m) \times (m + n) = (3 n - 1) \times (3 n - 1)$ crossbar. There are $m = 2 n - 1$ building blocks, and each building block has $n (11 n^{2} - 7 n + 1)$ crosspoints, as derived above. Then, the total number of crosspoints is $(3 n - 1)^{2} \times n^{2} + n (11 n^{2} - 7 n + 1) \times (2 n - 1) = n (31 n^{3} - 31 n^{2} + 10 n - 1)$. The total number of compute nodes is $N = n^{3}$. A single $N \times N$ crossbar requires $n^{6}$ crosspoints. The crosspoint ratio of a three-stage traditional strictly nonblocking folded Clos network relative to a single crossbar is $n (31 n^{3} - 31 n^{2} + 10 n - 1) / n^{6} = (31 n^{3} - 31 n^{2} + 10 n - 1) / n^{5}$. To guarantee that the ratio is less than 1, $n \geq 6$ is needed.

Consider the four-stage case. There are $n^{3}$ switches in the leaf stage, and each switch is an $(n + m) \times (m + n) = (3 n - 1) \times (3 n - 1)$ crossbar. There are $m = 2 n - 1$ building blocks, and each building block has $n (31 n^{3} - 31 n^{2} + 10 n - 1)$ crosspoints, as derived above. Then, the total number of crosspoints is $(3 n - 1)^{2} \times n^{3} + n (31 n^{3} - 31 n^{2} + 10 n - 1) \times m = n (71 n^{4} - 99 n^{3} + 52 n^{2} - 12 n + 1)$. The total number of compute nodes is $N = n^{4}$. A single $N \times N$ crossbar requires $n^{8}$ crosspoints. The crosspoint ratio of a four-stage traditional strictly nonblocking folded Clos network relative to a single crossbar is $n (71 n^{4} - 99 n^{3} + 52 n^{2} - 12 n + 1) / n^{8} = (71 n^{4} - 99 n^{3} + 52 n^{2} - 12 n + 1) / n^{7}$. To guarantee that the ratio is less than 1, $n \geq 4$ is needed.
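The ISNBC and traditional folded totals derived in this subsection can be compared directly. The Python sketch below recomputes both by recursion and divides the two crosspoint ratios (each taken relative to its own single crossbar, since the node counts differ); all function names are hypothetical and the code is an illustration only.

```python
from fractions import Fraction


def isnbc_crosspoints(n: int, s: int) -> tuple:
    """(crosspoints, nodes) of an s-stage ISNBC network with m = 2n."""
    m = 2 * n
    port = n + m                 # every switch is a 3n x 3n crossbar
    leaves = n + m               # two-stage case: r = 3n leaf switches
    switches = leaves + m        # plus m root switches
    for _ in range(s - 2):
        leaves *= n
        switches = leaves + m * switches
    return port * port * switches, n * leaves


def traditional_folded_strict_crosspoints(n: int, s: int) -> tuple:
    """(crosspoints, nodes) of an s-stage traditional folded network, m = 2n - 1."""
    m = 2 * n - 1
    leaf = (n + m) ** 2          # (3n - 1) x (3n - 1) leaf switches
    leaves = n
    total = leaf * leaves + n * n * m   # root switches are n x n
    for _ in range(s - 2):
        leaves *= n
        total = leaf * leaves + total * m
    return total, n * leaves


def relative_cost(n: int, s: int) -> Fraction:
    """ISNBC crosspoint ratio divided by the traditional crosspoint ratio."""
    ci, ni = isnbc_crosspoints(n, s)
    ct, nt = traditional_folded_strict_crosspoints(n, s)
    return Fraction(ci, ni * ni) / Fraction(ct, nt * nt)


if __name__ == "__main__":
    print(relative_cost(2, 4), float(relative_cost(2, 4)))
```

Under these assumptions, `relative_cost(2, 4)` evaluates to 464/529, roughly 87.71%, which appears consistent with the upper end of the range quoted in the abstract.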

Table 11 summarizes the crosspoint ratio relative to a single crossbar for traditional strictly nonblocking folded Clos networks and ISNBC networks. The general formula for calculating the ISNBC crosspoint ratio relative to a single crossbar is $(2^{s + 1} - 3) / n^{s - 1}$, where $s$ is the number of stages and the number of compute nodes is $N = 3 n^{s}$.

Table 11. Crosspoint ratio for strictly nonblocking folded Clos networks.
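The general ISNBC formula can be recovered from the recursive construction used in this subsection. The derivation below is a sketch; the symbol $S_s$ (the number of switches in an $s$-stage ISNBC network) is introduced here for convenience and is not notation from the paper.

```latex
\begin{align*}
S_2 &= 5n, \qquad S_s = 3n^{s-1} + 2n\,S_{s-1} \quad (s \ge 3), \\
S_s &= (2^{s+1} - 3)\,n^{s-1}
  \quad\text{because}\quad
  3n^{s-1} + 2n\,(2^{s} - 3)\,n^{s-2} = (2^{s+1} - 3)\,n^{s-1}, \\
\text{crosspoints} &= (3n)^{2}\,S_s = 9\,(2^{s+1} - 3)\,n^{s+1}, \\
\text{ratio} &= \frac{9\,(2^{s+1} - 3)\,n^{s+1}}{(3n^{s})^{2}}
  = \frac{2^{s+1} - 3}{n^{s-1}} .
\end{align*}
```

For $s = 2, 3, 4$ this gives $5/n$, $13/n^{2}$, and $29/n^{3}$, the three ratios derived case by case above.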

Table 12 lists the crosspoint ratios relative to a single crossbar for strictly nonblocking folded Clos networks. The ratios are calculated based on the formulas in Table 11.

Table 12. Crosspoint ratio relative to a single crossbar in strictly nonblocking folded Clos networks.

Figure 18 plots the crosspoint ratio relative to a single crossbar for strictly nonblocking folded Clos networks, showing that ISNBC networks have a lower crosspoint cost than traditional strictly nonblocking folded Clos networks. Also note that ISNBC networks use equally sized square crossbars for all switches in the network.

Figure 18. Crosspoint ratio relative to a single crossbar in strictly nonblocking folded Clos networks.

Figure 19 plots the crosspoint ratio relative to a single crossbar for strictly nonblocking folded Clos networks, showing that ISNBC networks have a lower crosspoint cost than the strictly nonblocking folded Clos network proposed in [16]. The network proposed in [16] is labeled “Redesign” in the figure, only supports two stages, and uses multiple links ($v = 2, 3, \dots$) between a leaf switch and a root switch. If the network is constructed using equally sized crossbar switches, there will be unused ports on the leaf switch, the root switch, or both the leaf and root switches.

Figure 19. Crosspoint ratio relative to a single crossbar in strictly nonblocking folded Clos networks.

4.2. Cost Evaluations of Rearrangeably Nonblocking Clos Networks

This subsection investigates the crosspoint ratios relative to a single crossbar for unidirectional rearrangeably nonblocking Clos (URNBC) networks and identical rearrangeably nonblocking folded Clos (IRNBC) networks. These ratios are compared to the corresponding traditional Clos networks. It is unfair to compare a rearrangeably nonblocking Clos network to a single crossbar, since a single crossbar is a strictly nonblocking network. The reason the ratios are presented here is to make it easier to see the difference between the proposed network and a traditional network.

4.2.1. Cost Evaluations of URNBC Networks

As described in the previous section, in URNBC networks, $m = n$, and in the three-stage case $r = n + m = 2 n$. Referring to Figure 10a, in a three-stage URNBC network, there are $r = n + m = 2 n$ switches in the ingress stage, and each switch is an $n \times m$ crossbar; there are $m$ switches in the middle stage, and each switch is an $r \times r$ crossbar; and there are $r = n + m = 2 n$ switches in the egress stage, and each switch is an $m \times n$ crossbar. The total number of crosspoints is $n m \times r + r^{2} \times m + m n \times r = 2 n^{3} + 4 n^{3} + 2 n^{3} = 8 n^{3}$. There are $N = n r = 2 n^{2}$ compute nodes. A single $N \times N$ crossbar requires $N \times N = 4 n^{4}$ crosspoints. The crosspoint ratio of a three-stage URNBC network relative to a single crossbar is $8 n^{3} / (4 n^{4}) = 2 / n$.

Referring to Figure 13, in a five-stage URNBC network, there are $r = (n + m) n = 2 n^{2}$ switches in the ingress stage, and each switch is an $n \times m$ crossbar; there are $m$ three-stage URNBC networks, and each URNBC network has $8 n^{3}$ crosspoints, as derived above; and there are $r = (n + m) n = 2 n^{2}$ switches in the egress stage, and each switch is an $m \times n$ crossbar. The total number of crosspoints is $n m \times 2 n^{2} + 8 n^{3} \times m + m n \times 2 n^{2} = 2 n^{4} + 8 n^{4} + 2 n^{4} = 12 n^{4}$. There are $N = n r = 2 n^{3}$ compute nodes. A single $N \times N$ crossbar requires $4 n^{6}$ crosspoints. The crosspoint ratio of a five-stage URNBC network relative to a single crossbar is $12 n^{4} / (4 n^{6}) = 3 / n^{2}$.

Referring to Figure 15, in a seven-stage URNBC network, there are $r = (n + m) n^{2} = 2 n^{3}$ switches in the ingress stage, and each switch is an $n \times m$ crossbar; there are $m$ five-stage URNBC networks, and each URNBC network has $12 n^{4}$ crosspoints, as derived above; and there are $r = (n + m) n^{2} = 2 n^{3}$ switches in the egress stage, and each switch is an $m \times n$ crossbar. The total number of crosspoints is $n m \times 2 n^{3} + 12 n^{4} \times m + m n \times 2 n^{3} = 2 n^{5} + 12 n^{5} + 2 n^{5} = 16 n^{5}$. There are $N = n r = 2 n^{4}$ compute nodes. A single $N \times N$ crossbar requires $4 n^{8}$ crosspoints. The crosspoint ratio of a seven-stage URNBC network relative to a single crossbar is $16 n^{5} / (4 n^{8}) = 4 / n^{3}$.

The number of crosspoints for a traditional unidirectional rearrangeably nonblocking Clos network is examined below. In a three-stage traditional unidirectional rearrangeably nonblocking Clos network, $m = n$ and $N = n^{2}$. The total number of crosspoints is $n m \times n + n^{2} \times m + m n \times n = n^{3} + n^{3} + n^{3} = 3 n^{3}$. A single $N \times N$ crossbar requires $n^{4}$ crosspoints. Therefore, the crosspoint ratio of a three-stage traditional unidirectional rearrangeably nonblocking Clos network relative to a single crossbar is $3 n^{3} / n^{4} = 3 / n$.

In the five-stage case, $m = n$ and $N = n^{3}$. The total number of crosspoints is $n m \times n^{2} + 3 n^{3} \times m + m n \times n^{2} = n^{4} + 3 n^{4} + n^{4} = 5 n^{4}$, where $3 n^{3}$ is the number of crosspoints in a three-stage traditional unidirectional rearrangeably nonblocking Clos network, as derived above. A single $N \times N$ crossbar requires $n^{6}$ crosspoints. Therefore, the crosspoint ratio of a five-stage traditional unidirectional rearrangeably nonblocking Clos network relative to a single crossbar is $5 n^{4} / n^{6} = 5 / n^{2}$.

In the seven-stage case, $m = n$ and $N = n^{4}$. The total number of crosspoints is $n m \times n^{3} + 5 n^{4} \times m + m n \times n^{3} = n^{5} + 5 n^{5} + n^{5} = 7 n^{5}$, where $5 n^{4}$ is the number of crosspoints in a five-stage traditional unidirectional rearrangeably nonblocking Clos network, as derived above. A single $N \times N$ crossbar requires $n^{8}$ crosspoints. Therefore, the crosspoint ratio of a seven-stage traditional unidirectional rearrangeably nonblocking Clos network relative to a single crossbar is $7 n^{5} / n^{8} = 7 / n^{3}$.

Table 13 summarizes the crosspoint ratio relative to a single crossbar for traditional unidirectional rearrangeably nonblocking Clos networks and URNBC networks. The cost ratios for the proposed URNBC networks relative to traditional networks are $66.67\%$, $60.00\%$, and $57.14\%$ for three-stage, five-stage, and seven-stage networks, respectively.

Table 13. Crosspoint ratio for unidirectional rearrangeably nonblocking Clos networks.
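The three percentages quoted for Table 13 are quotients of the crosspoint ratios derived above ($2/n$ versus $3/n$, and so on), so they do not depend on $n$. The Python sketch below recomputes them with exact fractions; the function names are hypothetical, and the only difference between the two network families in this sketch is the number of ingress switches in the three-stage base case ($2n$ for URNBC, $n$ for the traditional network).

```python
from fractions import Fraction


def unidirectional_crosspoints(n: int, m: int, r: int, stages: int) -> tuple:
    """(crosspoints, nodes) of a recursive odd-stage Clos network whose
    three-stage base case has r ingress switches of size n x m."""
    total = n * m * r + r * r * m + m * n * r
    nodes = n * r
    for _ in range((stages - 3) // 2):
        r = nodes                          # ingress switches of the larger network
        total = n * m * r + total * m + m * n * r
        nodes = n * r
    return total, nodes


def urnbc_vs_traditional(n: int, stages: int) -> Fraction:
    """URNBC crosspoint ratio divided by the traditional crosspoint ratio."""
    cu, nu = unidirectional_crosspoints(n, n, 2 * n, stages)  # URNBC: r = 2n
    ct, nt = unidirectional_crosspoints(n, n, n, stages)      # traditional: r = n
    return Fraction(cu, nu * nu) / Fraction(ct, nt * nt)


if __name__ == "__main__":
    for stages in (3, 5, 7):
        print(stages, f"{float(urnbc_vs_traditional(4, stages)):.2%}")
```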

Figure 20 plots the crosspoint ratio relative to a single crossbar for unidirectional rearrangeably nonblocking Clos networks, showing that the URNBC networks have a lower crosspoint cost than traditional rearrangeably nonblocking Clos networks.

Figure 20. Crosspoint ratio relative to a single crossbar in unidirectional rearrangeably nonblocking Clos networks.

4.2.2. Cost Evaluations of IRNBC Networks

The number of crosspoints for IRNBC networks that use equally sized square crossbars of $(n + m) \times (m + n) = 2 n \times 2 n$ for $m = n$ is examined below. Referring to Figure 10b, in a two-stage IRNBC network, there are $r = n + m = 2 n$ switches in the leaf stage and $m$ switches in the root stage. The total number of crosspoints is $(n + m) \times (m + n) \times (2 n + m) = 12 n^{3}$. There are $N = n r = 2 n^{2}$ compute nodes. A single $N \times N$ crossbar requires $N \times N = 4 n^{4}$ crosspoints. The crosspoint ratio of a two-stage IRNBC network relative to a single crossbar is $12 n^{3} / (4 n^{4}) = 3 / n$.

Referring to Figure 14, in a three-stage IRNBC network, the total number of crosspoints is $(n + m) \times (m + n) \times 2 n^{2} + 12 n^{3} \times m = 8 n^{4} + 12 n^{4} = 20 n^{4}$, where $12 n^{3}$ is the number of crosspoints in a two-stage IRNBC network, as derived above. There are $N = n r = 2 n^{3}$ compute nodes. A single $N \times N$ crossbar requires $4 n^{6}$ crosspoints. The crosspoint ratio of a three-stage IRNBC network relative to a single crossbar is $20 n^{4} / (4 n^{6}) = 5 / n^{2}$.

Referring to Figure 16, in a four-stage IRNBC network, the total number of crosspoints is $(n + m) \times (m + n) \times 2 n^{3} + 20 n^{4} \times m = 8 n^{5} + 20 n^{5} = 28 n^{5}$, where $20 n^{4}$ is the number of crosspoints in a three-stage IRNBC network, as derived above. There are $N = n r = 2 n^{4}$ compute nodes. A single $N \times N$ crossbar requires $4 n^{8}$ crosspoints. The crosspoint ratio of a four-stage IRNBC network relative to a single crossbar is $28 n^{5} / (4 n^{8}) = 7 / n^{3}$.

Table 14 lists the crosspoints of IRNBC networks. The “Crossbar” column shows the number of crosspoints in a single crossbar. The “IRNBC” column shows the number of crosspoints in an IRNBC network. The IRNBC network requires fewer crosspoints than a single crossbar when $n \geq 4$ for the two-stage IRNBC network, $n \geq 3$ for the three-stage IRNBC network, and $n \geq 2$ for the four-stage IRNBC network.

Table 14. The number of crosspoints in IRNBC networks.
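The IRNBC totals and the thresholds on $n$ stated above can be reproduced with a short recursion. The sketch below is illustrative only; `irnbc_crosspoints` and `smallest_n_below_crossbar` are hypothetical helper names, and the recursion (leaf stage plus $m$ copies of the next smaller IRNBC network) is the one used in the derivations of this subsection.

```python
def irnbc_crosspoints(n: int, s: int) -> tuple:
    """Return (crosspoints, compute nodes) of an s-stage IRNBC network."""
    if n < 1 or s < 2:
        raise ValueError("need n >= 1 and s >= 2")
    m = n
    port = n + m                            # every switch is a 2n x 2n crossbar
    leaves = n + m                          # two-stage case: 2n leaf switches
    total = port * port * (leaves + m)      # 12 n^3
    for _ in range(s - 2):
        leaves *= n
        total = port * port * leaves + total * m
    return total, n * leaves


def smallest_n_below_crossbar(s: int, limit: int = 64):
    """Smallest n in 1..limit for which IRNBC beats a single N x N crossbar."""
    for n in range(1, limit + 1):
        total, nodes = irnbc_crosspoints(n, s)
        if total < nodes * nodes:
            return n
    return None


if __name__ == "__main__":
    for s in (2, 3, 4):
        print(s, irnbc_crosspoints(2, s), smallest_n_below_crossbar(s))
```

The thresholds returned for two, three, and four stages are 4, 3, and 2, in line with the conditions $n \geq 4$, $n \geq 3$, and $n \geq 2$ given in the text.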

The number of crosspoints for traditional rearrangeably nonblocking folded Clos networks with $m = n$ is examined below. In a two-stage traditional rearrangeably nonblocking folded Clos network, the total number of crosspoints is $(n + m) (m + n) \times n + n^{2} \times m = 4 n^{3} + n^{3} = 5 n^{3}$. There are $N = n^{2}$ compute nodes. A single $N \times N$ crossbar requires $N \times N = n^{4}$ crosspoints. The crosspoint ratio of a two-stage traditional rearrangeably nonblocking folded Clos network relative to a single crossbar is $5 n^{3} / n^{4} = 5 / n$.

In the three-stage case, the total number of crosspoints is $(n + m) (m + n) \times n^{2} + 5 n^{3} \times m = 4 n^{4} + 5 n^{4} = 9 n^{4}$, where $5 n^{3}$ is the number of crosspoints in a two-stage traditional rearrangeably nonblocking folded Clos network, as derived above. There are $N = n^{3}$ compute nodes. A single $N \times N$ crossbar requires $n^{6}$ crosspoints. The crosspoint ratio of a three-stage traditional rearrangeably nonblocking folded Clos network relative to a single crossbar is $9 n^{4} / n^{6} = 9 / n^{2}$.

In the four-stage case, the total number of crosspoints is $(n + m) (m + n) \times n^{3} + 9 n^{4} \times m = 4 n^{5} + 9 n^{5} = 13 n^{5}$, where $9 n^{4}$ is the number of crosspoints in a three-stage traditional rearrangeably nonblocking folded Clos network, as derived above. There are $N = n^{4}$ compute nodes. A single $N \times N$ crossbar requires $n^{8}$ crosspoints. The crosspoint ratio of a four-stage traditional rearrangeably nonblocking folded Clos network relative to a single crossbar is $13 n^{5} / n^{8} = 13 / n^{3}$.

Table 15 summarizes the crosspoint ratio relative to a single crossbar for traditional rearrangeably nonblocking folded Clos networks and IRNBC networks. The cost ratios for the proposed IRNBC networks relative to traditional networks are $60.00\%$, $55.56\%$, and $53.85\%$ for two-stage, three-stage, and four-stage networks, respectively. The general formula for calculating the IRNBC crosspoint ratio relative to a single crossbar is $(2 s - 1) / n^{s - 1}$, where $s$ is the number of stages and the number of compute nodes is $N = 2 n^{s}$.

Table 15. Crosspoint ratio for rearrangeably nonblocking folded Clos networks.
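The general IRNBC formula, and the cost of IRNBC relative to the traditional folded network, can be written in closed form by unrolling the recursions used in this subsection. The derivation below is a sketch; $C_s$ and $T_s$ (crosspoints of the $s$-stage IRNBC and traditional networks) are symbols introduced here, and the closed form for $T_s$ is extrapolated from the same recursion pattern rather than stated in the paper.

```latex
\begin{align*}
C_2 &= 12n^{3}, & C_s &= (2n)^{2}\cdot 2n^{s-1} + n\,C_{s-1} = 8n^{s+1} + n\,C_{s-1}
  & &\Longrightarrow\quad C_s = 4\,(2s-1)\,n^{s+1}, \\
T_2 &= 5n^{3}, & T_s &= (2n)^{2}\cdot n^{s-1} + n\,T_{s-1} = 4n^{s+1} + n\,T_{s-1}
  & &\Longrightarrow\quad T_s = (4s-3)\,n^{s+1},
\end{align*}
\begin{align*}
\frac{C_s}{(2n^{s})^{2}} = \frac{2s-1}{n^{s-1}}, \qquad
\frac{T_s}{(n^{s})^{2}} = \frac{4s-3}{n^{s-1}}, \qquad
\frac{C_s / (2n^{s})^{2}}{T_s / (n^{s})^{2}} = \frac{2s-1}{4s-3} .
\end{align*}
```

For $s = 2, 3, 4$ the last quotient is $3/5$, $5/9$, and $7/13$, that is, $60.00\%$, $55.56\%$, and $53.85\%$.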

Table 16 lists the crosspoint ratios relative to a single crossbar for rearrangeably nonblocking folded Clos networks. The ratios are calculated based on the formulas in Table 15.

Table 16. Crosspoint ratio relative to a single crossbar in rearrangeably nonblocking folded Clos networks.

Figure 21 plots the crosspoint ratio relative to a single crossbar for rearrangeably nonblocking folded Clos networks, showing that IRNBC networks have a lower crosspoint cost than traditional rearrangeably nonblocking folded Clos networks.

Figure 21. Crosspoint ratio relative to a single crossbar in rearrangeably nonblocking folded Clos networks.

From the above discussion, the crosspoint ratios of ISNBC and IRNBC networks are lower than those of their corresponding traditional folded Clos networks. The crosspoint ratio of IRNBC networks is also lower than that of ISNBC networks. This is expected: IRNBC networks are rearrangeably nonblocking and may have to rearrange existing connections to set up a new one, whereas ISNBC networks are strictly nonblocking and never require such rearrangements.

The proposed ISNBC and IRNBC networks are summarized in Table 17, where the “Node” column shows the number of compute nodes, the “Crossbar” column shows the number of crosspoints in a single crossbar, the “Crosspoint” column shows the number of crosspoints in the proposed network, and the “Switch” column shows the number of switches in the proposed network. Both networks use square crossbar switches.

Table 17. Summary of the proposed ISNBC and IRNBC networks, where n is the number of compute nodes connected to a leaf switch and s is the number of stages.

Figure 22 compares the crosspoint ratios of ISNBC and IRNBC networks for different numbers of compute nodes. Based on this figure, the number of stages can be selected for a given number of compute nodes such that the system has a low crosspoint ratio.

Figure 22. Crosspoint ratios versus the numbers of compute nodes in the proposed networks.

A limitation of the proposed identical nonblocking folded Clos networks is that ISNBC networks must use $3 n \times 3 n$ square crossbar switches and IRNBC networks must use $2 n \times 2 n$ square crossbar switches, where $n$ is the number of compute nodes connected to a leaf switch. Therefore, the proposed networks cannot use square crossbar switches of arbitrary size $p \times p$, $p \in \mathbb{N}$, without leaving some switch ports unused. For example, with practical $16 \times 16$ crossbar switches, an IRNBC network can be constructed with no unused switch ports ($16 = 2 n$ with $n = 8$). An ISNBC network built from $16 \times 16$ crossbar switches leaves one port on each switch unused ($16 = 3 n + 1$ with $n = 5$), whereas an ISNBC network built from $15 \times 15$ crossbar switches has no unused switch ports.
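The port-fit arithmetic in this limitation is easy to tabulate. The following Python sketch uses a hypothetical helper `port_fit`; it assumes only what the text states, namely that an ISNBC switch uses $3n$ ports and an IRNBC switch uses $2n$ ports.

```python
def port_fit(ports: int, ports_per_n: int) -> tuple:
    """Return (largest usable n, unused ports per switch) for a square
    crossbar with `ports` inputs and outputs.

    ports_per_n is 3 for ISNBC (3n x 3n) and 2 for IRNBC (2n x 2n).
    """
    if ports < ports_per_n:
        raise ValueError("crossbar too small for n >= 1")
    n = ports // ports_per_n
    return n, ports - ports_per_n * n


if __name__ == "__main__":
    for ports in (15, 16):
        print(ports, "ISNBC", port_fit(ports, 3), "IRNBC", port_fit(ports, 2))
```

For $16 \times 16$ switches this gives $n = 8$ with no unused ports for IRNBC and $n = 5$ with one unused port for ISNBC, as in the example above.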

5. Conclusions

Nowadays, available switches are square crossbars with the same number of input and output ports. Traditional unidirectional strictly nonblocking Clos networks use switches with different numbers of input and output ports and route packets through a fixed number of stages. Fat trees, the folded version of Clos networks, use differently sized switches at the root and other stages. Recently proposed Clos networks have only two stages and introduce unused ports on the switches.

To address these issues, this paper proposed two new folded Clos variants: an identical strictly nonblocking Clos (ISNBC) network and an identical rearrangeably nonblocking Clos (IRNBC) network. These designs use equally sized square crossbar switches across all stages, eliminate unused switch ports, increase system scalability by accommodating any number of stages, and reduce communication path lengths by supporting shortcut connections. Both ISNBC and IRNBC networks have lower switch crosspoint costs compared to their traditional counterparts. Specifically, ISNBC networks use 46.43% to 87.71% of the crosspoints of traditional strictly nonblocking folded Clos networks, and IRNBC networks use 53.85% to 60.00% of the crosspoints of traditional rearrangeably nonblocking folded Clos networks.

The limitation is that ISNBC networks require the use of $3 n \times 3 n$ square crossbar switches, and IRNBC networks require the use of $2 n \times 2 n$ square crossbar switches, where $n$ is the number of compute nodes connected to a leaf switch.

Future work should develop load-balancing adaptive routing algorithms and fault-tolerant routing algorithms for the proposed identical strictly and rearrangeably nonblocking folded Clos networks and evaluate performance through simulations.

Funding

This research received no external funding.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The author declares no conflicts of interest.

Appendix A. K-Ary N-Fly Butterfly and K-Ary N-Tree Clos Networks

This appendix describes k-ary n-fly butterfly and k-ary n-tree Clos networks. Figure A1 shows a k-ary n-fly butterfly network with

k = 3

and

n = 3

. It has

k^{n} = 3^{3} = 27

compute nodes. By convention, source nodes are drawn on the left and destination nodes on the right; physically, the two nodes in the same row are the same node. The links that connect the ports of switches are unidirectional.

Figure A1. A 3-ary 3-fly butterfly network (unidirectional links).

Generally, a switch in a k-ary n-fly butterfly network is labeled as

⟨ s, d_{n - 2}, \dots, d_{0} ⟩

, where s is the stage with

s \in \{0, \dots, n - 1\}

and

d_{n - 2}, \dots, d_{0}

identifies the switch within stage s with

d_{i} \in \{0, \dots, k - 1\}

for

0 \leq i \leq n - 2

. In stage s for

0 \leq s \leq n - 2

, a switch

⟨ s, d_{n - 2}, \dots, d_{s + 1}, d_{s}, d_{s - 1}, \dots, d_{0} ⟩

connects to switches

⟨ s + 1, d_{n - 2}, \dots, d_{s + 1}, *, d_{s - 1}, \dots, d_{0} ⟩

, where

* \in \{0, \dots, k - 1\}

. For example, in Figure A1, switch

⟨ 100 ⟩

in the 3-ary 3-fly butterfly network connects to switches

⟨ 200 ⟩

,

⟨ 210 ⟩

, and

⟨ 220 ⟩

.
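
The connection rule above can be sketched in a few lines. The function name `butterfly_successors` and the tuple encoding of a switch label as (s, d_{n-2}, ..., d_0) are illustrative assumptions, not notation from the paper; the rule itself (replace digit d_s by every value in {0, ..., k − 1}) follows the text.

```python
def butterfly_successors(k: int, n: int, switch: tuple[int, ...]) -> list[tuple[int, ...]]:
    """Switches in stage s + 1 reached from `switch` in a k-ary n-fly butterfly.

    `switch` is (s, d_{n-2}, ..., d_0), matching the label used in the text.
    The rule replaces digit d_s with every value in {0, ..., k - 1}.
    """
    s, digits = switch[0], list(switch[1:])
    if len(digits) != n - 1 or not 0 <= s <= n - 2:
        raise ValueError("expected a label (s, d_{n-2}, ..., d_0) with 0 <= s <= n - 2")
    position = (n - 2) - s  # index of d_s when digits are stored as d_{n-2}, ..., d_0
    successors = []
    for value in range(k):
        next_digits = digits.copy()
        next_digits[position] = value
        successors.append((s + 1, *next_digits))
    return successors


if __name__ == "__main__":
    # Reproduces the example from Figure A1: switch <100> in the 3-ary 3-fly network.
    print(butterfly_successors(3, 3, (1, 0, 0)))
```

For switch ⟨100⟩ the sketch returns the labels ⟨200⟩, ⟨210⟩, and ⟨220⟩, in agreement with the example.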

A k-ary n-tree Clos network can be created by combining two k-ary n-fly butterfly networks back to back, where the two back stages are fused [9]. There are

2 n - 1

stages. The

n - 1

stages to the left of the middle stage form the input network, and the

n - 1

stages to the right of the middle stage form the output network. A k-ary n-tree Clos network is rearrangeably nonblocking. It solves the lack of path diversity found in butterfly networks.

The input network can route from any source compute node to any middle-stage switch. The output network can route from any middle-stage switch to any destination compute node. As in a k-ary n-fly butterfly network, the links in a k-ary n-tree Clos network are unidirectional. Figure A2 shows a k-ary n-tree Clos network with

k = 3

and

n = 3

. It has

2 n - 1 = 5

stages and

k^{n} = 27

compute nodes.

Figure A2. A 3-ary 3-tree Clos network (unidirectional links).

Appendix B. Deriving the Number of Switches in ISNBC and IRNBC Networks

This appendix presents the derivation of the general formulas for calculating the number of switches in ISNBC and IRNBC networks.

Theorem A1.

The number of switches in an ISNBC network is

T_{s} = (2^{s + 1} - 3) n^{s - 1}

, where n is the number of compute nodes connected to a leaf switch and s is the number of stages.

Proof.

For

s = 1

(one switch stage), there is one switch connecting n compute nodes. Then,

T_{1} = 1

For

s = 2

(two switch stages), there are r switches in the leaf stage and m switches in the root stage, with

m = 2 n

and

r = n + m = 3 n

. Then,

T_{2} = r + m = 3 n + 2 n

For

s = 3

(three switch stages), there are

r = 3 n^{2}

switches in the leaf stage and

m = 2 n

building blocks in the root stage, and each building block has

T_{2}

switches. Then,

T_{3} = r + m T_{2} = 3 n^{2} + 2 n T_{2}

For

s = 4

(four switch stages), there are

r = 3 n^{3}

switches in the leaf stage and

m = 2 n

building blocks in the root stage, and each building block has

T_{3}

switches. Then,

T_{4} = r + m T_{3} = 3 n^{3} + 2 n T_{3}

Generally, for s stages, there are

r = 3 n^{s - 1}

switches in the leaf stage and

m = 2 n

building blocks in the root stage, and each building block has

T_{s - 1}

switches. Then,

T_{s} = r + m T_{s - 1} = 3 n^{s - 1} + 2 n T_{s - 1}

In other words, we have a recursive equation:

\begin{cases} T_{1} = 1 \\ T_{s} = 3 n^{s - 1} + 2 n T_{s - 1}, & s \geq 2 \end{cases}

(A1)

Now, we derive the general formula for

T_{s}

from the recursive equation. Dividing the left and right sides of Equation (A1) by

n^{s - 1}

, we obtain

\frac{T_{s}}{n^{s - 1}} = 3 + 2 \times \frac{T_{s - 1}}{n^{s - 2}}

(A2)

Let

P_{s} = \frac{T_{s}}{n^{s - 1}}

; then, Equation (A2) becomes

P_{s} = 3 + 2 P_{s - 1}

(A3)

To eliminate the constant 3, we write Equation (A3) with s replaced by s + 1:

P_{s + 1} = 3 + 2 P_{s}

(A4)

Subtracting Equation (A3) from Equation (A4), we obtain

P_{s + 1} - P_{s} = 2 (P_{s} - P_{s - 1})

(A5)

Let

Q_{s} = P_{s + 1} - P_{s}

; then, Equation (A5) becomes

Q_{s} = 2 Q_{s - 1}

(A6)

Equation (A6) defines a geometric progression with common ratio 2. We now calculate its first term.

\begin{matrix} P_{s} = \frac{T_{s}}{n^{s - 1}} & \Rightarrow & P_{1} = \frac{T_{1}}{n^{1 - 1}} = T_{1} = 1 \\ P_{s} = \frac{T_{s}}{n^{s - 1}} & \Rightarrow & P_{2} = \frac{T_{2}}{n^{2 - 1}} = \frac{3 n + 2 n}{n} = 5 \\ Q_{s} = P_{s + 1} - P_{s} & \Rightarrow & Q_{1} = P_{2} - P_{1} = 5 - 1 = 4 \end{matrix}

It is well known that the general nth term of a geometric progression is given by

a_{n} = a_{1} \times r^{n - 1}

, where

a_{1}

is the first term, r is the common ratio, and n is the term number (these symbols are local to this formula and unrelated to the network parameters n and r). Then, in our case, the sth term of the geometric progression (A6) is given by

Q_{s} = 4 \times 2^{s - 1} = 2^{s + 1}

(A7)

Considering

Q_{s} = P_{s + 1} - P_{s} = 2^{s + 1}

, we have

\begin{matrix} P_{s + 1} - P_{s} & = 2^{s + 1} \\ P_{s} - P_{s - 1} & = 2^{s} \\ \dots & \dots \\ P_{3} - P_{2} & = 2^{3} \\ P_{2} - P_{1} & = 2^{2} \end{matrix}

Summing the left-hand sides of the above equations gives a telescoping sum:

\sum_{i = 1}^{s} (P_{i + 1} - P_{i}) = P_{s + 1} - P_{1} = P_{s + 1} - 1

and summing the right-hand sides gives

\sum_{i = 1}^{s} 2^{i + 1} = 2^{s + 2} - 4
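
The right-hand sum is a finite geometric series with first term 4 and common ratio 2. As a brief check, factoring out 4 and applying the standard closed form of a geometric series gives

\sum_{i = 1}^{s} 2^{i + 1} = 4 \sum_{i = 0}^{s - 1} 2^{i} = 4 (2^{s} - 1) = 2^{s + 2} - 4

For example, s = 2 gives 2^{2} + 2^{3} = 12 = 2^{4} - 4.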

Then, we have

P_{s + 1} - 1 = 2^{s + 2} - 4

(left sum = right sum) or

P_{s + 1} = 2^{s + 2} - 3

, that is,

P_{s} = 2^{s + 1} - 3

(A8)

Because

P_{s} = \frac{T_{s}}{n^{s - 1}}

, we have

T_{s} = P_{s} n^{s - 1} = (2^{s + 1} - 3) n^{s - 1}

. □
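
As a numerical sanity check on Theorem A1, the recurrence (A1) and the closed form can be compared directly. This is a minimal sketch; the function names are illustrative and not part of the paper.

```python
def isnbc_switches_recursive(n: int, s: int) -> int:
    """Number of ISNBC switches from the recurrence T_1 = 1, T_s = 3n^(s-1) + 2n*T_(s-1)."""
    if s == 1:
        return 1
    return 3 * n ** (s - 1) + 2 * n * isnbc_switches_recursive(n, s - 1)


def isnbc_switches_closed(n: int, s: int) -> int:
    """Closed form from Theorem A1: T_s = (2^(s+1) - 3) * n^(s-1)."""
    return (2 ** (s + 1) - 3) * n ** (s - 1)


if __name__ == "__main__":
    for n in range(2, 11):
        for s in range(1, 8):
            assert isnbc_switches_recursive(n, s) == isnbc_switches_closed(n, s)
    # Row n = 2 of Table 4 (2-, 3-, and 4-stage networks).
    print([isnbc_switches_closed(2, s) for s in (2, 3, 4)])
```

For n = 2 the sketch yields 10, 52, and 232 switches, matching the first row of Table 4.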

Theorem A2.

The number of switches in an IRNBC network is

T_{s} = (2 s - 1) n^{s - 1}

, where n is the number of compute nodes connected to a leaf switch and s is the number of stages.

Proof.

For

s = 1

(one switch stage), there is one switch connecting n compute nodes. Then,

T_{1} = 1

For

s = 2

(two switch stages), there are r switches in the leaf stage and m switches in the root stage, with

m = n

and

r = n + m = 2 n

. Then,

T_{2} = r + m = 2 n + n

For

s = 3

(three switch stages), there are

r = 2 n^{2}

switches in the leaf stage and

m = n

building blocks in the root stage, and each building block has

T_{2}

switches. Then,

T_{3} = r + m T_{2} = 2 n^{2} + n T_{2}

For

s = 4

(four switch stages), there are

r = 2 n^{3}

switches in the leaf stage and

m = n

building blocks in the root stage, and each building block has

T_{3}

switches. Then,

T_{4} = r + m T_{3} = 2 n^{3} + n T_{3}

Generally, for s stages, there are

r = 2 n^{s - 1}

switches in the leaf stage and

m = n

building blocks in the root stage, and each building block has

T_{s - 1}

switches. Then,

T_{s} = r + m T_{s - 1} = 2 n^{s - 1} + n T_{s - 1}

In other words, we have a recursive equation:

\begin{cases} T_{1} = 1 \\ T_{s} = 2 n^{s - 1} + n T_{s - 1}, & s \geq 2 \end{cases}

(A9)

Now, we derive the general formula for

T_{s}

from the recursive equation. Dividing the left and right sides of Equation (A9) by

n^{s - 1}

, we obtain

\frac{T_{s}}{n^{s - 1}} = 2 + \frac{T_{s - 1}}{n^{s - 2}}

(A10)

Let

P_{s} = \frac{T_{s}}{n^{s - 1}}

; then, Equation (A10) becomes

P_{s} = 2 + P_{s - 1}

(A11)

Equation (A11) defines an arithmetic progression with common difference 2 and first term 1:

P_{s} = \frac{T_{s}}{n^{s - 1}} \Rightarrow P_{1} = \frac{T_{1}}{n^{1 - 1}} = T_{1} = 1

It is well known that the general nth term of an arithmetic progression is given by

a_{n} = a_{1} + d (n - 1)

, where

a_{1}

is the first term, d is the common difference, and n is the term number. Then, in our case, the sth term of the arithmetic progression (A11) is given by

P_{s} = 1 + 2 (s - 1) = 2 s - 1

(A12)

Because

P_{s} = \frac{T_{s}}{n^{s - 1}}

, we have

T_{s} = P_{s} n^{s - 1} = (2 s - 1) n^{s - 1}

. □
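
Theorem A2 can be checked in the same spirit by unrolling the recurrence (A9) and comparing it with the closed form and with a few rows of Table 7. The function names and the `TABLE_7_ROWS` constant are illustrative; the row values are copied from Table 7.

```python
def irnbc_switches_iterative(n: int, s: int) -> int:
    """Unroll the recurrence T_1 = 1, T_j = 2n^(j-1) + n*T_(j-1) up to j = s."""
    total = 1
    for j in range(2, s + 1):
        total = 2 * n ** (j - 1) + n * total
    return total


def irnbc_switches_closed(n: int, s: int) -> int:
    """Closed form from Theorem A2: T_s = (2s - 1) * n^(s-1)."""
    return (2 * s - 1) * n ** (s - 1)


# A few (n, 2-stage, 3-stage, 4-stage) rows copied from Table 7.
TABLE_7_ROWS = [(2, 6, 20, 56), (7, 21, 245, 2401), (15, 45, 1125, 23625)]

if __name__ == "__main__":
    for n, *expected in TABLE_7_ROWS:
        computed = [irnbc_switches_iterative(n, s) for s in (2, 3, 4)]
        assert computed == expected == [irnbc_switches_closed(n, s) for s in (2, 3, 4)]
        print(n, computed)
```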

References

1. Clos, C. A study of non-blocking switching networks. Bell Syst. Tech. J. 1953, 32, 406–424.
2. Leiserson, C.E. Fat-trees: Universal networks for hardware-efficient supercomputing. IEEE Trans. Comput. 1985, C-34, 892–901.
3. Singh, A.; Ong, J.; Agarwal, A.; Anderson, G.; Armistead, A.; Bannon, R.; Boving, S.; Desai, G.; Felderman, B.; Germano, P.; et al. Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network. ACM SIGCOMM Comput. Commun. Rev. 2015, 45, 183–197.
4. Liao, X.K.; Pang, Z.B.; Wang, K.F.; Lu, Y.T.; Xie, M.; Xia, J.; Dong, D.Z.; Suo, G. High Performance Interconnect Network for Tianhe System. J. Comput. Sci. Technol. 2015, 30, 259–272.
5. Fu, H.; Liao, J.; Yang, J.; Wang, L.; Song, Z.; Huang, X.; Yang, C.; Xue, W.; Liu, F.; Qiao, F.; et al. The Sunway TaihuLight supercomputer: System and applications. Sci. China Inf. Sci. 2016, 59, 1869–1919.
6. Stanzione, D.; West, J.; Evans, R.T.; Minyard, T.; Ghattas, O.; Panda, D.K. Frontera: The Evolution of Leadership Computing at the National Science Foundation. In Proceedings of the Practice and Experience in Advanced Research Computing 2020: Catch the Wave, New York, NY, USA, 27–31 July 2020; PEARC ’20. pp. 106–111.
7. Stunkel, C.B.; Graham, R.L.; Shainer, G.; Kagan, M.; Sharkawi, S.S.; Rosenburg, B.; Chochia, G.A. The high-speed networks of the Summit and Sierra supercomputers. IBM J. Res. Dev. 2020, 64, 3:1–3:10.
8. Zhao, C.; Deng, C.; Ruan, C.; Dai, D.; Gao, H.; Li, J.; Zhang, L.; Huang, P.; Zhou, S.; Ma, S.; et al. Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures. In Proceedings of the 52nd Annual International Symposium on Computer Architecture, New York, NY, USA, 21–25 June 2025; pp. 1731–1745.
9. Abts, D.; Kim, J. High Performance Datacenter Networks: Architectures, Algorithms, and Opportunities; Morgan and Claypool: San Rafael, CA, USA, 2011.
10. Petrini, F.; Vanneschi, M. k-ary n-trees: High performance networks for massively parallel architectures. In Proceedings of the 11th International Parallel Processing Symposium, Geneva, Switzerland, 1–5 April 1997; pp. 87–93.
11. Dally, W.J.; Towles, B.P. Principles and Practices of Interconnection Networks; The Morgan Kaufmann Series in Computer Architecture and Design; Elsevier: Amsterdam, The Netherlands, 2004; Available online: https://books.google.co.jp/books?id=oOqpcB5191sC (accessed on 1 July 2025).
12. Beneš, V.E. Permutation groups, complexes, and rearrangeable connecting networks. Bell Syst. Tech. J. 1964, 43, 1619–1640.
13. Al-Fares, M.; Loukissas, A.; Vahdat, A. A scalable, commodity data center network architecture. ACM SIGCOMM Comput. Commun. Rev. 2008, 38, 63–74.
14. Li, Y.; Chu, W. MiKANT: A Mirrored K-Ary N-Tree for Reducing Hardware Cost and Packet Latency of Fat-Tree and Clos Networks. In Proceedings of the 18th IEEE International Conference on Scalable Computing and Communications, Washington, DC, USA, 8–12 October 2018; pp. 1643–1650.
15. Li, Y.; Chu, W. Fault Tolerance and Packet Latency of Peer Fat-Trees. In Proceedings of the Parallel and Distributed Computing, Applications and Technologies, Sendai, Japan, 7–9 December 2022; pp. 413–425.
16. Mano, T.; Inoue, T.; Mizutani, K.; Akashi, O. Redesigning the Nonblocking Clos Network to Increase Its Capacity. IEEE Trans. Netw. Serv. Manag. 2023, 20, 2558–2574.
17. Taka, H.; Inoue, T.; Oki, E. Twisted and Folded Clos-Network Design Model with Two-Step Blocking Probability Guarantee. IEEE Netw. Lett. 2024, 6, 60–64.
18. Taka, H.; Inoue, T.; Oki, E. Design model of a twisted and folded Clos network with multi-step grouped intermediate switches guaranteeing admissible blocking probability. J. Opt. Commun. Netw. 2024, 16, 328–341.
19. Oki, E.; Taniguchi, R.; Anazawa, K.; Inoue, T. Design of Multiple-Plane Twisted and Folded Clos Network Guaranteeing Admissible Blocking Probability. IEEE Trans. Netw. Serv. Manag. 2025, 22, 2278–2294.

Figure 1. Proposed strictly nonblocking Clos networks (

m = 2 n

and

r = n + m

). (a) A 3-stage USNBC network with

n = 2

,

m = 2 n = 4

, and

r = n + m = 6

. (b) A 2-stage ISNBC network composed of equally sized

(n + m) \times (m + n) = 6 \times 6

square crossbars.

Figure 2. Merging and expanding an

n \times m

crossbar switch and an

m \times n

crossbar switch to a big square

(n + m) \times (m + n)

crossbar switch for

n = 2

and

m = 2 n = 4

. (a) An

n \times m

crossbar switch and an

m \times n

crossbar switch, each with 8 crosspoints. (b) A square

(n + m) \times (m + n)

crossbar switch with 36 crosspoints. The red line shows the path of

x_{0} \to y_{1}

. (c) Crosspoint states and implementation using two 2-to-1 multiplexers.

Figure 3. A shortcut in an ISNBC network. When a packet is sent from a source (node 1) to a destination (node 2), it does not need to go through the root switch.

Figure 4. A 3-stage USNBC network with

n = 3

,

m = 2 n = 6

, and

r = n + m = 9

.

Figure 5. A 2-stage ISNBC network with

n = 3

,

m = 2 n = 6

, and

r = n + m = 9

composed of equally sized

(n + m) \times (m + n) = 9 \times 9

square crossbar switches.

Figure 6. A five-stage USNBC network with

n = 2

,

m = 2 n = 4

, and

r = 3 n^{2} = 12

. There are

m = 2 n = 4

building blocks (3-stage USNBC, Figure 1a) in the middle stage.

Figure 7. A three-stage ISNBC network with

n = 2

composed of equally sized

(n + m) \times (m + n) = 6 \times 6

square crossbars (folded version of Figure 6).

Figure 8. A 7-stage USNBC network with

n = 2

,

m = 2 n = 4

, and

r = 3 n^{3} = 24

. There are

m = 2 n = 4

building blocks (5-stage USNBC, Figure 6) in the middle stage.

Figure 9. A four-stage ISNBC network with

n = 2

composed of equally sized

(n + m) \times (m + n) = 6 \times 6

square crossbars (folded version of Figure 8).

Figure 10. Proposed rearrangeably nonblocking Clos networks (

m = n

and

r = n + m

). (a) A 3-stage URNBC network with

n = 2

,

m = n = 2

, and

r = n + m = 4

. (b) A 2-stage IRNBC network composed of equally sized

(n + m) \times (m + n) = 4 \times 4

square crossbars.

Figure 11. Blocking and rearrangements in a rearrangeably nonblocking Clos network (

m = n = 2

and

r = n + m = 4

). (a) Two connections (

2 \to 5

and

7 \to 3

) were built. The

1 \to 4

connection cannot be built. (b) The case of the folded version of (a). Note that a bidirectional link consists of two oppositely oriented unidirectional links. (c) Two connections (

2 \to 5

and

7 \to 3

) were built. The

1 \to 4

connection can also be built by the rearrangements of (a). (d) The case of the folded version of (c). (e) Two connections (

2 \to 5

and

7 \to 3

) were built. The

1 \to 4

connection can also be built by the rearrangements of (a). (f) The case of the folded version of (e).

Figure 12. Proposed rearrangeably nonblocking Clos networks (

m = n

and

r = n + m

). (a) A 3-stage URNBC network with

n = 3

,

m = n = 3

, and

r = n + m = 6

. (b) A 2-stage IRNBC network composed of equally sized

2 n \times 2 n = 6 \times 6

square crossbars.

Figure 13. A 5-stage URNBC network with

n = 2

,

m = n = 2

, and

r = 2 n^{2} = 8

. There are

m = n = 2

building blocks (3-stage URNBC, Figure 10a) in the middle stage.

Figure 14. A 3-stage IRNBC network with

n = 2

composed of equally sized

2 n \times 2 n = 4 \times 4

square crossbars (folded version of Figure 13).

Figure 15. A 7-stage URNBC network with

n = 2

,

m = n = 2

, and

r = 2 n^{3} = 16

. There are

m = n = 2

building blocks (5-stage URNBC, Figure 13) in the middle stage.

Figure 16. A 4-stage IRNBC network with

n = 2

composed of equally sized

2 n \times 2 n = 4 \times 4

square crossbars (folded version of Figure 15).

Figure 17. Crosspoint ratio relative to a single crossbar in unidirectional strictly nonblocking Clos networks.

Figure 18. Crosspoint ratio relative to a single crossbar in strictly nonblocking folded Clos networks.

Figure 19. Crosspoint ratio relative to a single crossbar in strictly nonblocking folded Clos networks.

Figure 20. Crosspoint ratio relative to a single crossbar in unidirectional rearrangeably nonblocking Clos networks.

Figure 21. Crosspoint ratio relative to a single crossbar in rearrangeably nonblocking folded Clos networks.

Figure 22. Crosspoint ratios versus the numbers of compute nodes in the proposed networks.

Table 1. Proposed nonblocking Clos networks.

Strictly or Rearrangeably	Network	Stage, Unfolded or Folded
Strictly nonblocking	USNBC	3-stage, 5-stage, and 7-stage Clos networks
Strictly nonblocking	ISNBC	2-stage, 3-stage, and 4-stage folded Clos networks
Rearrangeably nonblocking	URNBC	3-stage, 5-stage, and 7-stage Clos networks
Rearrangeably nonblocking	IRNBC	2-stage, 3-stage, and 4-stage folded Clos networks

Table 2. Comparison of three-stage unidirectional strictly nonblocking Clos networks.

	m	Ingress Switch	Middle Switch	Egress Switch	N
Traditional	$2 n - 1$	$[n \times (2 n - 1)] \times n$	$[n \times n] \times (2 n - 1)$	$[(2 n - 1) \times n] \times n$	$n^{2}$
USNBC	$2 n$	$[n \times 2 n] \times 3 n$	$[3 n \times 3 n] \times 2 n$	$[2 n \times n] \times 3 n$	$3 n^{2}$

Table 3. Comparison of two-stage strictly nonblocking folded Clos networks.

	m	Leaf Switch	Root Switch	N
Traditional	$2 n - 1$	$[(3 n - 1) \times (3 n - 1)] \times n$	$[n \times n] \times (2 n - 1)$	$n^{2}$
ISNBC	$2 n$	$[3 n \times 3 n] \times 3 n$	$[3 n \times 3 n] \times 2 n$	$3 n^{2}$

Table 4. The numbers of compute nodes and switches in ISNBC networks.

	Number of Compute Nodes			Number of Switches
n	2-Stage	3-Stage	4-Stage	2-Stage	3-Stage	4-Stage	Crossbar
2	12	24	48	10	52	232	$6 \times 6$
3	27	81	243	15	117	783	$9 \times 9$
4	48	192	768	20	208	1856	$12 \times 12$
5	75	375	1875	25	325	3625	$15 \times 15$
6	108	648	3888	30	468	6264	$18 \times 18$
7	147	1029	7203	35	637	9947	$21 \times 21$
8	192	1536	12,288	40	832	14,848	$24 \times 24$
9	243	2187	19,683	45	1053	21,141	$27 \times 27$
10	300	3000	30,000	50	1300	29,000	$30 \times 30$

Table 5. Comparison of 3-stage unidirectional rearrangeably nonblocking Clos networks.

	m	Ingress Switch	Middle Switch	Egress Switch	N
Traditional	n	$[n \times n] \times n$	$[n \times n] \times n$	$[n \times n] \times n$	$n^{2}$
URNBC	n	$[n \times n] \times 2 n$	$[2 n \times 2 n] \times n$	$[n \times n] \times 2 n$	$2 n^{2}$

Table 6. Comparison of 2-stage rearrangeably nonblocking folded Clos networks.

	m	Leaf Switch	Root Switch	N
Traditional	n	$[2 n \times 2 n] \times n$	$[n \times n] \times n$	$n^{2}$
IRNBC	n	$[2 n \times 2 n] \times 2 n$	$[2 n \times 2 n] \times n$	$2 n^{2}$

Table 7. The numbers of compute nodes and switches in the IRNBC networks.

	Number of Compute Nodes			Number of Switches
n	2-Stage	3-Stage	4-Stage	2-Stage	3-Stage	4-Stage	Crossbar
2	8	16	32	6	20	56	$4 \times 4$
3	18	54	162	9	45	189	$6 \times 6$
4	32	128	512	12	80	448	$8 \times 8$
5	50	250	1250	15	125	875	$10 \times 10$
6	72	432	2592	18	180	1512	$12 \times 12$
7	98	686	4802	21	245	2401	$14 \times 14$
8	128	1024	8192	24	320	3584	$16 \times 16$
9	162	1458	13,122	27	405	5103	$18 \times 18$
10	200	2000	20,000	30	500	7000	$20 \times 20$
11	242	2662	29,282	33	605	9317	$22 \times 22$
12	288	3456	41,472	36	720	12,096	$24 \times 24$
13	338	4394	57,122	39	845	15,379	$26 \times 26$
14	392	5488	76,832	42	980	19,208	$28 \times 28$
15	450	6750	101,250	45	1125	23,625	$30 \times 30$

Table 8. Nonblocking Clos networks for cost evaluations.

Strictly or Rearrangeably	Unidirectional or Bidirectional	Network	Stage, Unfolded or Folded
Strictly nonblocking	Unidirectional	USNBC	3-, 5-, and 7-stage Clos networks
Strictly nonblocking	Unidirectional	Traditional	3-, 5-, and 7-stage Clos networks
Strictly nonblocking	Bidirectional	ISNBC	2-, 3-, and 4-stage folded Clos networks
Strictly nonblocking	Bidirectional	Traditional	2-, 3-, and 4-stage folded Clos networks
Rearrangeably nonblocking	Unidirectional	URNBC	3-, 5-, and 7-stage Clos networks
Rearrangeably nonblocking	Unidirectional	Traditional	3-, 5-, and 7-stage Clos networks
Rearrangeably nonblocking	Bidirectional	IRNBC	2-, 3-, and 4-stage folded Clos networks
Rearrangeably nonblocking	Bidirectional	Traditional	2-, 3-, and 4-stage folded Clos networks

Table 9. Crosspoint ratio for unidirectional strictly nonblocking Clos networks.

Network	3-Stage	5-Stage	7-Stage
Traditional	$3 (2 n - 1) / n^{2}$	$(8 n - 3) (2 n - 1) / n^{4}$	$(18 n^{2} - 14 n + 3) (2 n - 1) / n^{6}$
USNBC	$10 / (3 n)$	$24 / (3 n^{2})$	$52 / (3 n^{3})$

Table 10. The numbers of crosspoints in ISNBC networks.

	2-Stage			3-Stage			4-Stage
n	Node	Crossbar	ISNBC	Node	Crossbar	ISNBC	Node	Crossbar	ISNBC
2	12	144	360	24	576	1872	48	2304	8352
3	27	729	1215	81	6561	9477	243	59,049	63,423
4	48	2304	2880	192	36,864	29,952	768	589,824	267,264
5	75	5625	5625	375	140,625	73,125	1875	3,515,625	815,625
6	108	11,664	9720	648	419,904	151,632	3888	15,116,544	2,029,536
7	147	21,609	15,435	1029	1,058,841	280,917	7203	51,883,209	4,386,627
8	192	36,864	23,040	1536	2,359,296	479,232	12,288	150,994,944	8,552,448
9	243	59,049	32,805	2187	4,782,969	767,637	19,683	387,420,489	15,411,789
10	300	90,000	45,000	3000	9,000,000	1,170,000	30,000	900,000,000	26,100,000

Table 11. Crosspoint ratio for strictly nonblocking folded Clos networks.

Network	2-Stage	3-Stage	4-Stage
Traditional	$(11 n^{2} - 7 n + 1) / n^{3}$	$(31 n^{3} - 31 n^{2} + 10 n - 1) / n^{5}$	$(71 n^{4} - 99 n^{3} + 52 n^{2} - 12 n + 1) / n^{7}$
ISNBC	$5 / n$	$13 / n^{2}$	$29 / n^{3}$

Table 12. Crosspoint ratio relative to a single crossbar in strictly nonblocking folded Clos networks.

	ISNBC			Traditional
n	2-Stage	3-Stage	4-Stage	2-Stage	3-Stage	4-Stage
2	2.500000	3.250000	3.625000	3.875000	4.468750	4.132812
3	1.666667	1.444444	1.074074	2.925926	2.415638	1.605396
4	1.250000	0.812500	0.453125	2.328125	1.491211	0.770569
5	1.000000	0.520000	0.232000	1.928000	1.007680	0.425485
6	0.833333	0.361111	0.134259	1.643519	0.725180	0.258748
7	0.714286	0.265306	0.084548	1.431487	0.546379	0.168757
8	0.625000	0.203125	0.056641	1.267578	0.426239	0.116044
9	0.555556	0.160494	0.039781	1.137174	0.341699	0.083163
10	0.500000	0.130000	0.029000	1.031000	0.279990	0.061608
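
The entries of Table 12 can be regenerated from the formulas in Table 11. The sketch below does so and also computes the ISNBC-to-traditional cost ratio over the tabulated range (n = 2 to 10, two to four stages); over that range it appears to reproduce the 46.43% to 87.71% figures quoted in the conclusions. The function names are illustrative, and restricting the range to the tabulated n and s is an assumption about how those figures were obtained.

```python
def isnbc_ratio(n: int, s: int) -> float:
    """ISNBC crosspoint ratio from Table 11: 5/n, 13/n^2, 29/n^3 for s = 2, 3, 4."""
    numerator = {2: 5, 3: 13, 4: 29}[s]
    return numerator / n ** (s - 1)


def traditional_ratio(n: int, s: int) -> float:
    """Traditional strictly nonblocking folded Clos ratio from Table 11."""
    if s == 2:
        return (11 * n**2 - 7 * n + 1) / n**3
    if s == 3:
        return (31 * n**3 - 31 * n**2 + 10 * n - 1) / n**5
    if s == 4:
        return (71 * n**4 - 99 * n**3 + 52 * n**2 - 12 * n + 1) / n**7
    raise ValueError("Table 11 lists formulas only for s = 2, 3, 4")


def relative_cost_range(n_values=range(2, 11), stages=(2, 3, 4)) -> tuple[float, float]:
    """Smallest and largest ISNBC/traditional ratio over the tabulated n and s."""
    ratios = [isnbc_ratio(n, s) / traditional_ratio(n, s) for n in n_values for s in stages]
    return min(ratios), max(ratios)


if __name__ == "__main__":
    for n in range(2, 11):
        row = [f"{isnbc_ratio(n, s):.6f}" for s in (2, 3, 4)]
        row += [f"{traditional_ratio(n, s):.6f}" for s in (2, 3, 4)]
        print(n, *row)
    low, high = relative_cost_range()
    print(f"ISNBC/traditional: {low:.2%} to {high:.2%}")
```

In this range the smallest ratio occurs for the 3-stage network with n = 10 and the largest for the 4-stage network with n = 2.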

Table 13. Crosspoint ratio for unidirectional rearrangeably nonblocking Clos networks.

Network	3-Stage	5-Stage	7-Stage
Traditional	$3 / n$	$5 / n^{2}$	$7 / n^{3}$
URNBC	$2 / n$	$3 / n^{2}$	$4 / n^{3}$
URNBC/Traditional	$66.67\%$	$60.00\%$	$57.14\%$

Table 14. The number of crosspoints in IRNBC networks.

	2-Stage			3-Stage			4-Stage
n	Node	Crossbar	IRNBC	Node	Crossbar	IRNBC	Node	Crossbar	IRNBC
2	8	64	96	16	256	320	32	1024	896
3	18	324	324	54	2916	1620	162	26,244	6804
4	32	1024	768	128	16,384	5120	512	262,144	28,672
5	50	2500	1500	250	62,500	12,500	1250	1,562,500	87,500
6	72	5184	2592	432	186,624	25,920	2592	6,718,464	217,728
7	98	9604	4116	686	470,596	48,020	4802	23,059,204	470,596
8	128	16,384	6144	1024	1,048,576	81,920	8192	67,108,864	917,504
9	162	26,244	8748	1458	2,125,764	131,220	13,122	172,186,884	1,653,372
10	200	40,000	12,000	2000	4,000,000	200,000	20,000	400,000,000	2,800,000

Table 15. Crosspoint ratio for rearrangeably nonblocking folded Clos networks.

Network	2-Stage	3-Stage	4-Stage
Traditional	$5 / n$	$9 / n^{2}$	$13 / n^{3}$
IRNBC	$3 / n$	$5 / n^{2}$	$7 / n^{3}$
IRNBC/Traditional	$60.00\%$	$55.56\%$	$53.85\%$

Table 16. Crosspoint ratio relative to a single crossbar in rearrangeably nonblocking folded Clos networks.

	IRNBC			Traditional
n	2-Stage	3-Stage	4-Stage	2-Stage	3-Stage	4-Stage
2	1.500000	1.250000	0.875000	2.500000	2.250000	1.625000
3	1.000000	0.555556	0.259259	1.666667	1.000000	0.481481
4	0.750000	0.312500	0.109375	1.250000	0.562500	0.203125
5	0.600000	0.200000	0.056000	1.000000	0.360000	0.104000
6	0.500000	0.138889	0.032407	0.833333	0.250000	0.060185
7	0.428571	0.102041	0.020408	0.714286	0.183673	0.037901
8	0.375000	0.078125	0.013672	0.625000	0.140625	0.025391
9	0.333333	0.061728	0.009602	0.555556	0.111111	0.017833
10	0.300000	0.050000	0.007000	0.500000	0.090000	0.013000
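
For the rearrangeably nonblocking case, the ratios in Tables 15 and 16 follow from the numerators alone, because the factor n^{s-1} cancels when IRNBC is divided by the traditional network. A minimal sketch using exact fractions is given below; the constant and function names are illustrative, and the numerators are copied from Table 15.

```python
from fractions import Fraction

# Numerators of the crosspoint ratios in Table 15, keyed by the number of stages s;
# each ratio is numerator / n^(s-1).
IRNBC_NUMERATOR = {2: 3, 3: 5, 4: 7}
TRADITIONAL_NUMERATOR = {2: 5, 3: 9, 4: 13}


def crosspoint_ratio(numerators: dict[int, int], n: int, s: int) -> Fraction:
    """Crosspoint ratio relative to a single crossbar, as an exact fraction."""
    return Fraction(numerators[s], n ** (s - 1))


def irnbc_over_traditional(s: int) -> Fraction:
    """IRNBC/traditional ratio; n cancels, so it depends on s only."""
    return Fraction(IRNBC_NUMERATOR[s], TRADITIONAL_NUMERATOR[s])


if __name__ == "__main__":
    for s in (2, 3, 4):
        print(f"{s}-stage: {float(irnbc_over_traditional(s)):.2%}")
    # Row n = 3 of Table 16.
    print([f"{float(crosspoint_ratio(IRNBC_NUMERATOR, 3, s)):.6f}" for s in (2, 3, 4)])
```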

Table 17. Summary of the proposed ISNBC and IRNBC networks, where n is the number of compute nodes connected to a leaf switch and s is the number of stages.

Network	Node	Crossbar	Crosspoint	Switch	Switch Size
ISNBC	$3 n^{s}$	$9 n^{2 s}$	$9 (2^{s + 1} - 3) n^{s + 1}$	$(2^{s + 1} - 3) n^{s - 1}$	$3 n \times 3 n$
IRNBC	$2 n^{s}$	$4 n^{2 s}$	$4 (2 s - 1) n^{s + 1}$	$(2 s - 1) n^{s - 1}$	$2 n \times 2 n$
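
The formulas in Table 17 can be collected into a small calculator. The sketch below is illustrative only: the `NetworkSize` container and the `network_size` function are hypothetical names, and the total crosspoint count is computed as the number of switches times the crosspoints per square switch, which is algebraically the same as the expression in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkSize:
    nodes: int         # number of compute nodes
    crossbar: int      # crosspoints of a single crossbar connecting all nodes
    switches: int      # number of switches
    switch_ports: int  # each switch is switch_ports x switch_ports
    crosspoints: int   # total crosspoints over all switches


def network_size(network: str, n: int, s: int) -> NetworkSize:
    """Evaluate the Table 17 formulas for an s-stage ISNBC or IRNBC network."""
    if network == "ISNBC":
        ports, switches = 3 * n, (2 ** (s + 1) - 3) * n ** (s - 1)
    elif network == "IRNBC":
        ports, switches = 2 * n, (2 * s - 1) * n ** (s - 1)
    else:
        raise ValueError("network must be 'ISNBC' or 'IRNBC'")
    nodes = ports * n ** (s - 1)  # 3n^s or 2n^s
    return NetworkSize(nodes, nodes ** 2, switches, ports, switches * ports ** 2)


if __name__ == "__main__":
    print(network_size("ISNBC", 2, 2))
    print(network_size("IRNBC", 10, 4))
```

For the 2-stage ISNBC network with n = 2, this gives 12 nodes, 10 switches of size 6 × 6, and 360 crosspoints, in line with Tables 4 and 10.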


© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
