Article

LightR: A Fault-Tolerant Wavelength-Routed Optical Networks-on-Chip Topology †

Chair of Electronic Design Automation, Technical University of Munich, Arcisstr. 21, 80333 Munich, Germany
*
Author to whom correspondence should be addressed.
This paper is an extended version of our paper published in proceedings of the 2021 IEEE/ACM Asia and South Pacific Design Automation Conference (ASP-DAC), Tokyo, Japan, 18–21 January 2021.
Appl. Sci. 2023, 13(15), 8871; https://doi.org/10.3390/app13158871
Submission received: 20 May 2023 / Revised: 22 July 2023 / Accepted: 24 July 2023 / Published: 1 August 2023
(This article belongs to the Section Optics and Lasers)

Abstract

Optical networks-on-chip (NoCs) have emerged as a next-generation solution to overcome the limitations of electrical NoCs. In particular, wavelength-routed optical networks-on-chip (WRONoCs) are well known for their high bandwidth and ultra-low signal delay. Despite these advantages, WRONoCs are challenged by reliability concerns, because the main components in WRONoCs, i.e., microring resonators (MRRs), are susceptible to fabrication inaccuracies. When an MRR along a signal path is defective, the signal transmitted on that path will fail to reach its designated destination, which leads to transmission errors and data loss. In this work, we propose a fault-tolerant WRONoC topology, LightR, which provides two independent signal paths for each master–slave pair to tolerate defective MRRs. Moreover, we minimize the MRR usage to enhance the reliability of the WRONoCs. The experimental results show that LightR is able to provide a higher reliability with a modest MRR usage, insertion loss, and crosstalk noise. As the fault rate or the network size grows, the advantages of LightR in terms of the fault tolerance become even more significant. For example, when considering the 3% fault rate of MRRs and a 64-master × 64-slave network, LightR decreases the number of error signals by 85–90% compared to the typical state-of-the-art WRONoC topologies.

1. Introduction

Stimulated by recent breakthroughs in silicon photonics, optical networks-on-chip (ONoCs) have emerged as a next-generation solution to overcome the bandwidth and energy limitations of the electrical interconnects in multiprocessor system-on-chip (MPSoC) [1,2]. As the name suggests, ONoCs use optical signals to transmit data [2]. Taking advantage of the wavelength-division multiplexing (WDM) technology and the ultra-low propagation delay of light in silicon, ONoCs promise to meet the high bandwidth demands while maintaining low latency and power [3].
Current ONoC architectures can be classified into two categories: control-networks-based and wavelength-routed [3]. On control-networks-based ONoCs, before a sender (master) can transmit data to a receiver (slave), a signal path needs to be reserved through an additional control network [4,5]. On the other hand, wavelength-routed ONoCs (WRONoCs) fix collision-free signal paths between all master–slave pairs at the time of the design so that all masters can communicate to all slaves simultaneously [6,7,8,9,10]. Therefore, WRONoCs are free from the energy and latency overhead for arbitration and are gaining increasing research interest.
Typically, the WRONoC design is divided into two consecutive steps: a topological and a physical design. A WRONoC topology specifies the interconnection and configuration of the network components, and a physical tool implements the interconnection of the input topology on a layout plane [11]. Figure 1a shows a simple WRONoC topology, where one master sends signals to three slaves. The signals sent from the master are modulated on three different wavelengths, represented by the blue, red, and green arrows. The signals travel along the same waveguide until they are demultiplexed by different optical switching elements (OSEs). Figure 1b shows one typical structure of the 2-input × 2-output OSEs, called a crossing switching element (CSE). A 2 × 2 CSE consists of a pair of orthogonal waveguides and two microring resonators (MRRs) configured to be on-resonance with the wavelength $\lambda_i$. As shown in Figure 1c, when signals on $\lambda_i$ enter the CSE, they are coupled to the MRR and experience a 90° change in their propagation directions. On the other hand, when signals on wavelengths other than $\lambda_i$ enter the CSE, they pass through the CSE and keep their propagation directions, as shown in Figure 1d.
Due to the complexity of the manufacturing process, MRRs are susceptible to fabrication errors [12,13,14]. Defective MRRs can cause malfunctions and even data loss in WRONoCs, which lowers the fabrication yield. For example, if the MRR in OSE1 shown in Figure 2 is defective and fails to resonate with its designed wavelength $\lambda_i$, the signal on $\lambda_i$ will fail to reach Slave1, causing data loss. Therefore, enhancing the reliability of WRONoCs is of great importance.
However, current WRONoC topologies have rarely considered fault tolerance. For most WRONoC topologies, only one fixed signal path is reserved for each master–slave pair that requires communication. When an MRR along the designated signal path is defective, there is no available resource to re-arrange a new path for the signal. To the best of our knowledge, RobustONoC [15] was the only work that considered fault tolerance in the WRONoC topology design. For a given topology, RobustONoC modifies the topological structure by inserting backup MRRs and waveguides to enable the deviated signals to return to their planned paths. However, RobustONoC requires roughly twice the number of MRRs in the original topology as backups. More importantly, RobustONoC assumes that only one single fault will appear in each WRONoC topology, regardless of the network size, which is rather unrealistic. In fact, there can be multiple malfunctioning MRRs, especially for large networks consisting of many MRRs. Therefore, there remains a need for a more realistic fault model, as well as a scalable and fault-tolerant WRONoC topology.
We summarize the main contributions of this paper as follows:
  • We propose a new WRONoC fault model addressing the different fault rates of MRRs, thereby removing the impractical assumption in the state-of-the-art fault model that only one malfunctioning MRR will exist regardless of the size of the networks.
  • We propose the first fault-tolerant WRONoC topology, LightR, as an extension of the Light topology [16]. Compared to Light, which reserves only one signal path for each master–slave pair, LightR enhances reliability by providing two independent signal paths to each master–slave pair that requires communication.
  • We greatly reduce the MRR usage compared to the state-of-the-art fault-tolerant WRONoC design method.
We evaluate the performance of LightR by comparing it to three state-of-the-art topologies, $\lambda$-router [17], GWOR [18], and Light [16], and to the state-of-the-art fault-tolerant design method for topologies, RobustONoC [15]. The experimental results show that LightR provides higher reliability with a modest MRR usage. For example, considering the 3% fault rate for a 64-node network, LightR decreases the number of erroneous communications by 85–90% compared to the state-of-the-art topologies.

2. Background

2.1. Parallel Switching Elements

In ONoCs, OSEs have various structures. Aside from the CSE, shown in Figure 1b, another typical structure of OSEs is called the parallel switching element (PSE). In a PSE, an MRR is placed between a pair of parallel waveguides so that signals entering the PSE will experience a 180-degree direction change [16]. Figure 3 illustrates the working mechanism of a PSE. Compared to the CSE, where two MRRs are placed close to a pair of crossed waveguides, a PSE avoids the crossing loss and crosstalk noise generated by the waveguide crossing and requires only one MRR to route the signals among two inputs and two outputs. Considering these advantages, a PSE is considered as an appealing component to construct WRONoCs [2].

2.2. Performance Factors

In ONoCs, insertion loss and crosstalk noise are two important performance factors, which can decrease the signal-to-noise ratio (SNR) and cause power penalties [19].
Insertion loss is the power loss of signals. Typically, in a WRONoC topology, the insertion loss of a signal can be considered as the summation of three main losses [6,16]: the crossing loss that depends on the number of waveguide crossings that the signal passes; the drop loss when the signal is on-resonance with an MRR; the through loss when the signal passes through an off-resonance MRR. In particular, the worst-case insertion loss of a WRONoC topology is the maximum insertion loss of all signals, which determines the power consumption of the network.
Crosstalk noise refers to the noise signals generated at MRRs and waveguide crossings [19]. As shown in Figure 4, when a signal passes through a waveguide crossing or an off-resonance MRR, or when a signal is on-resonance with an MRR, a portion of the signal power will leak to other outputs and become noise. Noise generated by the original signals is denoted as the first-order noise and has the same wavelength as the original signals [20]. When a noise signal reaches a slave, it will decrease the SNR of the desired signals on the same wavelength. Specifically, the SNR of a signal on wavelength $\lambda_i$ is calculated as $10 \log \frac{P_{\mathrm{output}}^{\lambda_i}}{P_{\mathrm{noise}}^{\lambda_i}}$, where $P_{\mathrm{output}}^{\lambda_i}$ denotes the output power of the desired signal, and $P_{\mathrm{noise}}^{\lambda_i}$ denotes the power of the noise signals [20]. For the calculation of the SNR, we only consider the first-order noise, since the power of the noise generated by other noise signals is relatively small.
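As a minimal sketch of the two metrics defined in this section, the following Python snippet sums the three loss contributions along a signal path and evaluates the SNR expression. The per-element loss values are placeholders chosen only for illustration (they are not the parameters of Table 3), the function names are hypothetical, and the logarithm is assumed to be base 10 so that the result is in decibels.

```python
import math

# Placeholder per-element losses in dB (hypothetical, NOT the values of Table 3).
CROSSING_LOSS_DB = 0.04
DROP_LOSS_DB = 0.5
THROUGH_LOSS_DB = 0.005


def insertion_loss_db(num_crossings, num_drops, num_throughs):
    """Insertion loss of one signal path: crossing + drop + through losses."""
    return (num_crossings * CROSSING_LOSS_DB
            + num_drops * DROP_LOSS_DB
            + num_throughs * THROUGH_LOSS_DB)


def worst_case_insertion_loss_db(paths):
    """Maximum insertion loss over all signal paths of a topology.

    `paths` is an iterable of (num_crossings, num_drops, num_throughs) tuples.
    """
    return max(insertion_loss_db(*p) for p in paths)


def snr_db(p_output, p_noise):
    """SNR = 10 * log10(P_output / P_noise); both powers in the same linear unit.

    Only first-order noise is meant to be included in `p_noise`.
    """
    return 10.0 * math.log10(p_output / p_noise)
```

For instance, under these placeholder values a path with two crossings, one drop, and three off-resonance MRRs accumulates 2 × 0.04 + 0.5 + 3 × 0.005 dB of loss.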

2.3. MRR Faults and Signal Faults

An MRR fault can either be temporary or permanent [13]. Temporary faults are caused by environmental changes. For example, a change of 1 °C in temperature can shift the resonant wavelength of an MRR by 0.1 nm, which causes the MRR to resonate with a different wavelength than was intended [13,14]. Some researchers have worked on that problem and proposed some ONoC resilience techniques, such as trimming [21], to correct the faults. On the other hand, permanent faults are caused by fabrication errors. For example, some changes in the physical dimensions, e.g., the radius of the MRRs, the width of the waveguides, and the thickness of the wafer, can affect the resonant wavelengths of the MRRs [14,22]. These permanent faults cannot be corrected by those resilience techniques. Therefore, permanent faults, which can significantly lower the fabrication yield of WRONoCs, should be carefully considered in the design phase, not as an afterthought.
When an MRR is permanently faulty, its resonant wavelength deviates from its designated wavelength, which causes two types of signal faults: stuck-at-zero (s-a-0) and stuck-at-one (s-a-1). An s-a-0 signal fault occurs when a signal fails to be coupled to the MRR that is designed to be on-resonance with it. As shown in Figure 5a, the MRRs do not resonate with the designated wavelength, and thus the signals cannot be coupled to the MRRs and suffer the s-a-0 faults. On the other hand, a signal suffers an s-a-1 fault when it is coupled to an MRR that is not designated to be on-resonance with the signal. For example, the MRRs designed to resonate with $\lambda_i$ are now resonant with another wavelength $\lambda_j$, and the signals on $\lambda_j$ are coupled to the MRRs as shown in Figure 5b. When a signal suffers either an s-a-0 or s-a-1 fault, it deviates from its planned propagation direction and may fail to reach its designated destination. As a result, the data carried on the signals are lost, which raises the reliability concern of WRONoCs.
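The two signal-fault types can be illustrated with a small sketch. The function below is hypothetical and deliberately simple: wavelengths are represented by integer indices, `passing_wavelengths` is assumed to list the wavelengths of all signals that physically pass the faulty MRR, and `actual=None` stands for an MRR that no longer resonates with any wavelength used in the topology.

```python
def classify_signal_faults(designed, actual, passing_wavelengths):
    """Return (s_a_0, s_a_1) wavelength lists caused by one permanently faulty MRR.

    designed: wavelength index the MRR was designed to resonate with.
    actual:   wavelength index it resonates with after fabrication,
              or None if it matches no wavelength in use.
    passing_wavelengths: wavelengths of the signals that pass this MRR.
    """
    if actual == designed:
        return [], []  # the MRR is not faulty

    # The designed signal is no longer coupled to the MRR: stuck-at-zero.
    s_a_0 = [designed] if designed in passing_wavelengths else []

    # A signal the MRR was not meant to drop is now coupled: stuck-at-one.
    s_a_1 = [actual] if actual in passing_wavelengths else []
    return s_a_0, s_a_1
```

Calling `classify_signal_faults(1, 3, [1, 2, 3, 4])` models an MRR designed for $\lambda_1$ that has drifted to $\lambda_3$: the signal on $\lambda_1$ suffers an s-a-0 fault and the signal on $\lambda_3$ suffers an s-a-1 fault.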

2.4. State-of-the-Art WRONoC Topologies

For each master–slave pair that requires communication, state-of-the-art WRONoC topologies construct one fixed signal path [6,16,17,18,23]. In addition, most topologies, such as $\lambda$-router [17], Snake [23], and GWOR [18], use CSEs with two identical MRRs, where each MRR is designed to be on-resonance with one signal, which corresponds to one signal path.
Figure 6a shows the logic scheme of a $4 \times 4$ $\lambda$-router, which consists of six CSEs. For example, the MRRs of the top-left CSE are on-resonance with the signals from $m_1$ to $s_3$ and from $m_2$ to $s_4$, respectively.
Figure 6b shows the logic scheme of a $4 \times 4$ Hash [16], which uses PSEs instead of CSEs. In the Hash, the MRR of a PSE is configured to be on-resonance with two signals. For example, the signals from $m_1$ to $s_4$ and from $m_2$ to $s_3$, represented by red lines in Figure 6b, are coupled to the MRR of the top-left PSE.

3. LightR: An $N \times N$ Scalable and Fault-Tolerant WRONoC Topology

To enhance the reliability of WRONoCs, we propose a scalable and fault-tolerant topology, LightR, which uses the HashR as its basic building block. Inspired by the Hash, the HashR can be considered a $4 \times 4$ WRONoC topology that reserves two independent paths for each master–slave pair that requires communication. We denote each IP-core consisting of a master and a slave as a node and apply a common assumption that each node communicates with all other nodes except for itself [6,16,18,24]. In other words, the HashR considers 12 communications among four nodes. In Section 3.1, we introduce the logic scheme of the HashR. To support communications in any network size, we use the HashR to construct an $N \times N$ LightR by connecting the waveguides and configuring the wavelengths of the MRRs. We introduce the methods of waveguide connections and wavelength configuration in Section 3.2 and Section 3.3, respectively.

3.1. Logic Scheme of HashR

As shown in Figure 7, the HashR consists of eight PSEs configured on four wavelengths. For each communication from a master to a slave, the HashR reserves two independent signal paths, i.e., the signal paths are constructed with different routing resources. For example, Figure 7 shows two signal paths reserved for the communication from $m_1$ to $s_4$. Specifically, two signals on $\lambda_1$ and $\lambda_2$, represented by red and sky-blue lines, are coupled to the upper-left MRRs and reach $s_4$. When an MRR is defective along a signal path, the master can still communicate with the slave using the other signal path. For example, suppose the top-left MRR, highlighted by a black dashed square in Figure 8a, is defective and resonant with $\lambda_3$. The signals on $\lambda_1$ and $\lambda_3$ from $m_1$ then suffer the s-a-0 and the s-a-1 fault, respectively. Nevertheless, $m_1$ can still communicate with $s_2$ and $s_4$ using the signals on $\lambda_4$ and $\lambda_2$, respectively, as shown in Figure 8b.
The matrix in Figure 7 shows the wavelengths used by all communications. For example, $m_1$ communicates with $s_2$ using wavelengths $\lambda_3$ and $\lambda_4$. The signals from $m_1$ follow the waveguide connected to $s_3$ until they are coupled to the bottom-left MRRs, as shown in Figure 7. If a master is directly connected to a slave by a waveguide, the signals can use any wavelengths except for the resonant wavelengths of the MRRs along the path. For example, $m_1$ communicates with $s_3$ on wavelengths $\lambda_5$ and $\lambda_6$.
The HashR supports 24 signal paths for the 12 communications among four nodes using only eight MRRs, which is the least possible MRR usage. Specifically, among the 24 signal paths, the HashR directly connects two nodes with waveguides and supports eight signal paths that do not rely on MRRs, such as the paths reserved for the signals on $\lambda_5$ and $\lambda_6$ from $m_1$ to $s_3$; moreover, for the 16 signal paths that rely on MRRs, the HashR uses only eight MRRs to form eight PSEs, each of which routes two signals, and thus reduces the MRR usage by half compared to the CSEs with two identical MRRs.

3.2. Waveguide Connections

To construct an $N \times N$ LightR, we need $\lceil N/2 \rceil (\lceil N/2 \rceil - 1)/2$ HashRs [16]. Figure 9 shows the general structure of an $N \times N$ LightR, which can be formed by the following steps:
(1) Place $\lceil N/2 \rceil - 1 - (k - 1)$ HashRs horizontally in the $k$-th row with $1 \le k \le \lceil N/2 \rceil - 1$. Connect the left ports of each HashR with its left neighbor.
(2) Connect the bottom ports of the HashR to its bottom neighbor except for the one at the rightmost end of each row. Connect the bottom ports of the HashR at the rightmost end of each row to its bottom left neighbor.
(3) Connect the upper ports of the HashRs in the first row to the ports $m_1, s_1, m_2, s_2, m_3, s_3, \ldots, m_{\lceil N/2 \rceil - 1}$, and $s_{\lceil N/2 \rceil - 1}$, sequentially. If the number of nodes is even, then connect the right ports of the HashR at the rightmost end in the first row to $m_{N/2}$ and $s_{N/2}$.
(4) Connect the left ports of the HashRs in the first column to the ports $m_N, s_N, m_{N-1}, s_{N-1}, \ldots, m_{\lceil (N+1)/2 \rceil + 2}, s_{\lceil (N+1)/2 \rceil + 2}, m_{\lceil (N+1)/2 \rceil + 1}, s_{\lceil (N+1)/2 \rceil + 1}$, sequentially. Connect the bottom input and output of the HashR in the last row to $m_{\lceil (N+1)/2 \rceil}$ and $s_{\lceil (N+1)/2 \rceil}$.
With these steps, this structure can be expanded to any size.
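The row structure described in these steps can be checked with a short sketch. The helper names below are hypothetical, and rounding $N/2$ up to the next integer is an assumption made here so that odd node counts are handled; for the even sizes used in the examples of this paper the rounding has no effect. The factor of eight MRRs per HashR comes from Section 3.1.

```python
import math

MRRS_PER_HASHR = 8  # each HashR consists of eight PSEs with one MRR each


def hashrs_per_row(n_nodes):
    """Number of HashRs placed in row k = 1, 2, ... of an N x N LightR."""
    half = math.ceil(n_nodes / 2)  # assumption: round N/2 up for odd N
    return [half - 1 - (k - 1) for k in range(1, half)]


def num_hashrs(n_nodes):
    """Total number of HashRs, i.e. the sum over all rows."""
    return sum(hashrs_per_row(n_nodes))


def num_mrrs(n_nodes):
    """Total MRR count of the LightR built from these HashRs."""
    return MRRS_PER_HASHR * num_hashrs(n_nodes)
```

For an eight-node network, `hashrs_per_row(8)` gives rows of 3, 2, and 1 HashRs, i.e., six HashRs in total, and `num_mrrs(6)` gives the 24 MRRs of a six-node LightR.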

3.3. Wavelength Configuration

After connecting the waveguides, we configure the wavelengths of all the MRRs in the LightR. As introduced in Section 3.1, each HashR uses eight MRRs on four different wavelengths, which can be regarded as a wavelength set denoted as $\Lambda$. For example, in the HashR shown in Figure 7, the wavelength set $\Lambda$ contains four wavelengths: $\lambda_1$, $\lambda_2$, $\lambda_3$, and $\lambda_4$. In this case, the task of assigning wavelengths to each MRR is converted into the task of assigning a wavelength set to each HashR.
To avoid data conflict, we assign different wavelengths to the signals that leave the same master or arrive at the same slave, and we propose a simple wavelength configuration approach. For an $N \times N$ LightR, we first construct a $(\lceil N/2 \rceil - 1) \times (\lceil N/2 \rceil - 1)$ wavelength-set matrix, where each entry represents a wavelength set ($\Lambda$). After that, we fill the matrix column by column by repeatedly iterating over an array from 1 to $\lceil N/2 \rceil$. In this way, we ensure that the entries in the same row or the same column of the matrix are all different, such that every master sends signals to different slaves on different wavelengths, and every slave receives signals from different masters on different wavelengths. Hence, data conflict can be avoided. Finally, we configure the HashRs according to the matrix.
Taking a network with eight nodes as an example, we need $\frac{8}{2}(\frac{8}{2} - 1)/2 = 6$ HashRs to support the communications among the nodes. To configure the resonant wavelengths of the MRRs in the HashRs, we first construct a $3 \times 3$ wavelength-set matrix and four wavelength sets [$\Lambda_1$, $\Lambda_2$, $\Lambda_3$, $\Lambda_4$]. We fill the first column with $\Lambda_1$, $\Lambda_2$, and $\Lambda_3$, as shown in Figure 10a, and fill $\Lambda_4$ into the first entry of the second column. Then, we begin the second iteration from $\Lambda_1$ again and fill $\Lambda_1$ and $\Lambda_2$ into the remaining two entries of the second column, as shown in Figure 10b. After repeating this step for the third column, we have a filled $3 \times 3$ wavelength-set matrix, as shown in Figure 10c. Since only six HashRs are required for this topology, the entries below the counter-diagonal are replaced by 0, as shown in Figure 10d.
To construct an $8 \times 8$ LightR, we place and connect the six HashRs according to the steps stated in Section 3.2. Assuming that $\Lambda_1 = (\lambda_1, \lambda_2, \lambda_3, \lambda_4)$, $\Lambda_2 = (\lambda_5, \lambda_6, \lambda_7, \lambda_8)$, $\Lambda_3 = (\lambda_9, \lambda_{10}, \lambda_{11}, \lambda_{12})$, and $\Lambda_4 = (\lambda_{13}, \lambda_{14}, \lambda_{15}, \lambda_{16})$, we configure the MRRs of the $8 \times 8$ LightR with the $3 \times 3$ wavelength-set matrix shown in Figure 10d. The $8 \times 8$ LightR topology is presented in Figure 11.
Specifically, if $m_i$ is directly connected to $s_j$ by a waveguide, the signals should be off-resonance with the MRRs along the waveguide. For example, $m_1$ can communicate with $s_5$ using the signals on the wavelengths $\lambda_{13}$, $\lambda_{14}$, $\lambda_{15}$, and $\lambda_{16}$, which are off-resonance with the MRRs along the waveguide connecting $m_1$ and $s_5$. A PSE in LightR is formed by two waveguides: one connects $m_i$ and $s_j$, and the other one connects $m_p$ and $s_q$. Then, the wavelengths of the signals from $m_i$ to $s_q$ and from $m_p$ to $s_j$ are equal to the resonant wavelength of the MRR in the PSE. For example, the top-left PSEs are formed by the waveguide that connects $m_1$ to $s_5$ and the waveguide that connects $m_4$ to $s_8$. Therefore, the wavelengths of the signal paths from $m_1$ to $s_8$ and from $m_4$ to $s_5$ are set to $\lambda_1$ and $\lambda_2$. We present the wavelength assignment results of the $8 \times 8$ LightR in Table 1.
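The column-by-column filling described above can be sketched in a few lines of Python. This is only an illustration of the procedure as worded in the text, with hypothetical function names; wavelength sets are represented by their indices (1 for the first set, 2 for the second, and so on), 0 marks an unused entry, and rounding $N/2$ up is an assumption that only matters for odd node counts.

```python
import math


def fill_wavelength_sets(n_nodes):
    """Fill the square matrix column by column, cycling through the set indices."""
    size = math.ceil(n_nodes / 2) - 1   # matrix is size x size
    num_sets = size + 1                 # array of set indices 1 .. ceil(N/2)
    matrix = [[0] * size for _ in range(size)]
    counter = 0
    for col in range(size):
        for row in range(size):
            matrix[row][col] = counter % num_sets + 1
            counter += 1
    return matrix


def wavelength_set_matrix(n_nodes):
    """Filled matrix with the entries below the counter-diagonal replaced by 0."""
    matrix = fill_wavelength_sets(n_nodes)
    size = len(matrix)
    for row in range(size):
        for col in range(size):
            if row + col > size - 1:    # strictly below the counter-diagonal
                matrix[row][col] = 0
    return matrix


print(wavelength_set_matrix(8))  # [[1, 4, 3], [2, 1, 0], [3, 0, 0]]
```

Because the cycle length is one larger than the matrix size, each new column starts one position earlier in the cycle, which is why no index repeats within a row or a column of the filled matrix.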
We note that the master–slave pairs that are directly connected by waveguides communicate using four wavelengths. Assuming that a wavelength carries one-bit data, if no errors occur along the signal paths, the master can send four-bit data to the slave at a time. That provides a higher bandwidth than the topologies using one wavelength for each communication. For applications that require high-bandwidth communication, LightR can thus be considered an appealing option.

4. Experimental Results

To evaluate the performance of the LightR, we compared it to three state-of-the-art WRONoC topologies, $\lambda$-router [17], GWOR [18], and Light [16], in two aspects: reliability and efficiency. We present the comparison in Section 4.1. Then, we compared the LightR with the RobustONoC [15], which inserts backup MRRs to deal with the MRR faults, and we discuss their performances in Section 4.2.

4.1. Comparison with the State-of-the-Art WRONoC Topologies

In Section 4.1.1, we evaluate the reliability of the four topologies for the networks with different fault rates of MRRs. We assigned a certain number of defective MRRs in our fault model based on a fault rate. If a master failed to communicate with a slave due to the MRR faults, we considered that the corresponding communication was an error communication. In each network, we counted the number of error communications and compared the results. Then, in Section 4.1.2, we evaluate the efficiency of the LightR by comparing the MRR usage, the insertion loss, and the SNR results to the other three topologies.

4.1.1. Discussion: Reliability

We synthesized Light, $\lambda$-router, GWOR, and LightR for $N$-node networks, where $N \in \{6, 8, 12, 16, 24, 32, 48, 64\}$. In each network, we removed the self-communication of nodes and considered the other communications among the nodes.
For an $N \times N$ topology with $K$ MRRs, we proposed a fault model where each MRR had a $p\%$ chance of being defective. Each MRR was independent and thus not affected by the other MRRs. Therefore, we considered whether an MRR was defective as a Bernoulli trial and calculated the expected number of defective MRRs as $K \times p\%$. For example, a $6 \times 6$ LightR has 24 MRRs. When the fault rate is 3%, the number of defective MRRs in LightR equals $24 \times 3\% = 0.72$, which is rounded to 1. We considered eight fault rates for each network: 1%, 3%, 5%, 8%, 12%, 15%, 20%, and 25%. After obtaining the number of defective MRRs, we assigned that number in our fault model. For the resonant wavelength of each defective MRR, we randomly changed it to any of the other wavelengths in the topology or none of those wavelengths. Then, we calculated the number of error communications. For each network, we repeated the process 100 times, and we present the average number of error communications in Figure 12.
Generally, LightR provided a higher reliability than Light, $\lambda$-router, and GWOR when the MRR faults occurred. In every network, regardless of the size or the fault rate, LightR had the lowest number of error communications. In some cases, LightR had no error communications. As shown in Figure 12a, when the fault rate was 3%, all communications worked correctly in a $6 \times 6$ LightR, i.e., no data loss, while at least one communication in the other three topologies had errors. As the size of networks or the fault rate increased, the advantages of the LightR in terms of the fault tolerance became even more significant. For example, for a 64-node network with a 3% fault rate, the LightR decreased the number of error communications by 85–90% compared to the other three topologies, as shown in Figure 12h. The superiority of the LightR in reliability is driven by reserving two independent paths for each master–slave pair that requires communication.
On the other hand, Light exhibited the largest number of error communications among the four topologies. As introduced in Section 2.4, each MRR in Light is on-resonance with two signals. If an MRR is defective, the two signals have s-a-0 faults. Moreover, if the defective MRR is resonant with another wavelength, it may cause two extra signals to have s-a-1 faults. Therefore, Light is more sensitive to MRR faults than the other topologies and suffers a high data loss when MRR faults occur. Compared to Light, $\lambda$-router and GWOR contained fewer error communications. The reason is that the CSE with two identical MRRs used in both topologies can keep the signals on their planned paths when one of the MRRs is defective. As shown in Figure 13, although one of the MRRs was defective and off-resonance with $\lambda_i$, the signal on $\lambda_i$ kept its planned propagation direction by being coupled to the other MRR of the CSE. However, the signal passed the crossing twice and thus generated more crossing loss and crosstalk noise than the original signal path represented by the dotted line in Figure 13. That degraded the system performance by increasing the insertion loss and decreasing the signal-to-noise ratio.
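The evaluation loop described above can be sketched as follows. This is a simplified illustration, not the experimental code of the paper: all names are hypothetical, a topology is abstracted as a mapping from each communication to the sets of MRRs its signal paths rely on, rounding to the nearest integer is an assumption, and only s-a-0 effects are modelled (a communication fails when every one of its paths contains a defective MRR). The s-a-1 effects caused by re-assigning the resonant wavelength of a defective MRR are deliberately left out.

```python
import random


def num_defective(k_mrrs, fault_rate):
    """Expected number of defective MRRs, K * p, rounded to a whole MRR (assumption)."""
    return round(k_mrrs * fault_rate)


def count_error_communications(comm_paths, defective):
    """Count communications for which every reserved path hits a defective MRR.

    comm_paths: dict mapping a communication to a list of paths,
                each path being the set of MRR ids it relies on.
                A direct waveguide path is the empty set and never fails here.
    """
    return sum(
        1 for paths in comm_paths.values()
        if all(path & defective for path in paths)
    )


def average_error_communications(comm_paths, all_mrrs, fault_rate,
                                 trials=100, seed=0):
    """Average error count over repeated random placements of defective MRRs."""
    rng = random.Random(seed)
    n_def = num_defective(len(all_mrrs), fault_rate)
    mrr_list = sorted(all_mrrs)
    total = 0
    for _ in range(trials):
        defective = set(rng.sample(mrr_list, n_def))
        total += count_error_communications(comm_paths, defective)
    return total / trials
```

With this abstraction, a topology that reserves one path per communication loses a communication as soon as any MRR on that path is defective, whereas a topology with two independent paths loses it only when both paths are hit, which is the effect the comparison in this section measures.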

4.1.2. Discussion: MRR Usage, Insertion Loss, and Signal-to-Noise Ratio (SNR)

Firstly, we calculated the number of MRRs in Light, $\lambda$-router, GWOR, and LightR for different sizes of networks. As shown in Table 2, for the same size of networks, Light and $\lambda$-router have the fewest and most MRRs, respectively. LightR and GWOR had the same MRR usage, which was less than the MRR usage in the $\lambda$-router. With that number of MRRs, LightR doubled the number of signal paths compared to the other three topologies to tolerate the MRR faults.
Then, we compared LightR to Light,  λ -router, and GWOR in terms of the insertion loss and the SNR. In this case, we assumed that no MRRs were defective and calculated the insertion loss and SNR values in every topology, as introduced in Section 2.2. Specifically, the insertion loss of each signal in a topology equals the summation of the crossing loss, drop loss, and through loss. To calculate the SNR of a signal, we considered the noise generated by the MRRs and waveguide crossings and applied the equation introduced in Section 2.2. The loss and noise parameters [19] are shown in Table 3. Figure 14 and Figure 15 show the average and worst-case insertion loss and SNR, respectively.
The LightR, GWOR, and $\lambda$-router had almost the same average insertion loss values, which were slightly larger than those of the Light. For example, a $64 \times 64$ Light decreased the average insertion loss by about 7.3% versus the other three topologies. The reduction in the Light was because it had the lowest MRR usage. For the worst-case insertion loss, LightR and GWOR had nearly the same values, which were higher than those of $\lambda$-router and Light. The results show that doubling the signal paths in the LightR did not introduce much insertion loss.
Similar to the insertion loss, LightR and GWOR had almost the same behavior in the SNR. For the worst-case SNR, LightR and GWOR suffered the most crosstalk noise, especially when the networks were large, such as a 64-node network. For large-scale networks, the $\lambda$-router had a higher worst-case SNR but a lower average SNR than the other three topologies. For example, for a 64-node network, Light and LightR increased the average SNR by 47% and 12% compared to the $\lambda$-router, respectively. Among all the topologies, Light achieved the best signal quality as it exhibited the lowest MRR usage. However, it had more error communications than the other three topologies. The results demonstrate that doubling the number of signal paths in LightR did not generate much crosstalk noise.

4.2. Comparison to RobustONoC

We compared LightR to a state-of-the-art fault-tolerant design method for topologies, RobustONoC [15]. The idea of the RobustONoC is to add backup MRRs and waveguides into an existing topology to route a deviated signal back to its designated path. For each signal, the RobustONoC prepares two extra MRRs as backups. When modifying the structure of a topology, the RobustONoC randomly determines the positions of the extra waveguides and inserts backup MRRs at the end of each waveguide. According to the MRR usage results reported in [15], the LightR decreased the MRR usage by more than half versus the RobustONoC. For example, for a six-node network, the LightR decreased the number of MRRs by about 64% compared to the RobustONoC. The high MRR usage in the RobustONoC was mainly driven by inserting twice the number of MRRs into a topology.
More importantly, regardless of the MRR usage in a topology, the RobustONoC considers only the single-fault model, which means that only one MRR can be defective at a time, and the other MRRs should work well. However, in practice, there can be multiple defective MRRs, even for a network with low MRR usage. Based on these facts, the RobustONoC can hardly promise reliable communication for WRONoCs. In contrast, the LightR showed its superiority in enhancing the reliability by significantly reducing the number of error communications in our realistic fault model.

5. Conclusions

In this paper, we proposed the first fault-tolerant WRONoC topology, LightR, which reserves two independent signal paths for each communication. When constructing the signal paths, we applied PSEs to minimize the MRR usage. With a $4 \times 4$ HashR as the basic building block, the LightR can easily be implemented to support a network with $N$ nodes at any scale. Moreover, we proposed the first realistic fault model, which contains a number of defective MRRs that depends on the fault rate of the MRRs. Based on the fault model, we compared the LightR to typical state-of-the-art WRONoC topologies. According to the results, the LightR outperformed them in reliability by reducing the number of error communications. In some cases, the LightR ensured that no communications failed, i.e., no data loss. Compared to a state-of-the-art fault-tolerant design method, RobustONoC, the LightR improved the reliability while decreasing the MRR usage by more than half.

Author Contributions

Conceptualization, investigation, visualization, software, writing—original draft, Z.Z.; Validation and writing—review and editing, M.L.; Supervision, funding acquisition and writing—review and editing, T.-M.T.; Writing—review and editing, U.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) Project Number 496766278.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ONoC: Optical networks-on-chip
WDM: Wavelength-division multiplexing
WRONoC: Wavelength-routed optical networks-on-chip
OSE: Optical switching element
MRR: Microring resonator
CSE: Crossing switching element
PSE: Parallel switching element
SNR: Signal-to-noise ratio

References

  1. Vantrease, D.; Schreiber, R.; Monchiero, M.; McLaren, M.; Jouppi, N.P.; Fiorentino, M.; Davis, A.; Binkert, N.; Beausoleil, R.G.; Ahn, J.H. Corona: System Implications of Emerging Nanophotonic Technology. ACM SIGARCH Comput. Archit. News 2008, 36, 153–164.
  2. Tseng, T.M.; Truppel, A.; Li, M.; Nikdast, M.; Schlichtmann, U. Wavelength-Routed Optical NoCs: Design and EDA—State of the Art and Future Directions: Invited Paper. In Proceedings of the 2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Westminster, CO, USA, 4–7 November 2019; pp. 1–6.
  3. Werner, S.; Navaridas, J.; Luján, M. A Survey on Optical Network-on-Chip Architectures. ACM Comput. Surv. 2017, 50, 1–37.
  4. Xie, Y.; Nikdast, M.; Xu, J.; Zhang, W.; Li, Q.; Wu, X.; Ye, Y.; Wang, X.; Liu, W. Crosstalk Noise and Bit Error Rate Analysis for Optical Network-on-Chip. In Proceedings of the 47th Design Automation Conference (DAC). Association for Computing Machinery, Anaheim, CA, USA, 13–18 June 2010; pp. 657–660.
  5. Werner, S.; Navaridas, J.; Luján, M. Amon: An Advanced Mesh-like Optical NoC. In Proceedings of the 2015 IEEE 23rd Annual Symposium on High-Performance Interconnects, Santa Clara, CA, USA, 26–28 August 2015; pp. 52–59.
  6. Li, M.L.; Tseng, T.M.; Bertozzi, D.; Tala, M.; Schlichtmann, U. CustomTopo: A Topology Generation Method for Application-Specific Wavelength-Routed Optical NoCs. In Proceedings of the 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), New York, NY, USA, 5–8 November 2018; pp. 1–8.
  7. Li, M.; Tseng, T.M.; Tala, M.; Schlichtmann, U. Maximizing the Communication Parallelism for Wavelength-Routed Optical Networks-On-Chips. In Proceedings of the 2020 Asia and South Pacific Design Automation Conference (ASP-DAC), Beijing, China, 13–16 January 2020; pp. 109–114.
  8. Truppel, A.; Tseng, T.M.; Bertozzi, D.; Alves, J.C.; Schlichtmann, U. PSION: Combining Logical Topology and Physical Layout Optimization for Wavelength-Routed ONoCs. In Proceedings of the 2019 International Symposium on Physical Design (ISPD), San Francisco, CA, USA, 14–17 April 2019; pp. 49–56.
  9. Truppel, A.; Tseng, T.M.; Bertozzi, D.; Alves, J.C.; Schlichtmann, U. PSION+: Combining Logical Topology and Physical Layout Optimization for Wavelength-Routed ONoCs. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 2020, 39, 5197–5210.
  10. Truppel, A.; Tseng, T.M.; Schlichtmann, U. PSION 2: Optimizing Physical Layout of Wavelength-Routed ONoCs for Laser Power Reduction. In Proceedings of the 39th International Conference on Computer-Aided Design (ICCAD), Virtual, 2–5 November 2020; Association for Computing Machinery: New York, NY, USA, 2020.
  11. Zheng, Z.; Li, M.; Tseng, T.M.; Schlichtmann, U. ToPro: A Topology Projector and Waveguide Router for Wavelength-Routed Optical Networks-on-Chip. In Proceedings of the 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD), Munich, Germany, 1–4 November 2021; pp. 1–9.
  12. Meyer, M.C.; Ahmed, A.B.; Okuyama, Y.; Abdallah, A.B. FTTDOR: Microring Fault-resilient Optical Router for Reliable Optical Network-on-Chip Systems. In Proceedings of the 2015 IEEE 9th International Symposium on Embedded Multicore/Many-core Systems-on-Chip, Turin, Italy, 23–25 September 2015; pp. 227–234.
  13. Nitta, C.J.; Farrens, M.K.; Akella, V. Resilient Microring Resonator Based Photonic Networks. In Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, Porto Alegre, Brazil, 3–7 December 2011; pp. 95–104.
  14. Meyer, M.; Abdallah, A.B. Fault-tolerant Photonic Network-on-Chip. In Photonic Interconnects for Computing Systems: Understanding and Pushing Design Challenges; River Publishers: Aalborg, Denmark, 2017; pp. 281–318.
  15. Chuang, Y.K.; Zhong, Y.; Cheng, Y.H.; Yu, B.Y.; Fang, S.Y.; Li, B.; Schlichtmann, U. RobustONoC: Fault-Tolerant Optical Networks-on-Chip with Path Backup and Signal Reflection. In Proceedings of the 2021 22nd International Symposium on Quality Electronic Design (ISQED), Santa Clara, CA, USA, 7–9 April 2021; pp. 67–72.
  16. Zheng, Z.; Li, M.; Tseng, T.M.; Schlichtmann, U. Light: A Scalable and Efficient Wavelength-Routed Optical Networks-On-Chip Topology. In Proceedings of the 2021 Asia and South Pacific Design Automation Conference (ASP-DAC), Tokyo, Japan, 18–21 January 2021; pp. 568–573.
  17. Briere, M.; Girodias, B.; Bouchebaba, Y.; Nicolescu, G.; Mieyeville, F.; Gaffiot, F.; O’Connor, I. System Level Assessment of an Optical NoC in an MPSoC Platform. In Proceedings of the 2007 Design, Automation Test in Europe Conference Exhibition (DATE), Nice, France, 16–20 April 2007; pp. 1–6.
  18. Tan, X.; Yang, M.; Zhang, L.; Jiang, Y.; Yang, J. On a Scalable, Non-Blocking Optical Router for Photonic Networks-on-Chip Designs. In Proceedings of the 2011 Symposium on Photonics and Optoelectronics (SOPO), Wuhan, China, 16–18 May 2011; pp. 1–4.
  19. Nikdast, M.; Xu, J.; Duong, L.H.K.; Wu, X.; Wang, X.; Wang, Z.; Wang, Z.; Yang, P.; Ye, Y.; Hao, Q. Crosstalk Noise in WDM-Based Optical Networks-on-Chip: A Formal Study and Comparison. IEEE Trans. Very Large Scale Integr. VLSI Syst. 2015, 23, 2552–2565.
  20. Zheng, Z.; Li, M.; Tseng, T.M.; Schlichtmann, U. XRing: A Crosstalk-Aware Synthesis Method for Wavelength-Routed Optical Ring Routers. In Proceedings of the 2023 Design, Automation Test in Europe Conference Exhibition (DATE), Antwerp, Belgium, 17–19 April 2023; pp. 1–9.
  21. Nitta, C.; Farrens, M.; Akella, V. Addressing system-level trimming issues in on-chip nanophotonic networks. In Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture, San Antonio, TX, USA, 12–16 February 2011; pp. 122–131.
  22. Bogaerts, W.; De Heyn, P.; Van Vaerenbergh, T.; De Vos, K.; Kumar Selvaraja, S.; Claes, T.; Dumon, P.; Bienstman, P.; Van Thourhout, D.; Baets, R. Silicon microring resonators. Laser Photonics Rev. 2012, 6, 47–73.
  23. Ramini, L.; Grani, P.; Bartolini, S.; Bertozzi, D. Contrasting wavelength-routed optical NoC topologies for power-efficient 3d-stacked multicore processors using physical-layer analysis. In Proceedings of the 2013 Design, Automation Test in Europe Conference Exhibition (DATE), Grenoble, France, 18–22 March 2013; pp. 1589–1594.
  24. Xiao, M.; Tseng, T.M.; Schlichtmann, U. FAST: A Fast Automatic Sweeping Topology Customization Method for Application-Specific Wavelength-Routed Optical NoCs. In Proceedings of the 2021 Design, Automation Test in Europe Conference Exhibition (DATE), Grenoble, France, 1–5 February 2021; pp. 1651–1656.
Figure 1. (a) A simple WRONoC topology. (b) A 2 × 2 CSE structure. (c) On-resonance signals change their propagation directions. (d) Off-resonance signals pass through the CSE without direction change.
Figure 2. The MRR in OSE1 is defective and the signal on λ_i fails to be coupled to the MRR.
Figure 3. A 2 × 2 PSE supports (a) two on-resonance signals and (b) two off-resonance signals.
Figure 4. The first-order noise is generated (a) when a signal passes a waveguide crossing, (b) when a signal passes an off-resonance MRR, and (c) when a signal is on-resonance with an MRR.
Figure 5. (a) The s-a-0 signal fault. (b) The s-a-1 signal fault.
Figure 6. (a) A 4 × 4 λ-router. (b) A 4 × 4 Hash.
Figure 7. The logic scheme of the HashR.
Figure 8. (a) The signals on λ_1 and λ_3 suffer the s-a-0 and s-a-1 fault, respectively. (b) m_1 communicates with s_2 and s_4 using the signals on λ_4 and λ_2, respectively.
Figure 9. The N × N LightR structure.
Figure 10. The wavelength-set assignment for an 8 × 8 LightR topology.
Figure 11. An 8 × 8 LightR topology.
Figure 12. The average number of error communications in Light, λ-router, GWOR, and LightR for (a) 6-, (b) 8-, (c) 12-, (d) 16-, (e) 24-, (f) 32-, (g) 48-, and (h) 64-node networks, respectively.
Figure 13. The signal on λ_i can keep its planned propagation direction if only one MRR is defective.
Figure 14. The average and worst-case insertion loss in Light, λ-router, GWOR, and LightR.
Figure 15. The average and worst-case SNR in Light, λ-router, GWOR, and LightR.
Table 1. The wavelength assignment results of the 8 × 8 LightR topology.
| | m_1 | m_2 | m_3 | m_4 | m_5 | m_6 | m_7 | m_8 |
|---|---|---|---|---|---|---|---|---|
| s_1 | × | λ_11, λ_12 | λ_7, λ_8 | λ_3, λ_4 | λ_13, λ_14, λ_15, λ_16 | λ_9, λ_10 | λ_5, λ_6 | λ_1, λ_2 |
| s_2 | λ_11, λ_12 | × | λ_3, λ_4 | λ_15, λ_16 | λ_9, λ_10 | λ_5, λ_6, λ_7, λ_8 | λ_1, λ_2 | λ_13, λ_14 |
| s_3 | λ_7, λ_8 | λ_3, λ_4 | × | λ_11, λ_12 | λ_5, λ_6 | λ_1, λ_2 | λ_13, λ_14, λ_15, λ_16 | λ_9, λ_10 |
| s_4 | λ_3, λ_4 | λ_15, λ_16 | λ_11, λ_12 | × | λ_1, λ_2 | λ_13, λ_14 | λ_9, λ_10 | λ_5, λ_6, λ_7, λ_8 |
| s_5 | λ_13, λ_14, λ_15, λ_16 | λ_9, λ_10 | λ_5, λ_6 | λ_1, λ_2 | × | λ_11, λ_12 | λ_7, λ_8 | λ_3, λ_4 |
| s_6 | λ_9, λ_10 | λ_5, λ_6, λ_7, λ_8 | λ_1, λ_2 | λ_13, λ_14 | λ_11, λ_12 | × | λ_3, λ_4 | λ_15, λ_16 |
| s_7 | λ_5, λ_6 | λ_1, λ_2 | λ_13, λ_14, λ_15, λ_16 | λ_9, λ_10 | λ_7, λ_8 | λ_3, λ_4 | × | λ_11, λ_12 |
| s_8 | λ_1, λ_2 | λ_13, λ_14 | λ_9, λ_10 | λ_5, λ_6, λ_7, λ_8 | λ_3, λ_4 | λ_15, λ_16 | λ_11, λ_12 | × |

The entry × means there is no signal path between the master and the slave.
Table 2. The MRR usage in Light, λ-router, GWOR, and LightR for different sizes of networks (N is the number of nodes).
| Topology | N = 6 | N = 8 | N = 12 | N = 16 | N = 24 | N = 32 | N = 48 | N = 64 |
|---|---|---|---|---|---|---|---|---|
| Light | 12 | 24 | 60 | 112 | 240 | 480 | 1104 | 1984 |
| λ-router | 30 | 56 | 132 | 240 | 552 | 992 | 2256 | 4032 |
| GWOR | 24 | 48 | 120 | 224 | 480 | 960 | 2208 | 3968 |
| LightR | 24 | 48 | 120 | 224 | 480 | 960 | 2208 | 3968 |
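The numbers in Table 2 can be cross-checked with a few lines of code. The sketch below encodes the tabulated MRR usage, verifies two regularities that hold for these values (the λ-router entries equal N(N − 1), and the LightR entries are exactly twice the Light entries), and estimates the expected number of defective MRRs at a 3% fault rate. The regularities are observations on the tabulated data rather than formulas quoted from the paper, and the helper name `expected_defective_mrrs` is hypothetical.

```python
NODES = [6, 8, 12, 16, 24, 32, 48, 64]

# MRR usage per topology, copied from Table 2.
MRR_USAGE = {
    "Light": [12, 24, 60, 112, 240, 480, 1104, 1984],
    "lambda-router": [30, 56, 132, 240, 552, 992, 2256, 4032],
    "GWOR": [24, 48, 120, 224, 480, 960, 2208, 3968],
    "LightR": [24, 48, 120, 224, 480, 960, 2208, 3968],
}


def expected_defective_mrrs(num_mrrs: int, fault_rate: float) -> float:
    """Expected number of defective MRRs if each one is defective
    with probability `fault_rate`."""
    return num_mrrs * fault_rate


# Observation 1: the lambda-router column matches N * (N - 1).
lambda_router_matches = all(
    usage == n * (n - 1)
    for n, usage in zip(NODES, MRR_USAGE["lambda-router"])
)

# Observation 2: the LightR uses exactly twice as many MRRs as Light.
lightr_is_double_light = all(
    lightr == 2 * light
    for light, lightr in zip(MRR_USAGE["Light"], MRR_USAGE["LightR"])
)

if __name__ == "__main__":
    print("lambda-router == N(N-1):", lambda_router_matches)
    print("LightR == 2 x Light:    ", lightr_is_double_light)
    for name, usage in MRR_USAGE.items():
        expected = expected_defective_mrrs(usage[-1], 0.03)
        print(f"{name:14s} N = 64: about {expected:.1f} defective MRRs expected")
```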
Table 3. Insertion loss and crosstalk noise parameters.
| Category | Parameter | Value |
|---|---|---|
| Insertion loss | Drop loss | 0.5 dB |
| Insertion loss | Through loss | 0.005 dB |
| Insertion loss | Crossing loss | 0.04 dB |
| Crosstalk noise | Crosstalk per MRR | 25 dB |
| Crosstalk noise | Crosstalk per crossing | 40 dB |
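As a worked example of how the insertion loss parameters in Table 3 combine, the sketch below sums the per-element losses along a signal path in dB. It assumes the usual additive model in which each on-resonance MRR contributes the drop loss, each off-resonance MRR passed contributes the through loss, and each waveguide crossing contributes the crossing loss; propagation and bending losses are ignored. The function names and the element counts in the usage line are hypothetical and do not describe a specific path in the LightR.

```python
# Insertion loss parameters from Table 3 (in dB).
DROP_LOSS_DB = 0.5
THROUGH_LOSS_DB = 0.005
CROSSING_LOSS_DB = 0.04


def path_insertion_loss_db(num_drops: int, num_throughs: int,
                           num_crossings: int) -> float:
    """Total insertion loss of one signal path in dB.

    Assumes an additive model: losses of all elements on the path
    are summed; propagation and bending losses are not included.
    """
    return (num_drops * DROP_LOSS_DB
            + num_throughs * THROUGH_LOSS_DB
            + num_crossings * CROSSING_LOSS_DB)


def db_to_power_fraction(loss_db: float) -> float:
    """Fraction of optical power remaining after `loss_db` of loss."""
    return 10 ** (-loss_db / 10)


if __name__ == "__main__":
    # Hypothetical path: 2 drops, 10 off-resonance MRRs, 5 crossings.
    loss = path_insertion_loss_db(2, 10, 5)
    print(f"insertion loss: {loss:.3f} dB")
    print(f"remaining power: {db_to_power_fraction(loss):.1%}")
```

In this model the drop loss dominates: a single on-resonance MRR costs as much as 100 off-resonance MRRs, so paths with few drops have a low insertion loss.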
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zheng, Z.; Li, M.; Tseng, T.-M.; Schlichtmann, U. LightR: A Fault-Tolerant Wavelength-Routed Optical Networks-on-Chip Topology. Appl. Sci. 2023, 13, 8871. https://doi.org/10.3390/app13158871
