A NEW LOOKAHEAD ALGORITHM TO IMPROVE THE PERFORMANCE OF OMEGA NETWORKS

In this paper, we concentrate on the internal blocking problem in the Omega Network, which is a type of Multistage Interconnection Network (MIN). We introduce a new look-ahead algorithm to alleviate this problem; its function is to decide which packet to block when internal blocking occurs. The performance results indicate that the introduced algorithm reduces internal blocking significantly. Since the algorithm involves very little computation, it can be implemented effectively in real time.
Key words: MINs, Omega networks, Internal blocking, Look-ahead algorithm.


INTRODUCTION
An Omega Network is a type of MIN in which $a = b = 2$ (each switching element is $2 \times 2$). The general architecture of an Omega Network consists of several stages of switching elements (SEs). Each stage is connected to the next by a specific interconnection called the perfect shuffle, except the last stage, in which the identity permutation is used. The perfect shuffle permutation is performed by rotating the binary representation of the input link one bit to the left [1,2,3]. So, if the binary representation of an input link is $b_{n-1} b_{n-2} \ldots b_1 b_0$, then the packet will be routed to a destination with the binary format $b_{n-2} \ldots b_1 b_0 b_{n-1}$. The perfect shuffle connection of Omega Networks simply divides the $N$ channels into two halves, which are then interleaved perfectly. Basically, the $2 \times 2$ SE has two configurations, straight or exchange, as illustrated in Fig. 1.
In general, an Omega Network with $N$ inputs consists of $n = \log_2 N$ identical stages, and each stage consists of a perfect shuffle connection followed by $N/2$ switch elements of size $2 \times 2$. Since each switch element has two configurations, namely straight and exchange, the number of permutations realizable by the network is [1]: $2^{(N/2)\log_2 N} = N^{N/2}$.
MINs are used because they are less expensive, easy to control, have low delay and support a large number of inputs and outputs. One application of MINs is processor-to-memory communication in parallel multiprocessing systems, in which they provide a direct link between any processor and any memory module, so that a processor can access any memory module with very few communication or access conflicts.
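The rotate-left rule for the perfect shuffle can be checked with a few lines of code. The following is a minimal sketch, not taken from the paper; the function name `perfect_shuffle` and the 8-input size are chosen here only for illustration.

```python
def perfect_shuffle(link: int, n: int) -> int:
    """Rotate the n-bit address of `link` one bit to the left."""
    mask = (1 << n) - 1
    return ((link << 1) & mask) | (link >> (n - 1))


N_BITS = 3  # an 8 x 8 network, chosen only for illustration
targets = [perfect_shuffle(link, N_BITS) for link in range(1 << N_BITS)]
print(targets)  # [0, 2, 4, 6, 1, 3, 5, 7]
```

Read by output position, the links receive inputs 0, 4, 1, 5, 2, 6, 3, 7, which is the two halves of the input set interleaved.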
Internal blocking can be eliminated by increasing the number of switch elements [4], the number of stages [4], or the size of the switch element [5]. However, all these techniques increase the cost and delay of such networks. Therefore, in this paper we attempt to use as few switch elements as possible, while maintaining full accessibility by using our new technique.

ALGORITHM
Internal blocking occurs in an omega network switch when two input packets attempt to go to the same output link. When internal blocking occurs, one of the two conflicting packets must be blocked; the other packet can then pass without conflict. Normally, internal blocking is resolved by randomly blocking one input packet and passing the other; we call this random blocking. Alternatively, one may always block the lower input and pass the upper one, or always block the upper input and pass the lower one; we call this fixed blocking. Since the two contending input packets follow different routing paths, we expect the choice of which packet to block to affect internal blocking in the next stages, and thus the network performance. In this paper, an algorithm is introduced to increase the performance of omega networks. The function of this algorithm is to decide which packet should be blocked when internal blocking occurs.

Effect of the decision of which packet to block
The selection of the packet to block in the case of internal blocking markedly affects the internal blocking count in the next stages. The effect of the blocking selection may be illustrated by an example. Fig 2.1 shows a $16 \times 16$ omega network. In this example, all inputs are idle except inputs 0, 2, 4, 8 and 15, which have the following destinations: 0, 1, 7, and 0 respectively. In Fig 2.1, we associate an ordered pair with each input link. The first number in the ordered pair is the input link index, whereas the second number indicates the destination. In this example, there are four internal blocking cases, one in each stage. In stage 1, internal blocking occurs in switch 0 between inputs 0 and 8. In stage 2, internal blocking occurs in switch 0 between inputs 0 and 4. In stage 3, internal blocking occurs in switch 0 between inputs 0 and 2. In stage 4, internal blocking occurs in switch 0 between inputs 0 and 15. Note that inputs 8, 4 and 2 will go to the lower output port of the SEs in stages 2, 3 and 4 respectively. Hence, if we block input 0 in stage 1, there will be no blocking between inputs 8 and 4 in stage 2, inputs 4 and 2 in stage 3, and inputs 2 and 15 in stage 4.
In this particular example, there are 5 possibilities based on the blocking choice in each stage; they are listed in Table 2.1. Based on Table 2.1, it is clear that the decision of which input to block affects the total number of blocked inputs and thus affects the performance. In this particular example, the best performance is achieved in case #5, when input 0 is blocked in stage 1, whereas the worst performance occurs in case #1 and case #2.
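The effect described in this example can be reproduced with a small routing sketch. It is an illustration under stated assumptions, not the authors' simulator: packets are routed by destination bits taken most significant bit first, the function names are hypothetical, and the destination of input 4 (not recoverable from the text) is assumed to be 2, a value consistent with the contentions listed above.

```python
def rotl(x, n):
    """Perfect shuffle: rotate an n-bit link address one bit to the left."""
    mask = (1 << n) - 1
    return ((x << 1) & mask) | (x >> (n - 1))


def route_fixed(traffic, n, block_upper):
    """Route one cycle with fixed blocking; return blocked inputs in order."""
    live = {src: (src, dst) for src, dst in traffic.items()}  # link -> packet
    blocked = []
    for stage in range(1, n + 1):
        # Sorting by shuffled link makes the upper switch input arrive first.
        arrivals = sorted((rotl(link, n), pkt) for link, pkt in live.items())
        nxt = {}
        for in_link, (src, dst) in arrivals:
            bit = (dst >> (n - stage)) & 1      # destination bit, MSB first
            out_link = (in_link & ~1) | bit     # 0 = upper output, 1 = lower
            if out_link not in nxt:
                nxt[out_link] = (src, dst)
            elif block_upper:
                blocked.append(nxt[out_link][0])
                nxt[out_link] = (src, dst)
            else:
                blocked.append(src)
        live = nxt
    return blocked


# Destinations as in the example; the value for input 4 is an assumption.
TRAFFIC = {0: 0, 2: 1, 4: 2, 8: 7, 15: 0}
print(route_fixed(TRAFFIC, 4, block_upper=True))   # [0]
print(route_fixed(TRAFFIC, 4, block_upper=False))  # [8, 4, 2, 15]
```

Always blocking the upper input drops only input 0, while always blocking the lower input drops four packets, matching the best and worst outcomes described for this traffic pattern.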

The Algorithm
Input packets are routed to their destinations in one cycle. The cycle involves routing the input packets through stage 1, stage 2, and so on up to the last stage. Before the cycle starts, the algorithm processes the input packets and blocks the input packets that cause more internal blocking. This pre-processing is done by first generating a weight and a contending set for each input packet. The weight assigned to a packet is the total number of input packets that cause internal blocking (contention) with that packet, and the contending set holds the indices of the contending packets. After all weights and sets are generated, a virtual cycle (inside the central controller) begins. During the cycle, when internal blocking occurs, the algorithm selects the packet with the highest weight for blocking and then updates the weights and sets of the input packets related to the blocked packet.
To illustrate the algorithm, consider the following example: a $16 \times 16$ omega network is shown in Fig 2.2. In this example, the first number in the ordered pair is the input link index, whereas the second number indicates the destination. There are 7 internal blocking cases: three in stage 1, two in stage 3 and two in stage 4. The algorithm will generate Table 2.2. After the algorithm generates Table 2.2, the cycle starts as follows. In stage 1: there is an internal blocking caused by inputs 7 and 15 in switch 7. Since the weight of input 15 is 4 whereas the weight of input 7 is 3, input 15 is selected for blocking because it generates more contentions in the next stages. After blocking input 15, the algorithm updates all input weights related to input 15. In switch 0, inputs 0 and 8 have the same weight, so we choose the lower input (8) for blocking. In switch 1, inputs 1 and 9 have the same weight, so we choose the lower input (9) for blocking, and we update the table of weights. Table 2.3 shows the weights of the inputs after the update. In stage 2: there is no blocking.
In stage 3: the internal blocking in switch 6 between input 1 and input 7 is resolved by blocking input 1, since its weight is 2 while the weight of input 7 is 1; then we update the weights. The weights become as shown in Table 2.4. Since all weights are zero, no blocking will occur in the remaining stage (stage 4).
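The virtual cycle can be sketched in code as follows. This is a hedged illustration rather than the authors' implementation: contention is detected here by comparing the full paths of two packets, routing is assumed to use destination bits most significant bit first, ties are resolved by blocking the lower input as in the example, and all names (`path`, `lookahead_cycle`) are hypothetical. The demonstration traffic reuses the pattern of Fig 2.1 with the destination of input 4 assumed to be 2, because the traffic of Fig 2.2 is not given in the text.

```python
def rotl(x, n):
    mask = (1 << n) - 1
    return ((x << 1) & mask) | (x >> (n - 1))


def path(src, dst, n):
    """Output link held after each stage if the packet is never blocked."""
    links, link = [], src
    for stage in range(1, n + 1):
        link = (rotl(link, n) & ~1) | ((dst >> (n - stage)) & 1)
        links.append(link)
    return links


def lookahead_cycle(traffic, n):
    """Virtual cycle: at each conflict, block the packet with the higher weight."""
    paths = {s: path(s, d, n) for s, d in traffic.items()}
    # Contending set of x: packets sharing an output link with x at some stage.
    sets = {x: {y for y in traffic if y != x and
                any(a == b for a, b in zip(paths[x], paths[y]))}
            for x in traffic}
    weight = {x: len(c) for x, c in sets.items()}
    alive, blocked = set(traffic), []
    for stage in range(n):
        def in_link(s):
            prev = s if stage == 0 else paths[s][stage - 1]
            return rotl(prev, n)
        holder = {}  # output link -> packet currently granted that link
        for src in sorted(alive, key=in_link):      # upper input first
            out = paths[src][stage]
            if out not in holder:
                holder[out] = src
                continue
            upper = holder[out]
            # Higher weight is blocked; on a tie the lower input is blocked.
            loser = upper if weight[upper] > weight[src] else src
            holder[out] = src if loser == upper else upper
            for y in sets[loser]:                   # update related weights
                weight[y] -= 1
                sets[y].discard(loser)
            alive.discard(loser)
            blocked.append(loser)
    return blocked


# Pattern of Fig 2.1; the destination of input 4 is an assumption.
print(lookahead_cycle({0: 0, 2: 1, 4: 2, 8: 7, 15: 0}, 4))  # [0]
print(lookahead_cycle({0: 0, 8: 1}, 4))                     # [8]
```

In the first call, input 0 has weight 4 and every other packet has weight 1, so input 0 is blocked in stage 1 and no further blocking occurs. The second call shows the tie rule.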

Generation of weights and contending sets:
The weight associated with an input packet is computed by counting all input packets that may cause internal blocking with that packet. Internal blocking between two packets can be detected by comparing some of the destination bits of the two packets. Which destination bits are compared depends on the stage in which the two packets may contend.
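The procedure used to compute the weight and contending set of a packet from an input link X is as follows. Based on the binary representation of input X, $x_{n-1} x_{n-2} \ldots x_0$, the other N-1 input packets can be partitioned into n sets $R_{X,1}, \ldots, R_{X,n}$, where $R_{X,i}$ contains the $2^{i-1}$ inputs whose addresses agree with X in the lowest $n-i$ bits and differ from X in bit $n-i$. To illustrate how the procedure works, consider a $16 \times 16$ omega network. To evaluate the weight of an input packet X with destination $d_3 d_2 d_1 d_0$, the other inputs are partitioned into 4 sets, according to the stage at which the set elements may contend with input X. In stage 1, input X is routed to the link $x_2 x_1 x_0 d_3$ (according to the omega network self-routing feature); there is only one input that may contend with X at stage 1, and this packet must come from link $\bar{x}_3 x_2 x_1 x_0$. In stage 2, input X is routed to the link $x_1 x_0 d_3 d_2$. Therefore, writing $*$ for a bit that may take either value, the sets associated with input X are: $R_{X,1} = \{\bar{x}_3 x_2 x_1 x_0\}$, $R_{X,2} = \{* \bar{x}_2 x_1 x_0\}$, $R_{X,3} = \{* * \bar{x}_1 x_0\}$ and $R_{X,4} = \{* * * \bar{x}_0\}$.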
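The partition and the weight computation can be sketched as below. The sketch rests on assumptions that the text does not state explicitly: the sets are derived from the perfect shuffle structure (an input can first share a switch with X at stage i if it agrees with X in the lowest n-i address bits and differs in bit n-i), and a packet in the stage-i set is counted as contending when the first i destination bits of the two packets are equal. The function names are hypothetical, and the destination of input 4 in the sample traffic is assumed to be 2.

```python
def partition_sets(x, n):
    """Return [R_1, ..., R_n]: inputs that can first meet input x at stage i."""
    sets = []
    for i in range(1, n + 1):
        low_bits = n - i                      # address bits that must match x
        low_mask = (1 << low_bits) - 1
        flipped = ((x >> low_bits) & 1) ^ 1   # bit n-i must differ from x
        base = (x & low_mask) | (flipped << low_bits)
        # The upper i-1 address bits are free.
        sets.append({base | (hi << (low_bits + 1)) for hi in range(1 << (i - 1))})
    return sets


def weight_and_contenders(x, traffic, n):
    """traffic maps active input -> destination; returns (W_X, CS) for input x."""
    dst_x, weight, contenders = traffic[x], 0, set()
    for i, r in enumerate(partition_sets(x, n), start=1):
        for y in r:
            # Assumed test: the first i destination bits of both packets agree.
            if y in traffic and (traffic[y] >> (n - i)) == (dst_x >> (n - i)):
                weight += 1
                contenders.add(y)
    return weight, contenders


TRAFFIC = {0: 0, 2: 1, 4: 2, 8: 7, 15: 0}  # input 4 -> 2 is an assumption
print([sorted(r) for r in partition_sets(0, 4)])
print(weight_and_contenders(0, TRAFFIC, 4))
```

For input 0 the four sets have sizes 1, 2, 4 and 8, which together cover the other 15 inputs, and the weight is 4 with contending set {2, 4, 8, 15}.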

PERFORMANCE RESULTS
A simulation program has been implemented to measure the performance of the introduced algorithm. The simulation program has been executed for 10 arrival rates: 0.1, 0.2, 0.3, ..., 0.9, 1.0. For each arrival rate, the algorithm has been executed for 2,500 random input connections. Furthermore, this process has been repeated 10 times to compute the confidence interval.
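The experimental procedure can be sketched as follows. This is a minimal sketch under several assumptions that the paper does not confirm: each input carries a packet with probability equal to the arrival rate, destinations are uniformly random, the network is $16 \times 16$, conflicts are resolved by random selection, and the interval is a 95% Student-t interval (2.262 for 9 degrees of freedom). The names `drop_fraction` and `simulate` are hypothetical.

```python
import math
import random
import statistics


def rotl(x, n):
    mask = (1 << n) - 1
    return ((x << 1) & mask) | (x >> (n - 1))


def drop_fraction(n, rate, rng):
    """Route one random pattern with random selection; return dropped/offered."""
    size = 1 << n
    # link -> destination for every input that receives a packet this cycle
    live = {i: rng.randrange(size) for i in range(size) if rng.random() < rate}
    offered = len(live)
    if offered == 0:
        return 0.0
    for stage in range(1, n + 1):
        nxt = {}
        for link, dst in live.items():
            out = (rotl(link, n) & ~1) | ((dst >> (n - stage)) & 1)
            # A free output is taken; on a conflict a fair coin picks the survivor.
            if out not in nxt or rng.random() < 0.5:
                nxt[out] = dst
        live = nxt
    return 1 - len(live) / offered


def simulate(n, rate, patterns=2500, replications=10, seed=0, t_value=2.262):
    """Mean dropping probability and confidence half-width over the replications.

    t_value=2.262 is only valid for 10 replications at the assumed 95% level.
    """
    rng = random.Random(seed)
    means = [statistics.mean([drop_fraction(n, rate, rng) for _ in range(patterns)])
             for _ in range(replications)]
    half_width = t_value * statistics.stdev(means) / math.sqrt(replications)
    return statistics.mean(means), half_width


mean, half = simulate(4, 1.0)
print(f"dropping probability = {mean:.3f} +/- {half:.3f}")
```

Calling `simulate` once per arrival rate from 0.1 to 1.0 would give one point, with its interval, for a random-selection curve.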
In the figures, there are four curves, each corresponding to a blocking selection strategy. The four selection strategies are:
- Random Select: when there is an internal blocking, select one packet randomly for dropping.
- Fixed Select: when there is an internal blocking, always select the upper (or always select the lower) packet for dropping.
- Best Select: when there is an internal blocking, use the algorithm to drop the packet that will cause more blocking in the next stages. (This is expected to result in a minimum dropping probability.)
- Worst Select: when there is an internal blocking, use the algorithm to drop the packet that will cause less blocking in the next stages. (This is expected to result in a maximum dropping probability.)
The results indicate that the improvement in dropping probability obtained by using the algorithm grows as the network size increases. The reason is that a larger network has more stages, and thus the look-ahead algorithm (which looks at the next stages) becomes more effective.

CONCLUSION
To apply the algorithm, we use the concept of a centralized controller, in which a central controller is connected to the Omega network switches. The central controller detects any internal blocking in a path between any source and any destination. If internal blocking is detected, the decision of which packet to block and which one to route affects the blocking probability and thus the network performance. Results obtained from simulation indicate that the look-ahead packet blocking algorithm introduced in this paper decreases the blocking probability. Hence, fewer buffers are needed to store the blocked packets.

Figure 2.1 Example illustrating the effect of the choice of the packet to block in case of internal blocking

Figure 2.2 Example illustrating the algorithm

for all elements Y with destination Z in set R_{X,i}
    if ( ... )
        W_X = W_X + 1;    ** Contention occurs, so increment packet weight **
        CS = CS + {Y}     ** Add Y to the contending list **

packets which may contend with packet X at stage 2, they come

Table 2.1 Example showing the effect of the blocking choice on performance

Table 2.2 The values computed by the algorithm for the example shown in Fig 2.2, before the cycle starts

Table 2.3 The values computed by the algorithm for the example shown in Fig 2.2, after stage 1

Table 2.4 The values computed by the algorithm for the example shown in Fig 2.2, after stage 3