Optimizing Crosstalk in Optical NoC through Heuristic Fusion Mapping

Abstract: Optical network-on-chip is considered a promising technology for solving the low bandwidth and high latency of traditional interconnection networks. However, because leakage in optical devices is unavoidable, optical signals pick up crosstalk noise during transmission. In this paper, a heuristic fusion mapping algorithm, PSO_SA, is proposed for crosstalk optimization. First, an initial optimal mapping is obtained by particle swarm optimization; the mapping scheme is then kept from settling in a local optimum by combining it with the simulated annealing algorithm. The experimental results show that PSO_SA achieves lower crosstalk than the GA algorithm on 263 dec, Wavelet, DVOPD and other applications, with a maximum crosstalk reduction of 28.7%.

L) described by the topology, routing algorithm and optical router mechanism, which represents the connection mode between the topological nodes, where t_i ∈ T represents the topological nodes in the optical network-on-chip, and l_{i,j} ∈ L is the physical link connecting the topological nodes t_i and t_j. The mapping from the communication graph CG = G(C, E) to the topology graph X(T, L) can be defined by a one-to-one mapping function on the premise of |C| ≤ |T|:

$$\Omega: C \rightarrow T, \quad \text{s.t. } \Omega(c_i) = t_i, \; \forall c_i \in C, \; \exists t_i \in T \qquad (1)$$

This definition means that each task in an application is mapped to a topology node, and each topology node corresponds to at most one task. Figure 1 shows the mapping of an application with eight tasks onto a mesh topology.
Because the communication relationship between the tasks of an application is fixed, the relative positions of the tasks are determined once the mapping algorithm places them on the on-chip network. Crosstalk in the on-chip network arises largely where communications intersect, and the relative positions of the tasks strongly affect these intersections. Therefore, the crosstalk in the on-chip network can be reduced by optimizing the mapping algorithm. When a mapping algorithm is used to optimize crosstalk in the optical network, its evaluation value needs to correspond to the crosstalk performance of the network, so that improving the evaluation value serves as the criterion for finding the optimal mapping.
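As a minimal sketch of the constraint in Equation (1), the following Python snippet checks that a candidate mapping is one-to-one and fits the topology. The dictionary representation, the 1-based node numbering and the name `is_valid_mapping` are illustrative assumptions rather than part of the original formulation.

```python
def is_valid_mapping(mapping, num_nodes):
    """Check the Equation (1) constraint for a task -> node dictionary.

    mapping   : dict from task identifier to topology node (assumed 1-based)
    num_nodes : |T|, the number of topology nodes
    """
    nodes = list(mapping.values())
    fits = len(mapping) <= num_nodes                 # premise |C| <= |T|
    in_range = all(1 <= n <= num_nodes for n in nodes)
    injective = len(set(nodes)) == len(nodes)        # no node hosts two tasks
    return fits and in_range and injective


# Eight tasks on a 3 x 3 mesh (nine nodes): one node stays idle.
example = {f"c{i}": i for i in range(1, 9)}
print(is_valid_mapping(example, 9))
```

A mapping that places two tasks on the same node, or uses a node number outside the topology, is rejected.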
In this paper, we design a heuristic fusion mapping algorithm, PSO_SA, to optimize crosstalk. The optimal mapping scheme is determined in two steps. First, an initial optimal mapping is obtained by the particle swarm optimization algorithm; then the simulated annealing algorithm is applied so that the mapping scheme does not remain stuck in a local optimum.

Design of Heuristic Fusion Mapping Algorithm for Crosstalk Optimization
The general idea of our mapping algorithm is to integrate particle swarm optimization (PSO) and the simulated annealing algorithm (SA) [26][27][28], combining the expansibility of PSO with the ability of SA to avoid local optima. The purpose of the mapping algorithm is to map the application tasks to the topology nodes. The relationship between the tasks is represented by the communication graph, and the relationship between the topology nodes is represented by the topology graph [29]. Our mapping algorithm mainly targets two-dimensional topological networks. Firstly, the two-dimensional M × N topological network is transformed into a straight-line arrangement. The topological nodes are numbered 1, 2, 3, …, M × N − 1, M × N, as shown in Figure 2. The final mapping scheme places each task in one grid cell of this arrangement.
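The straight-line numbering can be illustrated with a short Python sketch. It assumes row-major numbering with `cols` nodes per row; the exact ordering used in Figure 2 is not recoverable from the text, and the function names are hypothetical.

```python
def node_to_coord(node, cols):
    """Convert a 1-based line index into a 0-based (row, col) grid coordinate."""
    return divmod(node - 1, cols)


def coord_to_node(row, col, cols):
    """Convert a 0-based (row, col) grid coordinate into a 1-based line index."""
    return row * cols + col + 1


# A 3 x 3 network becomes the line 1, 2, ..., 9.
print([node_to_coord(n, 3) for n in range(1, 10)])
```

Under this assumption the two functions are inverses of each other, so a mapping found on the line can always be placed back onto the grid.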
Electronics 2020, 9, x FOR PEER REVIEW
The mapping algorithm needs to constantly adjust the task positions in the topology, through which the crosstalk of the optical network-on-chip is gradually reduced. Therefore, the method used to adjust the task location is very important in the mapping algorithm. The mapping algorithm we designed includes two parts: one uses particle swarm optimization, the other uses simulated annealing.
In particle swarm optimization, each dimension of particle velocity and position corresponds to a task in an application. The update mode of particle speed and position is shown in Equations (2) and (3).
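Equations (2) and (3) are not reproduced in this text. Assuming they take the canonical PSO form, which matches the symbols defined in the sentence that follows, they would read as below; the iteration superscript k is added here only for clarity and is an assumption.

```latex
\begin{align}
v_i^{k+1} &= \omega\, v_i^{k} + c_1 r_1 \left(pb_i - x_i^{k}\right) + c_2 r_2 \left(gb - x_i^{k}\right) \tag{2} \\
x_i^{k+1} &= x_i^{k} + v_i^{k+1} \tag{3}
\end{align}
```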
where v_i and x_i are the velocity and position of a particle, respectively; ω is the inertia factor; c_1 and c_2 are the individual and swarm learning factors, respectively, both equal to 1; r_1 and r_2 are random numbers between 0 and 1; pb_i is the current optimal solution of a particle; and gb is the global optimal solution of the particle swarm. ω is adjusted in accordance with Equation (4).
where run is the current number of iterations; run_max is the total number of iterations; and s is set to 15 in our design. Each dimension of particle speed and position corresponds to a task in the application. When an application mapping is completed, each task corresponds to a labeled topology node. When the speed in the dimension corresponding to a task is greater than 0, the mapping position of the task moves toward the right end of the line topology. Otherwise, it moves toward the left end. In both cases the distance moved is the absolute value of the speed. Two points need attention in the actual implementation: (1) when positions coincide, the task keeps moving right or left, according to the sign of its speed, until an idle topology node is found; (2) when the left or right boundary is crossed, the mapping position is found by taking the modulus. Figure 3 shows the movement adjustment of a nine-task application mapping. Suppose that in the initial mapping task 1 corresponds to topology node 1, task 2 to topology node 2, and so on up to task 9 and topology node 9, and that the speeds of task 1 to task 9 are +2, +1, −3, +5, −1, −3, +2, +1, +4, respectively.
It can be seen from the figure that the speed of task 1 is "+2", so it moves two grids to the right and reaches topology node 3. The speed of task 2 is "+1", so it moves one grid to the right, but that topology node is already occupied by task 1, so task 2 continues to move right until the empty topology node 4 is found. The speed of task 3 is "−3", so it moves three grids to the left, which crosses the boundary; taking the modulus places it at topology node 9. The rest of the tasks are moved according to the same rule.
After nine moves, they become the mapping situation corresponding to step 9 in the figure.
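The movement rule can be sketched in Python as follows. This is one possible reading, not the authors' implementation: it assumes that tasks are placed one after another into an initially empty line, that the wrap-around is a modulus over the node count, and that a task with zero speed searches to the left on a collision. The name `move_tasks` is hypothetical.

```python
def move_tasks(positions, speeds, num_nodes):
    """Shift each task along the line topology numbered 1..num_nodes.

    positions : current node of each task (index = task)
    speeds    : signed integer speed of each task
    Returns the new node of each task, with no node used twice.
    """
    if len(positions) > num_nodes:
        raise ValueError("more tasks than topology nodes")
    occupied = set()
    moved = []
    for pos, speed in zip(positions, speeds):
        step = 1 if speed > 0 else -1          # search direction on a collision
        # Rule (2): wrap an out-of-bounds target back into 1..num_nodes.
        target = (pos + speed - 1) % num_nodes + 1
        # Rule (1): keep moving in the same direction until a node is idle.
        while target in occupied:
            target = (target + step - 1) % num_nodes + 1
        occupied.add(target)
        moved.append(target)
    return moved


speeds = [+2, +1, -3, +5, -1, -3, +2, +1, +4]
print(move_tasks(list(range(1, 10)), speeds, 9))
```

For the first three tasks this reproduces the nodes described in the text (3, 4 and 9); the placement of the remaining tasks depends on the assumptions above and may differ from Figure 3.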
After the particle swarm optimization algorithm generates the optimal solution, simulated annealing is needed to prevent it from falling into a local minimum. In the implementation, two randomly chosen tasks in the mapping solution are exchanged to escape the local optimal solution, which is similar to gene mutation in the genetic algorithm. Figure 4 shows this process. Starting from the optimal particle mapping solution, task 5 and task 9 are exchanged, and the network crosstalk value is calculated for the adjusted mapping solution.
If the crosstalk is better than that of the original optimal particle mapping, the adjusted mapping is regarded as the optimal mapping. Otherwise, the adjusted mapping is considered a deteriorated solution and is accepted with a certain probability. In 1953, Metropolis proposed the Metropolis criterion, in which new states are accepted with a certain probability. The probability of modifying the optimal mapping scheme is shown in Equation (5).
where f(i) and f(j) are the objective function values corresponding to solutions i and j, and t is the temperature control parameter. The initial temperature t is 3000, and the temperature decreasing rate is 0.98. When solving for the maximum value, if the function value f(j) of the current solution j is greater than the function value f(i) of the previous solution i, the new solution j is accepted directly; otherwise, the new solution is accepted with a certain probability.
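The acceptance rule can be sketched as below. Equation (5) is not reproduced in this text, so the snippet assumes the standard Metropolis form, written for minimization because lower crosstalk is better (the sign is flipped relative to the maximization wording above). It also assumes the decreasing rate of 0.98 is applied multiplicatively. All function names are hypothetical.

```python
import math
import random


def metropolis_accept(current, candidate, temperature, rand=random.random):
    """Decide whether to keep a candidate crosstalk value (lower is better)."""
    if candidate < current:
        return True                       # improvement: always accept
    # Worse solution: accept with probability exp(-(candidate - current) / t).
    return rand() < math.exp(-(candidate - current) / temperature)


def swap_two_tasks(mapping, a, b):
    """Return a copy of the mapping with the nodes of tasks a and b exchanged."""
    swapped = list(mapping)
    swapped[a], swapped[b] = swapped[b], swapped[a]
    return swapped


def cool(temperature, rate=0.98):
    """Geometric cooling step (assumed form of the 0.98 decreasing rate)."""
    return temperature * rate


# Exchange task 5 and task 9 (0-based indices 4 and 8), as in Figure 4.
print(swap_two_tasks([1, 2, 3, 4, 5, 6, 7, 8, 9], 4, 8))
```

At the initial temperature of 3000 almost every worse candidate is accepted; as the temperature falls, the acceptance probability for the same deterioration shrinks.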

Realization of Heuristic Fusion Mapping Algorithm for Crosstalk Optimization
As a classical heuristic algorithm, particle swarm optimization essentially uses three pieces of information (the current solution, the individual optimal solution and the global optimal solution) to guide the particles as they are optimized and updated, and balancing individual experience against group experience is the key to its optimization. The particle swarm optimization (PSO) algorithm has the advantages of fast searching speed, high efficiency and convenient adjustment of related parameters, but it is prone to premature convergence, which results in a local optimal solution. The simulated annealing algorithm described above has a great advantage in avoiding local optimal solutions. It accepts a deteriorated solution with a certain probability, and this probability gradually decreases as the temperature parameter decreases. By admitting deteriorated solutions, the local optimum can be effectively avoided.
Since the simulated annealing algorithm can compensate for this weakness of the particle swarm optimization algorithm, the two algorithms can be combined in the design of a network-on-chip mapping algorithm. The operation flow of our heuristic fusion mapping algorithm (Algorithm 1) is shown in Figure 5 and can be summarized in nine steps.
Step 1: Generate a set of mapping schemes randomly.
Step 2: Initialize the minimum crosstalk and optimal mapping of each particle. Compare them to obtain the global and historical minimum crosstalk and optimal mapping. Mark the optimal particle.
Step 3: Choose the optimization method according to whether the current particle is the optimal particle. If it is the optimal particle, skip to step 6; otherwise go to step 4.
Step 4: Update the particle's speed and position according to the rules above.
Step 5: Calculate the crosstalk noise of the particle, update its minimum crosstalk noise and record the corresponding mapping mode, then skip to step 8.
Step 6: Randomly exchange the mapping positions of two tasks in the optimal particle. If the resulting crosstalk is less than the minimum crosstalk of the particle, update the minimum crosstalk and mapping of the particle; otherwise update them with a certain probability.
Step 7: Determine whether all particles in the particle swarm have been updated. If yes, perform step 9. Otherwise, go to step 3.
Step 8: Update the global minimum crosstalk noise and mapping mode, and relabel the optimal particle. If the global minimum crosstalk is less than the historical minimum crosstalk, update the historical minimum crosstalk and historical optimal mapping mode.
Step 9: If the iterations have finished, return the optimal solution. Otherwise, update the inertia factor ω of PSO and the temperature parameter t of SA according to the rules above and return to step 3.
Figure 5. Crosstalk optimization process.
Below is the pseudocode:
Input: Iteration number: I_T; Population size: N_P
Parameter: Inertia factor: ω; Learning factor: C_1, C_2; Random value: R_1, R_2; Temperature control value: T; Individual minimum crosstalk: C_P; Global minimum crosstalk: C_G; Current iteration number: I_C; Number of traversed particles: N_C; Individual optimal mapping: M_P; Global optimal mapping: M_G; Current particle label: L_C; Optimal particle label: L_O
Output: Historical minimum crosstalk: C_H; Historical optimal mapping: M_H

Procedure
    Generate mapping solutions of initial particle swarm randomly;
    Calculate the crosstalk corresponding to each particle mapping;
    Record the initial minimum crosstalk C_P and mapping mode M_P;
    Obtain global minimum crosstalk C_G and global optimal mapping M_G by comparison and label the optimal particle L_O;
    Initialize historical minimum crosstalk C_H and historical optimal mapping M_H
    while (I_C < I_T)
        while (N_C < N_P)
            if (L_C == L_O)
                Exchange two task mapping positions randomly and calculate the crosstalk;
                if (crosstalk < C_P)
                    Update C_P and M_P;
                else
                    Update C_P and M_P according to the Metropolis criterion
                endif
            else
                Update speed and position of each particle;
                Calculate the crosstalk of particle and update C_P, M_P
            endif
        end
        Update C_G and M_G by comparison;
        Update C_H and M_H;
        Change ω and T
    end
    return M_H
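The procedure above can be turned into a small runnable Python sketch. Several parts are assumptions made only so the example executes: the cost function `toy_cost` is a stand-in (weighted line distance), not the crosstalk model evaluated by the simulator; the inertia factor uses a linear decay because Equation (4) is not reproduced in this text; speeds are rounded to integers before moving tasks; and the Metropolis rule is the standard minimization form.

```python
import math
import random


def move_tasks(positions, speeds, num_nodes):
    """Shift tasks along the 1..num_nodes line, wrapping and skipping used nodes."""
    occupied, moved = set(), []
    for pos, speed in zip(positions, speeds):
        step = 1 if speed > 0 else -1
        target = (pos + speed - 1) % num_nodes + 1
        while target in occupied:
            target = (target + step - 1) % num_nodes + 1
        occupied.add(target)
        moved.append(target)
    return moved


def pso_sa(cost, num_tasks, num_nodes, iterations=40, swarm=8, seed=0):
    """Return (M_H, C_H); mapping[i] is the node assigned to task i."""
    rng = random.Random(seed)
    x = [rng.sample(range(1, num_nodes + 1), num_tasks) for _ in range(swarm)]
    v = [[0.0] * num_tasks for _ in range(swarm)]
    pb = [p[:] for p in x]                                  # M_P per particle
    pb_cost = [cost(p) for p in x]                          # C_P per particle
    lead = min(range(swarm), key=pb_cost.__getitem__)       # L_O
    hist_map, hist_cost = pb[lead][:], pb_cost[lead]        # M_H, C_H
    temp = 3000.0                                           # initial temperature
    for it in range(iterations):
        omega = 0.9 - 0.5 * it / iterations  # ASSUMED decay, stands in for Eq. (4)
        gb = pb[lead][:]                                    # M_G
        for k in range(swarm):
            if k == lead:
                # SA branch: swap two tasks of the optimal particle.
                cand = pb[k][:]
                a, b = rng.sample(range(num_tasks), 2)
                cand[a], cand[b] = cand[b], cand[a]
                c = cost(cand)
                if c < pb_cost[k] or rng.random() < math.exp((pb_cost[k] - c) / temp):
                    pb[k], pb_cost[k], x[k] = cand, c, cand[:]
            else:
                # PSO branch: canonical velocity update with c1 = c2 = 1.
                for d in range(num_tasks):
                    r1, r2 = rng.random(), rng.random()
                    v[k][d] = (omega * v[k][d] + r1 * (pb[k][d] - x[k][d])
                               + r2 * (gb[d] - x[k][d]))
                x[k] = move_tasks(x[k], [round(s) for s in v[k]], num_nodes)
                c = cost(x[k])
                if c < pb_cost[k]:
                    pb[k], pb_cost[k] = x[k][:], c
        lead = min(range(swarm), key=pb_cost.__getitem__)   # relabel L_O
        if pb_cost[lead] < hist_cost:
            hist_map, hist_cost = pb[lead][:], pb_cost[lead]
        temp *= 0.98                                        # cooling
    return hist_map, hist_cost


# Hypothetical communication edges (task_a, task_b, volume); not from the paper.
EDGES = [(0, 1, 5), (1, 2, 3), (2, 3, 4), (0, 3, 1)]


def toy_cost(mapping):
    """Stand-in objective: communication volume times distance on the line."""
    return sum(w * abs(mapping[a] - mapping[b]) for a, b, w in EDGES)


best_map, best_cost = pso_sa(toy_cost, num_tasks=4, num_nodes=9)
print(best_map, best_cost)
```

Because the historical best is stored separately from the per-particle records, a deteriorated solution accepted in the simulated annealing branch never overwrites the best mapping returned at the end.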

Simulation Results
We used the simulation software PhoNocMap [30] to build the network crosstalk simulation platform. Across different applications [29] and topologies, the designed mapping algorithm was compared with the particle swarm optimization mapping algorithm and the genetic mapping algorithm. The Optimized Unfolded Torus topology used in the experiment was introduced in [31]. The parameter configuration of the simulation environment is shown in Table 1. Several multimedia applications are used in the experiment, and their structure is shown in Figure 6.
The applications used in the experiment and their topological scales follow [29,32]. The topology scale of PIP is 3 × 3; 263 dec, 263 enc, MPEG4, MWD and VOPD are 4 × 4; Wavelet is 5 × 5; and DVOPD is 6 × 6. The experimental results are shown in Figure 7.
Figure 7a shows the crosstalk comparison of the mapping algorithms under different topologies for the 263 dec application. Across the topologies, the crosstalk optimization performance of the PSO and GA algorithms is not stable: in the mesh and optimized unfolded torus structures PSO performs better than GA, while in the folded torus and unfolded torus structures PSO performs worse than GA.
However, the crosstalk optimization performance of the PSO_SA algorithm is stable and better than that of both the PSO algorithm and the GA algorithm. Compared with the PSO and GA algorithms, the crosstalk of PSO_SA is 7.2%, 1.5%, 0.4% and 2.3% lower on the mesh, folded torus, unfolded torus and optimized unfolded torus, respectively.
Figure 7b shows the crosstalk comparison of the mapping algorithms under different applications on the mesh topology. Across the applications, the overall crosstalk optimization performance of the PSO algorithm is the worst. The main reason is that the PSO algorithm easily falls into a local minimum, whereas the GA algorithm has no obvious local minimum problem, so the overall crosstalk optimization of GA is better than that of PSO. Building on the PSO algorithm, PSO_SA mitigates its local minimum problem and outperforms both the PSO algorithm and the GA algorithm in crosstalk optimization. Compared with the GA algorithm, and apart from the MPEG4, PIP and MWD applications, the crosstalk for 263 dec, 263 enc, VOPD, Wavelet and DVOPD is reduced by 7.3%, 4.8%, 3.0%, 10.6% and 28.7%, respectively, so PSO_SA is better than the PSO algorithm and the GA algorithm at optimizing crosstalk across different applications.

Conclusions
In this paper, a heuristic fusion mapping algorithm for crosstalk optimization is proposed. It combines particle swarm optimization (PSO) and simulated annealing (SA) so that the search is less likely to become trapped in a local minimum, thereby reducing the crosstalk of the optical network-on-chip. Experimental results show that the crosstalk optimization performance of the PSO_SA mapping algorithm is better than that of the PSO mapping algorithm and the GA mapping algorithm, in both the topology comparison and the application comparison. These results support the effectiveness of the proposed mapping algorithm.