Matrix-Based ACO for Solving Parametric Problems Using Heterogeneous Reconfigurable Computers and SIMD Accelerators
Abstract
1. Introduction
2. Literature Review
2.1. Effective Modifications of Ant Colony Optimization
2.2. Parallel Modifications of ACO, Running on Central Processing Unit (CPU) and Graphics Processing Unit (GPU), and Using Open Multi-Processing (OpenMP)
2.3. Modifications of the Ant Colony for Parametric Optimization
2.4. Review of Metaheuristic Algorithms Applied to Optimization Problems
2.5. Features of the Current State of ACO
3. Materials and Methods
3.1. Statement of Parametric Problem in Matrix Form
3.2. Ant Colony Optimization
3.3. Modification of ACO in Matrix Formulation
3.3.1. New Probability Formula Without Taking into Account the Heuristic Parameter
3.3.2. Matrix Formalization of the Method
3.3.3. Modifications of the Method for the Parametric Optimization of a System with Negative Values of the Objective Function
3.4. Modification of the Ant Method Using a Hash Table
- ACOCN (ACO Cluster New): This is classic ACO that uses a hash table and obtains the values of the objective function without accessing the computing cluster. If , then .
- ACOCNI (ACO Cluster New Ignor): If the ant agent has found the already-considered solution, then this ant agent does not change the state of the dynamic layers described by matrices and , i.e., it is ignored. If , then .
- ACOCCyN (ACO Cluster Cycle N): If the ant agent has found an already-considered solution, then it performs a further cyclic search for a new solution. The cycle is limited to N iterations; if a new solution is not found, then the solution is ignored.
- ACOCCyI (ACO Cluster Cycle Infinity): If the ant agent has found a solution that has already been considered, then it performs a further cyclic search until a new solution is found.
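The four hash-table variants listed above differ only in what happens when an ant agent reproduces a solution that is already stored. The following is a minimal sketch of that logic, not the authors' implementation: the function names, the use of a Python dict as the hash table, and the `max_retries` parameter are illustrative assumptions.

```python
def evaluate_with_cache(solution, objective, cache):
    """Return (value, is_new); the objective is only called for unseen solutions."""
    key = tuple(solution)
    if key in cache:
        # ACOCN-style lookup: reuse the stored value, no expensive evaluation.
        return cache[key], False
    value = objective(solution)
    cache[key] = value
    return value, True


def search_new_solution(build_solution, objective, cache, max_retries):
    """Repeat the path search while the ant keeps finding known solutions.

    max_retries = N  -> ACOCCyN-like behaviour (give up after N extra searches).
    max_retries = None -> ACOCCyI-like behaviour (loop until a new solution
    appears; this never terminates if the search space is exhausted).
    Returns (solution, value), or None when the ant is ignored.
    """
    retries = 0
    while True:
        solution = build_solution()
        value, is_new = evaluate_with_cache(solution, objective, cache)
        if is_new:
            return solution, value
        if max_retries is not None and retries >= max_retries:
            return None  # ACOCNI-like outcome: the ant leaves no trace
        retries += 1
```

With `max_retries=0` the function behaves like ACOCNI (a repeated solution is ignored immediately). An ACOCN-like variant would instead call `evaluate_with_cache` once and use the returned value regardless of `is_new`.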
3.5. Matrix Modification of the Ant Colony Method for Running on SIMD
- Calculation of matrix : The following are performed sequentially: the calculation of or , to obtain the values of the normalized matrix or , respectively; calculation of matrix and then vector ; and calculation of the transition probability matrix and the distribution function matrix . Since at the first stage all matrices have the dimension and vectors have the dimension , then all actions of the algorithm are performed in parallel on threads, where each thread performs operations for one parameter. When optimizing the algorithm, one can refuse to calculate matrix and calculate vector simultaneously with matrix .
- Calculation of the ant agent decisions and : The following are performed sequentially: the generation of matrix and calculation of position ; and the determination of and the calculation of vectors and , necessary for working with the hash table. One of the modifications of ACOCN, ACOCNI, ACOCCyN, and ACOCCyI is performed, based on the results on which vector is determined. This stage can be performed in threads on SIMD and MIMD computers. In this case, the operations necessary for calculating matrix , within each thread can be performed on additional threads for each thread of the ant agent.
- Calculation of the new values of matrices and : First, all values of matrix are reduced—evaporation ,—and then, the values from ant agents are added: . The values of matrix are changed in a similar way: . This algorithm can be executed on parallel threads, each of which calculates the values of a separate parameter.
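The three stages above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the method's actual formulas: probabilities are taken as simply proportional to pheromone (the paper's new probability formula is not reproduced here), evaporation uses the common factor `(1 - rho)`, and the names `stage1_cdf`, `stage2_paths`, `stage3_update`, `rho`, and `deposits` are hypothetical. Each loop over layers or ants corresponds to work that could be assigned to one thread.

```python
import random
from itertools import accumulate


def stage1_cdf(pheromone):
    """Stage 1: per layer, normalize pheromone and build the distribution function."""
    cdfs = []
    for layer in pheromone:            # one layer (parameter) per thread
        total = sum(layer)
        probs = [v / total for v in layer]
        cdfs.append(list(accumulate(probs)))
    return cdfs


def stage2_paths(cdfs, n_ants, rng):
    """Stage 2: each ant picks one vertex per layer by inverting the CDF."""
    paths = []
    for _ in range(n_ants):            # one ant agent per thread
        path = []
        for cdf in cdfs:
            r = rng.random()
            # First vertex whose cumulative probability exceeds r; fall back
            # to the last vertex if rounding leaves the final CDF value below r.
            idx = next((i for i, c in enumerate(cdf) if r < c), len(cdf) - 1)
            path.append(idx)
        paths.append(path)
    return paths


def stage3_update(pheromone, paths, deposits, rho):
    """Stage 3: evaporate every value, then add each ant's deposit on its path."""
    for layer in pheromone:
        for j in range(len(layer)):
            layer[j] *= (1.0 - rho)
    for path, amount in zip(paths, deposits):
        for layer_idx, vertex in enumerate(path):
            pheromone[layer_idx][vertex] += amount
```

Stages 1 and 3 touch every matrix element in the same way, which is why they map naturally onto SIMD hardware. Stage 2 is where per-ant control flow appears.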
3.6. Graph Structure for Parametric Optimization
- Standard: The standard configuration for and uses 201 vertices. See Graph A in Figure 3.
- Separation of the real part: Each layer is divided into the integer part in the interval [–10, 10] with a step of 1 (21 vertices) and the fractional part in the interval [0, 0.9] with a step of 0.1 (10 vertices). . In total, there will be 2 layers for each parameter in the parametric graph. The total number of solutions will increase to 44,100 due to the appearance of several zeros. See Graph B in Figure 3.
- Selection of the negative part: Each layer is divided into the sign (2 vertices) and the positive part in the interval [0, 10] with a step of 0.1 (101 vertices). . In total, the parametric graph has 2 layers for each parameter and 40,804 solutions. See Graph C in Figure 3.
- Separation of integer, real, and signed parts: Each layer is divided into the sign (2 vertices), the integer part in the interval [0, 10] with a step of 1 (11 vertices), and the fractional in the interval [0, 0.9] with a step of 0.1 (10 vertices). . In total, the parametric graph will have 3 layers for each parameter and 48,400 solutions. There are 7999 more solutions here than in the standard graph, since there are “extra” solutions, for example, . See Graph D in Figure 3. This graph is intuitive. When increasing the precision of parameter discretization, for example, to 0.01, it is necessary to simply add the corresponding layers for each parameter in the interval [0…0.09] with a step of 0.01 (10 vertices).
- In addition to selecting the integer, real, and sign parts, it is possible to decompose layers in the intervals [0, 10]. As a result, the parametric graph will have 4 layers for each parameter. ; is a sign layer (2 vertices), and corresponds to one vertex, , from Graph D: comprise even numbers from the interval [0, 8], and takes 2 values, 0 or 1. This graph is designated in Figure 3 as Graph E.
- Further decomposition of both the integer part and the real (fractional) part, giving up to 5 layers for each parameter. . See Graph F in Figure 3.
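The solution counts quoted for Graphs A to D can be checked with a few lines of arithmetic. The sketch below assumes two optimized parameters, which is what the quoted totals imply (for example, 210² = 44,100); the helper name `solutions_per_graph` and the dictionary layout are illustrative.

```python
from math import prod


def solutions_per_graph(layer_sizes, n_params=2):
    """Number of distinct paths: product of layer sizes, raised to the parameter count."""
    return prod(layer_sizes) ** n_params


# Vertices per layer for a single parameter, taken from the list above.
graphs = {
    "A": [201],          # one layer, [-10, 10] with step 0.1
    "B": [21, 10],       # integer part, fractional part
    "C": [2, 101],       # sign, magnitude with step 0.1
    "D": [2, 11, 10],    # sign, integer part, fractional part
}

counts = {name: solutions_per_graph(sizes) for name, sizes in graphs.items()}
extra_in_d = counts["D"] - counts["A"]
```

Here `counts` reproduces the totals 44,100, 40,804, and 48,400 for Graphs B, C, and D, and `extra_in_d` reproduces the 7999 additional solutions of Graph D relative to the standard graph.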
4. Results
4.1. Analysis of the Efficiency of Application of the Proposed Modifications of ACO
4.1.1. Investigation of a New Probability Formula and the Influence of Additional Terms
4.1.2. Analysis of Parametric Graph Decomposition
4.1.3. Analysis of Modifications of the Ant Colony Method Using a Hash Table
4.1.4. Comparison of Proposed Modifications of ACO with Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and Simulated Annealing (SA)
- ACOCCyI—. With the use of elitism, the number of elite ant agents is two times greater than the number of agents per iteration. Since the modifications under consideration require the discretization of parameters, the values of and were determined with an accuracy of , presented as Graph F with a total number of possible solutions of 1.96 × ;
- GA—An algorithm with Linear Crossover is used, in which Random Alpha is additionally determined; Rank Selection is carried out for a group of individuals, and Random Selection is carried out for each individual from the selected group, mutation with adaptive change, and the use of elitism. .
- PSO—.
- SA—.
4.2. Analysis of the Parallel Method of Ant Colonies
4.3. Analysis of the Optimal Structure of a Heterogeneous Computer Based on SIMD and MIMD Components
- Start of the algorithm and creation of the necessary data structures (Stage Start): Since this stage is performed as a single instance, it can be executed only on SISD systems, and it is accelerated using an FPGA. Its execution time is constant and unavoidable, so it can be neglected when analyzing the efficiency of the system and constructing a heterogeneous computer.
- Iteration overhead (Stage Delt): This overhead is associated with counter incrementing, context switching, calculating the timing of nested stages, etc.
- The first stage (stage 1), associated with matrix transformations to obtain matrix , can be performed on SIMD computers.
- The second stage (stage 2), associated with the search for paths by ant agents, can be performed on SIMD computers in the absence of interaction with the hash table. When the stage interacts with the hash table, the results of operations, the duration of individual transformations, and the subsequent behavior of the algorithm are not known in advance and depend on run-time results. This stage is therefore best implemented on an MIMD component, or on an SIMD accelerator and an MIMD component together.
- The third stage (stage 3), associated with updating matrices and , consists only of matrix transformations and can be performed on an SIMD accelerator.
4.4. Analysis of the Optimal Structure of a Heterogeneous Computer Taking into Account the Reconfiguration Mechanism
- A homogeneous MIMD structure of general-purpose cores without SIMD accelerators;
- A hybrid structure containing MIMD cores and SIMD accelerators, in which cores interact with one accelerator.
4.5. Analysis of the Efficiency of Matrix Modification of the Algorithm on a GPU Using CUDA Technology for the Case of Repeated Searches for Solutions by the ACOCCyN Algorithm
4.6. Application of Modifications of ACO in Searching for Optimal Values of the SARIMA Model Parameters
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
Configuration | Measure (by total number of layers/threads n) | n = 42 | n = 84 | n = 168 | n = 336 | n = 672 | n = 1344 |
---|---|---|---|---|---|---|---|
Time CUDA stage 1, 2, 3 | Stage 3 | 398.46 | 409.29 | 419.56 | 440.14 | | |
Time CUDA stage 1, 2, 3 | Stage 2 | 1580.51 | 1801.93 | 3275.77 | 10,297.52 | | |
Time CUDA stage 1, 2, 3 | Stage 1 | 24.22 | 26.52 | 25.28 | 30.20 | | |
Time CUDA stage 1, 2, 3 | Loop While | 13.09 | 17.80 | 12.25 | 23.79 | | |
Time CUDA stage 1, 2, 3 | Load Data | 7.63 | 8.09 | 8.35 | 10.13 | | |
Time CUDA stage 1, 2, 3 | Total Time | 2023.91 | 2263.64 | 3741.21 | 10,801.78 | | |
Time CUDA stage 1, 2, 3 | Min | 0.00123 | 0.001247 | 0.011536 | 0.054558 | | |
Time CUDA stage 1, 2, 3 | Max | 0.99983 | 0.995116 | 0.983309 | 0.944857 | | |
Time CUDA stage 1, 2, 3 | Hash Iter. | 14,009.00 | 12,243.70 | 4753.13 | 3220.73 | | |
Time CUDA stage 1, 2, 3, non-hash | Stage 3 | 397.80 | 413.08 | 427.18 | 445.97 | 465.34 | |
Time CUDA stage 1, 2, 3, non-hash | Stage 2 | 41.23 | 65.29 | 227.94 | 719.26 | 2458.40 | |
Time CUDA stage 1, 2, 3, non-hash | Stage 1 | 23.35 | 24.23 | 25.15 | 29.70 | 36.82 | |
Time CUDA stage 1, 2, 3, non-hash | Loop While | 10.83 | 11.07 | 12.08 | 15.12 | 32.03 | |
Time CUDA stage 1, 2, 3, non-hash | Load Data | 1.38 | 1.76 | 2.79 | 4.57 | 8.06 | |
Time CUDA stage 1, 2, 3, non-hash | Total Time | 474.59 | 515.43 | 695.14 | 1214.63 | 3000.64 | |
Time CUDA stage 1, 2, 3, non-hash | Min | 0.00123 | 0.001253 | 0.014033 | 0.055178 | 0.150092 | |
Time CUDA stage 1, 2, 3, non-hash | Max | 0.999776 | 0.995115 | 0.984992 | 0.940983 | 0.852482 | |
Time CUDA stage 1, 2, 3, non-hash | Hash Iter. | 0 | 0 | 0 | 0 | 0 | |
Time CUDA stage 1, 2, 3, ant | Stage 3 | 375.14 | 384.51 | 383.95 | 393.59 | 426.37 | 0.82 |
Time CUDA stage 1, 2, 3, ant | Stage 2 | 1498.09 | 2329.93 | 3337.77 | 4705.20 | 11,795.84 | 334,390.93 |
Time CUDA stage 1, 2, 3, ant | Stage 1 | 22.10 | 23.48 | 23.76 | 26.33 | 37.24 | 6.36 |
Time CUDA stage 1, 2, 3, ant | Loop While | 9.52 | 11.20 | 10.71 | 11.85 | 31.71 | 43.96 |
Time CUDA stage 1, 2, 3, ant | Load Data | 7.56 | 7.21 | 8.21 | 10.15 | 13.69 | 21.61 |
Time CUDA stage 1, 2, 3, ant | Total Time | 1912.40 | 2756.34 | 3764.40 | 5147.12 | 12,304.85 | 334,463.68 |
Time CUDA stage 1, 2, 3, ant | Min | 0.00123 | 0.001247 | 0.014066 | 0.063027 | 0.169566 | 0.023369 |
Time CUDA stage 1, 2, 3, ant | Max | 0.999805 | 0.995115 | 0.981709 | 0.933226 | 0.838947 | 1 |
Time CUDA stage 1, 2, 3, ant | Hash Iter. | 270.10 | 100.90 | 29.87 | 9.50 | 7.17 | 0.00 |
Time CUDA stage optim 1, 2 | Stage 3 | 0 | 0 | 0 | 0 | | |
Time CUDA stage optim 1, 2 | Stage 2 | 1637.56 | 1666.17 | 3563.10 | 9312.10 | | |
Time CUDA stage optim 1, 2 | Stage 1 | 420.02 | 434.59 | 448.20 | 470.10 | | |
Time CUDA stage optim 1, 2 | Loop While | 11.46 | 12.03 | 18.84 | 26.33 | | |
Time CUDA stage optim 1, 2 | Load Data | 7.50 | 7.13 | 8.40 | 10.40 | | |
Time CUDA stage optim 1, 2 | Total Time | 2076.53 | 2119.92 | 4038.53 | 9818.92 | | |
Time CUDA stage optim 1, 2 | Min | 0.001858 | 0.010975 | 0.042044 | 0.105891 | | |
Time CUDA stage optim 1, 2 | Max | 0.995116 | 0.982994 | 0.959404 | 0.900824 | | |
Time CUDA stage optim 1, 2 | Hash Iter. | 13,792.53 | 10,252.70 | 5534.53 | 2430.67 | | |
Time CUDA stage optim 1, 2 non-hash | Stage 3 | 0 | 0 | 0 | 0 | 0 | |
Time CUDA stage optim 1, 2 non-hash | Stage 2 | 41.57 | 65.52 | 228.89 | 719.10 | 2493.38 | |
Time CUDA stage optim 1, 2 non-hash | Stage 1 | 422.79 | 439.42 | 452.04 | 471.56 | 506.04 | |
Time CUDA stage optim 1, 2 non-hash | Loop While | 9.54 | 12.20 | 11.93 | 14.63 | 27.99 | |
Time CUDA stage optim 1, 2 non-hash | Load Data | 1.42 | 1.87 | 3.04 | 4.92 | 8.63 | |
Time CUDA stage optim 1, 2 non-hash | Total Time | 475.32 | 519.00 | 695.90 | 1210.21 | 3036.04 | |
Time CUDA stage optim 1, 2 non-hash | Min | 0.001813 | 0.010972 | 0.049105 | 0.104103 | 0.206183 | |
Time CUDA stage optim 1, 2 non-hash | Max | 0.995115 | 0.981823 | 0.958959 | 0.899234 | 0.795144 | |
Time CUDA stage optim 1, 2 non-hash | Hash Iter. | 0 | 0 | 0 | 0 | 0 | |
Time CUDA stage optim 1, 2 ant | Stage 3 | 0 | 0 | 0 | 0 | 0 | 0 |
Time CUDA stage optim 1, 2 ant | Stage 2 | 1525.88 | 2377.59 | 3315.67 | 4916.60 | 11,105.51 | 351,091.54 |
Time CUDA stage optim 1, 2 ant | Stage 1 | 399.78 | 411.47 | 407.96 | 416.96 | 458.57 | 7.14 |
Time CUDA stage optim 1, 2 ant | Loop While | 12.82 | 16.22 | 11.16 | 12.94 | 25.36 | 47.00 |
Time CUDA stage optim 1, 2 ant | Load Data | 7.69 | 7.48 | 8.45 | 10.39 | 14.00 | 22.29 |
Time CUDA stage optim 1, 2 ant | Total Time | 1946.16 | 2812.76 | 3743.23 | 5356.89 | 11,603.44 | 351,167.96 |
Time CUDA stage optim 1, 2 ant | Min | 0.001454 | 0.010891 | 0.041275 | 0.114905 | 0.221748 | 0.023205 |
Time CUDA stage optim 1, 2 ant | Max | 0.995116 | 0.989848 | 0.958871 | 0.887639 | 0.777684 | 1 |
Time CUDA stage optim 1, 2 ant | Hash Iter. | 277.40 | 100.57 | 30.90 | 12.23 | 6.03 | 0.00 |
Time CUDA stage only 1 | Stage 3 | 0 | 0 | 0 | 0 | | |
Time CUDA stage only 1 | Stage 2 | 0 | 0 | 0 | 0 | | |
Time CUDA stage only 1 | Stage 1 | 5282.42 | 14,352.86 | 72,867.41 | 209,736.00 | | |
Time CUDA stage only 1 | Load Data | 6.96 | 7.25 | 8.38 | 10.68 | | |
Time CUDA stage only 1 | Total Time | 5289.38 | 14,360.11 | 72,875.80 | 209,746.68 | | |
Time CUDA stage only 1 | Min | 0.00123 | 0.00123 | 0.001309 | 0.001244 | | |
Time CUDA stage only 1 | Max | 1 | 1 | 1 | 1 | | |
Time CUDA stage only 1 | Hash Iter. | 116,389.73 | 171,360.63 | 231,497.03 | 181,570.00 | | |
Time CUDA stage only 1 non-hash | Stage 3 | 0 | 0 | 0 | 0 | | |
Time CUDA stage only 1 non-hash | Stage 2 | 0 | 0 | 0 | 0 | | |
Time CUDA stage only 1 non-hash | Stage 1 | 1171.90 | 2058.45 | 4302.78 | 8908.62 | | |
Time CUDA stage only 1 non-hash | Load Data | 1.51 | 1.94 | 2.97 | 4.94 | | |
Time CUDA stage only 1 non-hash | Total Time | 1173.41 | 2060.39 | 4305.75 | 8913.55 | | |
Time CUDA stage only 1 non-hash | Min | 0.00123 | 0.00123 | 0.00123 | 0.001243 | | |
Time CUDA stage only 1 non-hash | Max | 1 | 1 | 1 | 1 | | |
Time CUDA stage only 1 non-hash | Hash Iter. | 0 | 0 | 0 | 0 | | |
Time CUDA stage only 1 ant | Stage 3 | 0 | 0 | 0 | 0 | 0 | 0 |
Time CUDA stage only 1 ant | Stage 2 | 0 | 0 | 0 | 0 | 0 | 0 |
Time CUDA stage only 1 ant | Stage 1 | 17,589.19 | 35,339.20 | 70,837.76 | 141,883.18 | 259,748.36 | 588,855.30 |
Time CUDA stage only 1 ant | Load Data | 7.03 | 7.31 | 8.41 | 10.39 | 13.46 | 23.44 |
Time CUDA stage only 1 ant | Total Time | 17,596.22 | 35,346.51 | 70,846.18 | 141,893.57 | 259,761.82 | 588,878.74 |
Time CUDA stage only 1 ant | Min | 0.00123 | 0.00123 | 0.011087 | 0.059468 | 0.162979 | 0.284906 |
Time CUDA stage only 1 ant | Max | 0.999859 | 0.995116 | 0.990186 | 0.945066 | 0.835363 | 0.713577 |
Time CUDA stage only 1 ant | Hash Iter. | 0 | 0 | 0 | 0 | 0 | 0 |
Time CPU | Stage 3 | 32.97 | 68.99 | 140.77 | 301.70 | 741.37 | 1563.50 |
Time CPU | Stage 2 | 512.40 | 993.15 | 1907.94 | 3574.80 | 7126.65 | 13,870.08 |
Time CPU | Stage 1 | 0.78 | 1.62 | 2.92 | 5.59 | 11.22 | 22.85 |
Time CPU | Loop While | 0.08 | 0.14 | 0.18 | 0.28 | 0.49 | 0.86 |
Time CPU | Load Data | 46.67 | 47.57 | 46.81 | 46.63 | 51.41 | 58.94 |
Time CPU | Total Time | 592.90 | 1111.48 | 2098.61 | 3929.00 | 7931.13 | 15,516.24 |
Time CPU | Time Hash | 0.05 | 0.08 | 0.16 | 0.29 | 0.52 | 0.95 |
Time CPU | Min | 0.00123 | 0.001367 | 0.013709 | 0.065035 | 0.159804 | 0.280988 |
Time CPU | Max | 0.999653 | 0.995115 | 0.982099 | 0.935464 | 0.83754 | 0.722882 |
Time CPU | Hash Iter. | 0 | 0 | 0 | 0 | 0 | 0 |
Time CPU non-hash | Stage 3 | 33.10 | 68.53 | 138.66 | 314.52 | 736.33 | 1538.50 |
Time CPU non-hash | Stage 2 | 424.82 | 868.04 | 1680.91 | 3288.69 | 6557.77 | 12,862.94 |
Time CPU non-hash | Stage 1 | 0.80 | 1.52 | 2.86 | 5.77 | 11.44 | 22.96 |
Time CPU non-hash | Loop While | 0.10 | 0.15 | 0.19 | 0.36 | 0.61 | 0.94 |
Time CPU non-hash | Load Data | 0.53 | 0.78 | 1.41 | 3.31 | 6.67 | 13.03 |
Time CPU non-hash | Total Time | 459.34 | 939.03 | 1824.02 | 3612.65 | 7312.82 | 14,438.38 |
Time CPU non-hash | Min | 0.00123 | 0.001293 | 0.013281 | 0.059288 | 0.165373 | 0.282038 |
Time CPU non-hash | Max | 0.999562 | 0.995116 | 0.982377 | 0.936127 | 0.838962 | 0.720628 |
Time CPU non-hash | Hash Iter. | 0 | 0 | 0 | 0 | 0 | 0 |
Time classic ACO | Stage 3 | 143.98 | 231.15 | 552.89 | 866.84 | 1936.69 | 3792.35 |
Time classic ACO | Stage 2 | 6894.60 | 13,165.32 | 25,277.69 | 47,287.40 | 91,749.72 | 179,784.40 |
Time classic ACO | Stage 1 | 2.26 | 4.31 | 8.35 | 14.76 | 29.63 | 57.29 |
Time classic ACO | Loop While | 51.22 | 56.83 | 108.94 | 154.08 | 232.73 | 221.89 |
Time classic ACO | Load Data | 4961.68 | 5049.75 | 5138.72 | 5076.45 | 5086.60 | 5369.54 |
Time classic ACO | Total Time | 12,053.75 | 18,507.36 | 31,086.59 | 53,399.53 | 99,035.37 | 189,225.46 |
Time classic ACO | Time Hash | 249.82 | 393.10 | 652.63 | 994.56 | 1773.78 | 3589.73 |
Time classic ACO | Min | 0.00123 | 0.001282 | 0.013092 | 0.060569 | 0.161687 | 0.279406 |
Time classic ACO | Max | 0.99962 | 0.995115 | 0.983967 | 0.93474 | 0.837908 | 0.723496 |
Time classic ACO | Hash Iter. | 0 | 0 | 0 | 0 | 0 | 0 |
Time classic ACO non-hash | Stage 3 | 142.37 | 228.90 | 578.23 | 868.42 | 1998.53 | 3881.50 |
Time classic ACO non-hash | Stage 2 | 6659.43 | 13,066.10 | 25,677.61 | 46,287.54 | 92,190.28 | 182,687.75 |
Time classic ACO non-hash | Stage 1 | 2.29 | 4.05 | 8.92 | 14.63 | 30.10 | 59.11 |
Time classic ACO non-hash | Loop While | 48.01 | 51.86 | 63.37 | 72.68 | 91.67 | 173.04 |
Time classic ACO non-hash | Load Data | 0.76 | 1.36 | 2.75 | 4.88 | 9.36 | 18.68 |
Time classic ACO non-hash | Total Time | 6852.85 | 13,352.27 | 26,330.88 | 47,248.15 | 94,319.93 | 186,820.07 |
Time classic ACO non-hash | Min | 0.00123 | 0.001394 | 0.013548 | 0.062985 | 0.159422 | 0.280427 |
Time classic ACO non-hash | Max | 0.999643 | 0.995115 | 0.982411 | 0.936318 | 0.837717 | 0.72298 |
Time classic ACO non-hash | Hash Iter. | 0 | 0 | 0 | 0 | 0 | 0 |
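As a reading aid for the timing table, the short sketch below takes the n = 42 column of the "Time CPU" and "Time classic ACO" groups and checks how the stage rows relate to the total. The variable names are illustrative, the time unit is whatever unit the table uses (it is not stated here), and treating the ratio of the two totals as a speedup is an interpretation rather than a figure quoted from the text.

```python
# n = 42 column of the "Time CPU" group, copied from the table.
cpu_n42 = {
    "Stage 3": 32.97,
    "Stage 2": 512.40,
    "Stage 1": 0.78,
    "Loop While": 0.08,
    "Load Data": 46.67,
}
cpu_total_reported = 592.90
classic_total_reported = 12053.75  # "Time classic ACO", Total Time, n = 42

# The reported total is the sum of the stage, loop, and load rows.
cpu_total_from_rows = sum(cpu_n42.values())

# Share of the run spent in stage 2 (path search by ant agents).
stage2_share = cpu_n42["Stage 2"] / cpu_total_reported

# Ratio of the two reported totals for the same n.
ratio_classic_to_cpu = classic_total_reported / cpu_total_reported

print(f"sum of rows  = {cpu_total_from_rows:.2f}")
print(f"stage 2 share = {stage2_share:.1%}")
print(f"classic / CPU = {ratio_classic_to_cpu:.1f}x")
```

For this column, stage 2 accounts for most of the run time, which is consistent with the attention the text gives to where that stage should execute.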
Function | Algorithm | 50 agents, 2000 iter. | 100 agents, 1000 iter. | 150 agents, 666 iter. | 200 agents, 500 iter. | 250 agents, 400 iter. | 300 agents, 333 iter. | 350 agents, 285 iter. | 400 agents, 250 iter. | 450 agents, 222 iter. | 500 agents, 200 iter. |
---|---|---|---|---|---|---|---|---|---|---|---|
Rastrigin | ACO | 2.29 × 10^−13 | 1.26 × 10^−8 | 3.17 × 10^−6 | 2.11 × 10^−5 | 1.02 × 10^−5 | 3.03 × 10^−5 | 2.42 × 10^−5 | 1.29 × 10^−4 | 1.27 × 10^−4 | 1.01 × 10^−4 |
Rastrigin | CI (ACO) | 6.95 × 10^−15 | 1.54 × 10^−10 | 5.18 × 10^−8 | 3.40 × 10^−7 | 8.34 × 10^−8 | 3.80 × 10^−7 | 7.09 × 10^−7 | 1.80 × 10^−6 | 4.70 × 10^−7 | 8.68 × 10^−7 |
Rastrigin | GA | 3.03 × 10^−1 | 1.16 × 10^−1 | 5.20 × 10^−2 | 2.75 × 10^−2 | 1.77 × 10^−2 | 1.17 × 10^−2 | 7.89 × 10^−3 | 6.49 × 10^−3 | 5.79 × 10^−3 | 4.27 × 10^−3 |
Rastrigin | CI (GA) | 1.41 × 10^−3 | 7.73 × 10^−4 | 3.14 × 10^−4 | 1.88 × 10^−4 | 1.12 × 10^−4 | 7.59 × 10^−5 | 5.37 × 10^−5 | 4.35 × 10^−5 | 3.73 × 10^−5 | 2.83 × 10^−5 |
Rastrigin | PSO | 1.47 × 10^−1 | 1.83 × 10^−2 | 6.02 × 10^−3 | 2.04 × 10^−6 | 1.39 × 10^−15 | 2.45 × 10^−16 | 1.10 × 10^−14 | 1.76 × 10^−13 | 7.90 × 10^−12 | 1.17 × 10^−10 |
Rastrigin | CI (PSO) | 1.38 × 10^−3 | 5.24 × 10^−4 | 3.02 × 10^−4 | 1.36 × 10^−5 | 6.06 × 10^−7 | 2.71 × 10^−8 | 1.21 × 10^−9 | 5.42 × 10^−11 | 2.43 × 10^−12 | 6.20 × 10^−13 |
Rastrigin | SA | 4.51 × 10^−3 | 1.02 × 10^−2 | 3.17 × 10^−2 | 8.90 × 10^−2 | 1.57 × 10^−1 | 2.02 × 10^−1 | 2.08 × 10^−1 | 2.20 × 10^−1 | 2.30 × 10^−1 | 2.35 × 10^−1 |
Rastrigin | CI (SA) | 1.74 × 10^−5 | 4.04 × 10^−5 | 1.31 × 10^−4 | 3.53 × 10^−4 | 6.05 × 10^−4 | 7.58 × 10^−4 | 7.97 × 10^−4 | 8.61 × 10^−4 | 9.10 × 10^−4 | 8.68 × 10^−4 |
Rosenbrock | ACO | 2.54 × 10^−10 | 7.61 × 10^−10 | 2.60 × 10^−4 | 7.40 × 10^−3 | 1.36 × 10^−2 | 1.25 × 10^−2 | 4.54 × 10^−4 | 1.44 × 10^−2 | 8.27 × 10^−3 | 9.97 × 10^−3 |
Rosenbrock | CI (ACO) | 9.96 × 10^−12 | 2.67 × 10^−12 | 9.95 × 10^−6 | 2.83 × 10^−4 | 3.45 × 10^−4 | 2.67 × 10^−4 | 7.37 × 10^−6 | 3.11 × 10^−4 | 1.57 × 10^−4 | 1.81 × 10^−4 |
Rosenbrock | GA | 4.96 × 10^−2 | 2.64 × 10^−2 | 1.83 × 10^−2 | 1.17 × 10^−2 | 1.05 × 10^−2 | 7.82 × 10^−3 | 6.88 × 10^−3 | 6.28 × 10^−3 | 5.24 × 10^−3 | 4.82 × 10^−3 |
Rosenbrock | CI (GA) | 2.22 × 10^−4 | 1.18 × 10^−4 | 8.98 × 10^−5 | 5.20 × 10^−5 | 4.98 × 10^−5 | 3.16 × 10^−5 | 2.93 × 10^−5 | 2.74 × 10^−5 | 2.18 × 10^−5 | 2.18 × 10^−5 |
Rosenbrock | PSO | 3.95 × 10^−2 | 4.53 × 10^−3 | 4.49 × 10^−4 | 1.65 × 10^−5 | 1.53 × 10^−7 | 8.25 × 10^−10 | 2.97 × 10^−12 | 1.88 × 10^−10 | 5.03 × 10^−14 | 2.14 × 10^−12 |
Rosenbrock | CI (PSO) | 5.92 × 10^−4 | 1.29 × 10^−4 | 1.96 × 10^−5 | 1.56 × 10^−6 | 7.08 × 10^−8 | 3.17 × 10^−9 | 1.42 × 10^−10 | 1.76 × 10^−11 | 7.89 × 10^−13 | 1.36 × 10^−13 |
Rosenbrock | SA | 2.11 × 10^−4 | 4.31 × 10^−4 | 6.26 × 10^−4 | 6.82 × 10^−4 | 8.50 × 10^−4 | 8.57 × 10^−4 | 9.97 × 10^−4 | 1.24 × 10^−3 | 9.38 × 10^−4 | 1.08 × 10^−3 |
Rosenbrock | CI (SA) | 8.21 × 10^−7 | 1.51 × 10^−6 | 2.20 × 10^−6 | 2.60 × 10^−6 | 3.33 × 10^−6 | 3.11 × 10^−6 | 3.52 × 10^−6 | 6.24 × 10^−6 | 2.74 × 10^−6 | 3.96 × 10^−6 |
Corn | ACO | 5.47 × 10^−7 | 2.03 × 10^−4 | 7.43 × 10^−4 | 2.72 × 10^−3 | 2.84 × 10^−3 | 3.14 × 10^−3 | 5.53 × 10^−3 | 6.39 × 10^−3 | 1.24 × 10^−2 | 8.62 × 10^−3 |
Corn | CI (ACO) | 7.43 × 10^−9 | 2.33 × 10^−6 | 2.21 × 10^−6 | 1.41 × 10^−5 | 6.90 × 10^−6 | 9.06 × 10^−6 | 2.25 × 10^−5 | 1.95 × 10^−5 | 4.67 × 10^−5 | 3.74 × 10^−5 |
Corn | GA | 3.17 × 10^−3 | 6.45 × 10^−3 | 7.06 × 10^−3 | 7.89 × 10^−3 | 8.32 × 10^−3 | 8.17 × 10^−3 | 8.04 × 10^−3 | 8.05 × 10^−3 | 8.63 × 10^−3 | 8.80 × 10^−3 |
Corn | CI (GA) | 1.37 × 10^−5 | 1.77 × 10^−4 | 1.77 × 10^−4 | 1.77 × 10^−4 | 1.77 × 10^−4 | 1.76 × 10^−4 | 1.76 × 10^−4 | 1.76 × 10^−4 | 1.76 × 10^−4 | 1.76 × 10^−4 |
Corn | PSO | 1.18 × 10^−2 | 2.95 × 10^−3 | 2.24 × 10^−3 | 1.63 × 10^−3 | 1.07 × 10^−3 | 8.10 × 10^−4 | 2.64 × 10^−4 | 3.92 × 10^−4 | 3.38 × 10^−4 | 2.88 × 10^−4 |
Corn | CI (PSO) | 1.21 × 10^−4 | 1.81 × 10^−4 | 1.79 × 10^−4 | 1.78 × 10^−4 | 1.77 × 10^−4 | 1.77 × 10^−4 | 1.75 × 10^−4 | 1.76 × 10^−4 | 1.76 × 10^−4 | 1.76 × 10^−4 |
Corn | SA | 1.61 × 10^−2 | 5.02 × 10^−2 | 7.61 × 10^−2 | 6.56 × 10^−2 | 7.74 × 10^−2 | 7.35 × 10^−2 | 6.84 × 10^−2 | 6.91 × 10^−2 | 7.14 × 10^−2 | 6.87 × 10^−2 |
Corn | CI (SA) | 3.37 × 10^−5 | 1.01 × 10^−4 | 1.36 × 10^−4 | 1.23 × 10^−4 | 1.39 × 10^−4 | 1.47 × 10^−4 | 1.23 × 10^−4 | 1.36 × 10^−4 | 1.39 × 10^−4 | 1.25 × 10^−4 |
Bird | ACO | 7.14 × 10^−2 | 3.43 × 10^−1 | 2.37 × 10^−1 | 2.76 × 10^−1 | 7.56 × 10^−2 | 1.66 × 10^−1 | 2.62 × 10^−1 | 5.69 × 10^−2 | 1.87 × 10^−1 | 1.90 × 10^−1 |
Bird | CI (ACO) | 2.22 × 10^−4 | 1.00 × 10^−2 | 2.04 × 10^−3 | 4.66 × 10^−3 | 4.90 × 10^−4 | 1.22 × 10^−3 | 5.84 × 10^−3 | 4.45 × 10^−4 | 1.39 × 10^−3 | 3.03 × 10^−3 |
Bird | GA | 8.76 × 10^−1 | 4.27 × 10^−1 | 3.73 × 10^−1 | 1.92 × 10^−1 | 1.31 × 10^−1 | 1.55 × 10^−1 | 8.14 × 10^−2 | 5.36 × 10^−2 | 6.76 × 10^−2 | 5.65 × 10^−2 |
Bird | CI (GA) | 7.36 × 10^−3 | 4.20 × 10^−3 | 3.29 × 10^−3 | 1.68 × 10^−3 | 9.06 × 10^−4 | 1.28 × 10^−3 | 7.32 × 10^−4 | 4.46 × 10^−4 | 6.55 × 10^−4 | 4.33 × 10^−4 |
Bird | PSO | 1.05 × 10^0 | 1.26 × 10^−1 | 3.24 × 10^−2 | 2.26 × 10^−2 | 1.62 × 10^−2 | 8.71 × 10^−3 | 8.39 × 10^−3 | 7.55 × 10^−3 | 4.46 × 10^−3 | 6.63 × 10^−3 |
Bird | CI (PSO) | 1.58 × 10^−2 | 1.92 × 10^−2 | 1.87 × 10^−2 | 1.87 × 10^−2 | 1.87 × 10^−2 | 1.87 × 10^−2 | 1.87 × 10^−2 | 1.87 × 10^−2 | 1.87 × 10^−2 | 1.87 × 10^−2 |
Bird | SA | 3.38 × 10^−3 | 5.46 × 10^−3 | 6.83 × 10^−3 | 9.84 × 10^−3 | 1.92 × 10^−2 | 3.01 × 10^−2 | 4.17 × 10^−2 | 3.87 × 10^−2 | 4.81 × 10^−2 | 4.10 × 10^−2 |
Bird | CI (SA) | 1.30 × 10^−5 | 2.04 × 10^−5 | 3.18 × 10^−5 | 3.09 × 10^−5 | 7.85 × 10^−5 | 1.07 × 10^−4 | 1.78 × 10^−4 | 1.50 × 10^−4 | 1.86 × 10^−4 | 1.44 × 10^−4 |
Ackley | ACO | 6.15 × 10^−8 | 3.06 × 10^−6 | 7.05 × 10^−5 | 2.28 × 10^−4 | 5.63 × 10^−4 | 8.31 × 10^−4 | 4.63 × 10^−4 | 9.63 × 10^−4 | 1.16 × 10^−3 | 5.90 × 10^−4 |
Ackley | CI (ACO) | 3.50 × 10^−9 | 4.77 × 10^−9 | 3.72 × 10^−7 | 6.14 × 10^−7 | 2.19 × 10^−6 | 2.12 × 10^−6 | 3.38 × 10^−6 | 4.96 × 10^−6 | 7.20 × 10^−6 | 8.47 × 10^−6 |
Ackley | GA | 8.59 × 10^−5 | 9.85 × 10^−5 | 3.35 × 10^−5 | 3.17 × 10^−5 | 2.29 × 10^−5 | 2.19 × 10^−5 | 1.69 × 10^−5 | 1.26 × 10^−5 | 1.47 × 10^−5 | 1.42 × 10^−5 |
Ackley | CI (GA) | 5.63 × 10^−7 | 8.14 × 10^−7 | 1.93 × 10^−7 | 2.78 × 10^−7 | 1.16 × 10^−7 | 1.59 × 10^−7 | 1.27 × 10^−7 | 6.39 × 10^−8 | 8.18 × 10^−8 | 7.64 × 10^−8 |
Ackley | PSO | 5.42 × 10^−5 | 5.77 × 10^−11 | 0.00 × 10^0 | 0.00 × 10^0 | 3.59 × 10^−13 | 4.80 × 10^−11 | 1.52 × 10^−9 | 1.75 × 10^−8 | 1.29 × 10^−7 | 5.89 × 10^−7 |
Ackley | CI (PSO) | 3.61 × 10^−6 | 3.51 × 10^−3 | 3.51 × 10^−3 | 3.51 × 10^−3 | 3.51 × 10^−3 | 3.51 × 10^−3 | 3.51 × 10^−3 | 3.51 × 10^−3 | 3.51 × 10^−3 | 3.51 × 10^−3 |
Ackley | SA | 1.20 × 10^−2 | 1.62 × 10^−2 | 5.24 × 10^−2 | 1.03 × 10^−1 | 1.12 × 10^−1 | 1.20 × 10^−1 | 1.32 × 10^−1 | 1.17 × 10^−1 | 1.14 × 10^−1 | 1.48 × 10^−1 |
Ackley | CI (SA) | 2.85 × 10^−5 | 3.82 × 10^−5 | 1.12 × 10^−4 | 2.97 × 10^−4 | 2.71 × 10^−4 | 3.19 × 10^−4 | 3.41 × 10^−4 | 3.38 × 10^−4 | 2.73 × 10^−4 | 3.74 × 10^−4 |

Number of Iterations | Classic ACO | ACOCN | ACOCNI | ACOCCy3 | ACOCCyI
---|---|---|---|---|---
2500 | 1.404 × 10⁻⁴ | 1.547 × 10⁻⁴ | 1.562 × 10⁻⁴ | 1.627 × 10⁻⁴ | 1.619 × 10⁻⁴
CI | ±3.16 × 10⁻⁶ | ±1.28 × 10⁻⁶ | ±0.68 × 10⁻⁶ | ±1.08 × 10⁻⁶ | ±1.28 × 10⁻⁶
5000 | 1.381 × 10⁻⁴ | 1.517 × 10⁻⁴ | 1.560 × 10⁻⁴ | 1.648 × 10⁻⁴ | 1.636 × 10⁻⁴
CI | ±0.75 × 10⁻⁶ | ±0.85 × 10⁻⁶ | ±0.59 × 10⁻⁶ | ±0.81 × 10⁻⁶ | ±0.97 × 10⁻⁶
7500 | 1.388 × 10⁻⁴ | 1.505 × 10⁻⁴ | 1.567 × 10⁻⁴ | 1.665 × 10⁻⁴ | 1.647 × 10⁻⁴
CI | ±4.49 × 10⁻⁶ | ±0.66 × 10⁻⁶ | ±6.06 × 10⁻⁶ | ±2.10 × 10⁻⁶ | ±1.06 × 10⁻⁶
10,000 | 1.391 × 10⁻⁴ | 1.501 × 10⁻⁴ | 1.578 × 10⁻⁴ | 1.690 × 10⁻⁴ | 1.654 × 10⁻⁴
CI | ±3.07 × 10⁻⁶ | ±1.48 × 10⁻⁶ | ±4.81 × 10⁻⁶ | ±3.81 × 10⁻⁶ | ±0.91 × 10⁻⁶
12,500 | 1.370 × 10⁻⁴ | 1.547 × 10⁻⁴ | 1.562 × 10⁻⁴ | 1.706 × 10⁻⁴ | 1.657 × 10⁻⁴
CI | ±2.19 × 10⁻⁶ | ±5.34 × 10⁻⁶ | ±2.84 × 10⁻⁶ | ±5.38 × 10⁻⁶ | ±1.59 × 10⁻⁶
15,000 | 1.364 × 10⁻⁴ | 1.526 × 10⁻⁴ | 1.569 × 10⁻⁴ | 1.700 × 10⁻⁴ | 1.650 × 10⁻⁴
CI | ±7.17 × 10⁻⁶ | ±3.59 × 10⁻⁶ | ±2.20 × 10⁻⁶ | ±5.55 × 10⁻⁶ | ±1.23 × 10⁻⁶
17,500 | 1.328 × 10⁻⁴ | 1.472 × 10⁻⁴ | 1.585 × 10⁻⁴ | 1.695 × 10⁻⁴ | 1.655 × 10⁻⁴
CI | ±0.66 × 10⁻⁶ | ±0.53 × 10⁻⁶ | ±4.07 × 10⁻⁶ | ±4.52 × 10⁻⁶ | ±0.89 × 10⁻⁶
20,000 | 1.325 × 10⁻⁴ | 1.469 × 10⁻⁴ | 1.582 × 10⁻⁴ | 1.741 × 10⁻⁴ | 1.688 × 10⁻⁴
CI | ±0.60 × 10⁻⁶ | ±0.45 × 10⁻⁶ | ±4.84 × 10⁻⁶ | ±7.00 × 10⁻⁶ | ±6.44 × 10⁻⁶

Total Time, by Total Number of Layers/Threads (n) | 42 | 84 | 168 | 336 | 672 | 1344
---|---|---|---|---|---|---
CUDA stage 1, 2, 3 | 2023.91 | 2263.64 | 3741.21 | 10,801.78 | | |
CI | ±4.13 | ±9.65 | ±28.52 | ±95.35 | | |
CUDA stage optim 1, 2 | 2076.53 | 2119.91 | 4038.53 | 9818.92 | | |
CI | ±4.42 | ±7.42 | ±18.97 | ±67.52 | | |
CUDA stage only 1 | 5289.38 | 14,360.11 | 72,875.80 | 209,746.70 | | |
CI | ±120.73 | ±283.12 | ±208.27 | ±244.67 | | |
CPU | 549.55 | 1025.85 | 2053.47 | 3913.50 | 7996.14 | 16,090.64
CI | ±1.93 | ±4.82 | ±4.15 | ±4.98 | ±7.84 | ±6.91
Classic ACO | 6763.28 | 12,711.52 | 24,209.60 | 49,763.72 | 94,327.53 | 184,507.10
CI | ±10.36 | ±27.78 | ±37.38 | ±40.43 | ±59.91 | ±122.48

Speed-Up Relative to the Classic Implementation of Ant Colony Optimization, by Total Number of Layers/Threads (n) | 42 | 84 | 168 | 336 | 672 | 1344
---|---|---|---|---|---|---
CUDA stage 1, 2, 3 | 3.34 | 5.62 | 6.47 | 4.61 | | |
CUDA stage optim 1, 2 | 3.26 | 6.00 | 5.99 | 5.07 | | |
CUDA stage only 1 | 1.28 | 0.89 | 0.33 | 0.24 | | |
CPU | 12.31 | 12.39 | 11.79 | 12.72 | 11.80 | 11.47
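
The speed-up rows above can be checked directly against the total-time rows: each value is the Classic ACO time divided by the time of the given implementation at the same n. The short Python sketch below reproduces this arithmetic. It assumes that the four CUDA timings belong to n = 42, 84, 168, and 336, as their position in the table suggests; the helper name `speed_up` and the dictionary layout are illustrative only and do not come from the article.

```python
# Total times copied from the table above (units as reported in the source).
classic_aco = {
    42: 6763.28, 84: 12711.52, 168: 24209.60,
    336: 49763.72, 672: 94327.53, 1344: 184507.10,
}

# Assumption: the four CUDA values correspond to n = 42, 84, 168, 336.
timings = {
    "CUDA stage 1, 2, 3": {42: 2023.91, 84: 2263.64, 168: 3741.21, 336: 10801.78},
    "CUDA stage optim 1, 2": {42: 2076.53, 84: 2119.91, 168: 4038.53, 336: 9818.92},
    "CUDA stage only 1": {42: 5289.38, 84: 14360.11, 168: 72875.80, 336: 209746.70},
    "CPU": {
        42: 549.55, 84: 1025.85, 168: 2053.47,
        336: 3913.50, 672: 7996.14, 1344: 16090.64,
    },
}


def speed_up(times, baseline):
    """Return baseline time / implementation time for each n, rounded to 2 places."""
    return {n: round(baseline[n] / t, 2) for n, t in times.items()}


# Hypothetical helper output: one speed-up row per implementation.
speed_ups = {name: speed_up(times, classic_aco) for name, times in timings.items()}

for name, row in speed_ups.items():
    print(name, row)
```

A value below 1, as in the "CUDA stage only 1" row for n ≥ 84, means that this variant ran slower than the classic implementation at that problem size.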

Speeding Up the Algorithm Without Using a Hash Table, by Total Number of Layers/Threads (n) | 42 | 84 | 168 | 336 | 672 | 1344
---|---|---|---|---|---|---
CUDA stage 1, 2, 3 | 4.26 | 4.39 | 5.38 | 8.89 | | |
CI | ±0.25 | ±0.53 | ±1.20 | ±1.34 | | |
CUDA stage optim 1, 2 | 4.37 | 4.08 | 5.80 | 8.11 | | |
CI | ±0.26 | ±0.41 | ±0.80 | ±1.66 | | |
CUDA stage only 1 | 4.51 | 6.97 | 16.93 | 23.53 | | |
CI | ±0.44 | ±0.18 | ±0.13 | ±0.15 | | |
CPU | 1.22 | 1.17 | 1.14 | 1.12 | 1.10 | 1.07
CI | ±0.10 | ±0.14 | ±0.06 | ±0.03 | ±0.03 | ±0.01
Classic ACO | 1.04 | 1.03 | 1.02 | 1.02 | 1.02 | 0.98
CI | ±0.04 | ±0.07 | ±0.07 | ±0.03 | ±0.05 | ±0.04

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Sudakov, V.; Titov, Y. Matrix-Based ACO for Solving Parametric Problems Using Heterogeneous Reconfigurable Computers and SIMD Accelerators. Mathematics 2025, 13, 1284. https://doi.org/10.3390/math13081284