# Quality-Oriented Study on Mapping Island Model Genetic Algorithm onto CUDA GPU


## Abstract


## 1. Introduction

## 2. Related Work

#### 2.1. UA-FLP

#### 2.2. Introduction to GPU and CUDA

#### 2.3. Parallelization of IMGA for UA-FLP on a GPU

## 3. Improvement of Parallel Tournament Selection

#### 3.1. Generate Random Seed Once with the System Clock

Algorithm 1: Generate Random Seed Once

Data:

round: number of tournament rounds conducted

population: number of chromosomes in an island

Input:

survive; fitness

Function:

clock(): access the system time from the GPU

rand_initiate(): initiate the state of the random number generator

rand_generate(): generate a random number using the state of the generator

Result:

Chromosomes are randomly picked and compete in a series of tournaments.

Parallel:

tseed = clock()

offset = threadId

rand_initiate(tseed, offset, state)

winner = 0

for i = 1 to round do

rival = rand_generate(state) mod population

if fitness[blockId][rival] < fitness[blockId][winner] then

winner = rival

end if

end for

survive[blockId][threadId] = winner
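As an illustration of Algorithm 1, the following is a minimal host-side sketch in Python rather than GPU code. It assumes that `rand_initiate(tseed, offset, state)` seeds one generator and skips ahead by `offset` draws, which is only one possible reading of the pseudocode. The helper names `rand_initiate` and `select_seed_once`, the fitness values, and the fixed `tseed` standing in for a `clock()` reading are all hypothetical.

```python
import random

def rand_initiate(seed, offset):
    """Stand-in for rand_initiate(): seed a generator, then skip `offset` draws (assumption)."""
    state = random.Random(seed)
    for _ in range(offset):
        state.getrandbits(32)
    return state

def select_seed_once(fitness, rounds, tseed, thread_id):
    """One thread's tournament within one island; lower fitness wins."""
    population = len(fitness)
    state = rand_initiate(tseed, thread_id)          # seeded once, before the loop
    winner = 0
    for _ in range(rounds):
        rival = state.getrandbits(32) % population   # rand_generate(state) mod population
        if fitness[rival] < fitness[winner]:
            winner = rival
    return winner

# One island with 8 chromosomes; every emulated thread uses the same seed.
fitness = [9.0, 4.0, 7.0, 1.0, 8.0, 3.0, 6.0, 5.0]
tseed = 12345  # hypothetical clock() reading
survive = [select_seed_once(fitness, rounds=3, tseed=tseed, thread_id=t)
           for t in range(len(fitness))]
print(survive)
```

Each entry of `survive` is the index of the chromosome that the corresponding thread selected. Because `winner` starts at 0 and is replaced only by a strictly better rival, no selected chromosome has a worse fitness than chromosome 0.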

#### 3.2. Generate a Random Seed during Each Round of Comparisons

Algorithm 2: Generate a Random Seed during Each Round of Comparisons

Data:

round: number of tournament rounds conducted

population: number of chromosomes in an island

Input:

survive; fitness

Function:

clock(): access the system time from the GPU

rand_initiate(): initiate the state of the random number generator

rand_generate(): generate a random number using the state of the generator

Result:

Chromosomes are randomly picked and compete in a series of tournaments.

Parallel:

offset = threadId

winner = 0

for i = 1 to round do

tseed = clock()

rand_initiate(tseed, offset, state)

rival = rand_generate(state) mod population

if fitness[blockId][rival] < fitness[blockId][winner] then

winner = rival

end if

end for

survive[blockId][threadId] = winner
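Algorithm 2 re-initiates the generator inside the loop, so each rival depends on the clock value read in that round. The Python sketch below emulates this on the host with an injectable `clock` callable. The function name `select_reseed_each_round`, the two clock stand-ins, and the skip-ahead reading of `offset` are assumptions made for illustration, not the GPU implementation.

```python
import itertools
import random

def select_reseed_each_round(fitness, rounds, clock, thread_id):
    """One thread's tournament with the generator re-initiated in every round."""
    population = len(fitness)
    winner = 0
    for _ in range(rounds):
        state = random.Random(clock())       # tseed = clock(), read inside the loop
        for _ in range(thread_id):           # offset = threadId, modeled as skip-ahead
            state.getrandbits(32)
        rival = state.getrandbits(32) % population
        if fitness[rival] < fitness[winner]:
            winner = rival
    return winner

fitness = [9.0, 4.0, 7.0, 1.0, 8.0, 3.0, 6.0, 5.0]

# Hypothetical clock that advances on every read.
ticks = itertools.count(1000)
ticking = select_reseed_each_round(fitness, 5, lambda: next(ticks), thread_id=2)

# Hypothetical clock that returns the same value on every read.
frozen_1 = select_reseed_each_round(fitness, 1, lambda: 42, thread_id=2)
frozen_5 = select_reseed_each_round(fitness, 5, lambda: 42, thread_id=2)
print(ticking, frozen_1, frozen_5)
```

In this emulation, a clock that does not change between reads produces the same rival in every round, so `frozen_5` equals `frozen_1` and the extra rounds add nothing. Whether this can happen on a real device depends on the resolution of the clock being read.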

#### 3.3. Our Improved Strategy

Algorithm 3: Our Improved Strategy

Data:

round: number of tournament rounds conducted

population: number of chromosomes in an island

Input:

survive; fitness

Function:

clock(): access the system time from the GPU

rand_initiate(): initiate the state of the random number generator

rand_generate(): generate a random number using the state of the generator

Result:

Chromosomes are randomly picked and compete in a series of tournaments.

Parallel:

tseed = clock() + blockId * population + threadId

offset = 0

rand_initiate(tseed, offset, state)

winner = 0

for i = 1 to round do

rival = rand_generate(state) mod population

if fitness[blockId][rival] < fitness[blockId][winner] then

winner = rival

end if

end for

survive[blockId][threadId] = winner
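The seed construction in Algorithm 3 can be checked with a short host-side sketch in Python. It assumes that every thread reads the same `clock()` value and that `threadId` is smaller than `population`; under those assumptions, `blockId * population + threadId` gives each thread in the grid a different seed. The names `thread_seed` and `select_island`, the synthetic fitness matrix, and the fixed `clock_value` are hypothetical.

```python
import random

def thread_seed(clock_value, block_id, thread_id, population):
    """tseed = clock() + blockId * population + threadId."""
    return clock_value + block_id * population + thread_id

def select_island(fitness_row, rounds, clock_value, block_id):
    """Emulate all threads of one block (island); lower fitness wins."""
    population = len(fitness_row)
    survive_row = []
    for thread_id in range(population):
        seed = thread_seed(clock_value, block_id, thread_id, population)
        state = random.Random(seed)          # offset = 0: no skip-ahead
        winner = 0
        for _ in range(rounds):
            rival = state.getrandbits(32) % population
            if fitness_row[rival] < fitness_row[winner]:
                winner = rival
        survive_row.append(winner)
    return survive_row

islands, population = 4, 8
clock_value = 12345  # hypothetical clock() reading shared by all threads
# Synthetic fitness values, for illustration only.
fitness = [[(7 * b + 3 * t) % 11 for t in range(population)] for b in range(islands)]

seeds = {thread_seed(clock_value, b, t, population)
         for b in range(islands) for t in range(population)}
survive = [select_island(fitness[b], 3, clock_value, b) for b in range(islands)]
print(len(seeds), survive)
```

Here `len(seeds)` equals `islands * population`, so no two emulated threads share a seed. If threads read different clock values, the sums could coincide, so the uniqueness shown here holds only under the stated assumption.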

## 4. Performance Evaluations

#### 4.1. Tested Platform

#### 4.2. Evaluations of Parallel Tournament Selection

#### 4.3. Evaluations of Quality Improvement

#### 4.3.1. Influence of the Number of Islands

#### 4.3.2. Influence of the Number of Iterations

#### 4.3.3. Influence of the Number of Chromosomes Per Island

#### 4.3.4. Effect of Combining Suggested Parameter Settings

## 5. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## References

**Figure 2.** FBS codes for the example in Figure 1.

**Figure 4.** CUDA memories and hierarchy model [38].

Intel Xeon CPU E5-2609 v3 | | GTX 980 | |
---|---|---|---|
Number of Cores | 6 | Number of GPUs | 1 |
Number of Threads | 6 | Thread Processors | 2048 |
Clock Speed | 1.9 GHz | Clock Speed | 1127 MHz |
Memory Size | 16 GB | Memory Size | 4 GB |
Memory Type | DDR4 | Memory Type | GDDR5 |

No. | Problem Data Set | Number of Facilities | Facility Size (Width) | Facility Size (Length) | Common Shape Constraint | Distance Measure |
---|---|---|---|---|---|---|
1 | F40 | 40 | 45.00 | 45.00 | α = 1000 | Rectilinear |
2 | F60 | 60 | 45.00 | 45.00 | α = 1000 | Rectilinear |
3 | F80 | 80 | 45.00 | 45.00 | α = 1000 | Rectilinear |
4 | F100 | 100 | 45.00 | 45.00 | α = 1000 | Rectilinear |
5 | F120 | 120 | 45.00 | 45.00 | α = 1000 | Rectilinear |

Parameter | Value |
---|---|
Total population size per island (n) | 32, 64, 128, 256, 512, 1024 |
Number of islands (N) | 16, 32, 64, 128, 256 |
Cycle generations (c) | 64, 128, 256, 512, 1024, 2048, 4096, 8192 |
Migration rate (m) | 5 |
Migration frequency (g) | 15 |
Crossover probability ($p_c$) | 0.7 |
Mutation probability ($p_m$) | 0.01 |

Benchmark | NQ | NT | QTR |
---|---|---|---|
40 | 1.0097 | 0.2390 | 2.0788 |
60 | 1.0266 | 0.0752 | 3.7712 |
80 | 1.0020 | 0.0518 | 4.2739 |
100 | 1.0332 | 0.0410 | 4.6543 |
120 | 1.0316 | 0.0350 | 4.8822 |

