Simple Deterministic Selection-Based Genetic Algorithm for Hyperparameter Tuning of Machine Learning Models
Abstract
1. Introduction
1. A simple deterministic selection genetic algorithm (SDSGA) was developed as an improved optimization technique for computing optimal hyperparameter (HP) values with a small number of fitness evaluations.
2. The proposed SDSGA exhibits a high exploitative capability, enabling it to search more deeply within a region of the search space without sacrificing its exploratory capability.
2. Related Work
3. Methodology
3.1. Crossover and Mutation in GA
3.2. Proposed Improved Genetic Algorithm (SDSGA)
Algorithm 1 Pseudocode of the selection algorithm.
Require: population
Ensure: parent array
1: Sort the population in decreasing order of fitness
2: Create an array of zeros to contain the parents
3: Set added, count to 0
4: for each individual in population do
5:     if individual does not exist in parent array then
6:         Add a copy of individual to parent array
7:         Increment added
8:     end if
9:     if added is equal to parent size then
10:        break
11:    end if
12: end for
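As a minimal sketch, assuming individuals are stored as simple dictionaries of genes and fitness (the paper does not prescribe a particular data structure), the selection step of Algorithm 1 could be written in Python as follows:

```python
import random

def deterministic_selection(population, parent_size):
    """Select parents following Algorithm 1: keep the fittest unique
    individuals until the parent pool is full (no stochastic roulette
    wheel, hence 'deterministic' selection)."""
    # Sort the population in decreasing order of fitness.
    ranked = sorted(population, key=lambda ind: ind["fitness"], reverse=True)

    parents = []
    added = 0
    for individual in ranked:
        # Only add individuals not already present, keeping the pool diverse.
        if individual["genes"] not in [p["genes"] for p in parents]:
            parents.append({"genes": list(individual["genes"]),
                            "fitness": individual["fitness"]})
            added += 1
        if added == parent_size:
            break
    return parents

# Illustrative usage with a toy population of candidate hyperparameter pairs.
population = [{"genes": [random.randint(50, 200), random.randint(2, 20)],
               "fitness": random.random()} for _ in range(10)]
parents = deterministic_selection(population, parent_size=4)
```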
4. Experimental Components
4.1. Description of Benchmark Functions
4.2. Description of the Datasets
4.3. Overview of the Metaheuristic Optimization Algorithms
4.3.1. Particle Swarm Optimization (PSO)
4.3.2. Genetic Algorithm (GA)
4.3.3. Biogeography-Based Optimization (BBO)
4.3.4. Metrics of Analysis for Optimizing Benchmark Functions
4.4. Overview of the Machine Learning Classifiers
4.4.1. Tuning the Hyperparameters of the CNN Model
4.4.2. Tuning the Hyperparameters of the Random Forest Model
4.4.3. Metrics of Analysis for HP Optimization
5. Results and Discussion
5.1. Effect of the Population Size on SDSGA Performance
5.2. Comparison and Evaluation of Optimization Algorithms on Benchmark Functions
5.3. CNN on the MNIST Dataset
5.4. Random Forest on UCI Datasets
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
| S/N | Name | Number of Instances | Number of Features | Missing Values? |
|---|---|---|---|---|
| 1 | CAR | 1728 | 6 | No |
| 2 | Breast Cancer | 569 | 32 | No |
| 3 | Adult | 48,842 | 14 | Yes |
| 4 | Letter | 20,000 | 16 | No |
| 5 | Cover Type | 581,012 | 54 | No |
| S/N | Function | Global Sol | BBO Mean | BBO Std | GA Mean | GA Std | PSO Mean | PSO Std | SDSGA Mean | SDSGA Std |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Ackley | 0 | 5.81 | 7.96 | 2.20 | 6.19 | 1.41 | 8.04 | 1.53 | 2.70 |
| 2 | DeJongsF1 | 0 | 1.86 | 1.35 | 5.96 | 4.76 | 3.62 | 1.44 | 4.96 | 5.88 |
| 3 | DeJongsF2 | 0 | 2.54 | 2.03 | 1.07 | 8.16 | 3.27 | 3.90 | 6.53 | 4.13 |
| 4 | Ellipsoid | 0 | 7.67 | 4.95 | 6.94 | 8.76 | 1.67 | 1.29 | 1.16 | 2.54 |
| 5 | Griewank | 0 | −6.90 | 5.30 | −8.43 | 1.02 | −4.62 | 2.31 | −9.58 | 1.72 |
| 6 | Hyper Ellipsodic | 0 | 8.90 | 2.15 | 4.52 | 3.17 | 4.01 | 2.44 | 1.32 | 1.48 |
| 7 | KTablet | 0 | 4.96 | 1.18 | 9.35 | 1.13 | 1.39 | 7.81 | 1.09 | 1.40 |
| 8 | Michalewicz | −1.8013 | −9.29 | 7.26 | −9.31 | 9.63 | −3.81 | 4.16 | −9.43 | 1.74 |
| 9 | Rastrigin | 0 | 5.11 | 9.01 | 1.42 | 5.77 | 9.08 | 3.64 | 7.23 | 1.55 |
| 10 | Rosenbrock | 0 | 2.54 | 2.03 | 1.07 | 8.16 | 3.27 | 3.90 | 6.53 | 4.13 |
| 11 | Schwefel | −418.9829 | −6.36 | 2.37 | −6.36 | 9.51 | −5.34 | 3.46 | −6.36 | 1.12 |
| 12 | Sphere | 0 | 1.86 | 1.35 | 5.96 | 4.76 | 3.62 | 1.44 | 4.96 | 5.88 |
| 13 | StyblinskiTang | −391.66165 | −180.32 | 1.13 | −3.56 | 2.86 | −2.99 | 4.58 | −3.75 | 1.13 |
| 14 | SumOfDiffPower | 0 | 1.73 | 1.59 | 1.05 | 1.09 | 3.84 | 6.75 | 8.66 | 1.48 |
| 15 | WeightedSphere | 0 | 8.90 | 2.15 | 4.52 | 3.17 | 4.01 | 2.44 | 1.32 | 1.48 |
| 16 | XinSheYang | 0 | 3.56 | 4.00 | 2.84 | 7.37 | 1.60 | 9.39 | 3.01 | 1.02 |
| 17 | Zakharov | 0 | −2.32 | 3.53 | −2.52 | 4.65 | −1.94 | 4.50 | −1.91 | 5.81 |
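For context, the benchmark functions listed above follow their standard definitions. A short Python/NumPy sketch of three of them (Sphere, Rastrigin, and Ackley, each with global minimum 0 at the origin) is given below; the dimensionality and search bounds used in the experiments are not repeated here.

```python
import numpy as np

def sphere(x):
    """Sphere function: sum of squares, global minimum 0 at the origin."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2))

def rastrigin(x):
    """Rastrigin function: highly multimodal, global minimum 0 at the origin."""
    x = np.asarray(x, dtype=float)
    return float(10 * x.size + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x)))

def ackley(x):
    """Ackley function with the standard constants a=20, b=0.2, c=2*pi."""
    x = np.asarray(x, dtype=float)
    n = x.size
    return float(-20 * np.exp(-0.2 * np.sqrt(np.sum(x ** 2) / n))
                 - np.exp(np.sum(np.cos(2 * np.pi * x)) / n) + 20 + np.e)
```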
| Iteration | PSO (%) | GA (%) | BBO (%) | BO (%) | SDSGA (%) |
|---|---|---|---|---|---|
| 1 | 99.04 | 99.04 | 99 | 98.95 | 99.04 |
| 2 | 99.04 | 99.17 | 99.12 | 98.99 | 99.18 |
| 3 | 99.04 | 99.17 | 99.12 | 98.99 | 99.18 |
| 4 | 99.04 | 99.17 | 99.12 | 98.99 | 99.18 |
| 5 | 99.14 | 99.17 | 99.16 | 99.01 | 99.2 |
| 6 | 99.14 | 99.17 | 99.16 | 99.05 | 99.2 |
| 7 | 99.14 | 99.17 | 99.16 | 99.05 | 99.2 |
| 8 | 99.14 | 99.17 | 99.17 | 99.06 | 99.2 |
| 9 | 99.15 | 99.17 | 99.18 | 99.06 | 99.2 |
| 10 | 99.15 | 99.17 | 99.18 | 99.14 | 99.2 |
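The accuracies above come from evaluating candidate CNN configurations as fitness values. The sketch below illustrates one plausible form of such a fitness function; the hyperparameters shown (filter count, dense-layer width, learning rate), the single training epoch, and the use of test accuracy as fitness are illustrative assumptions rather than the paper's exact setup.

```python
import tensorflow as tf

def cnn_fitness(hp):
    """Return the accuracy of a small CNN on MNIST for a candidate
    hyperparameter vector hp = [filters, dense_units, learning_rate]."""
    filters, dense_units, learning_rate = hp
    (x_tr, y_tr), (x_te, y_te) = tf.keras.datasets.mnist.load_data()
    x_tr = x_tr[..., None] / 255.0  # add channel axis, scale to [0, 1]
    x_te = x_te[..., None] / 255.0

    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(int(filters), 3, activation="relu",
                               input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(int(dense_units), activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_tr, y_tr, epochs=1, batch_size=128, verbose=0)
    _, accuracy = model.evaluate(x_te, y_te, verbose=0)
    return accuracy  # the optimizer maximizes this value
```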
A: CAR

| Algorithm | Time (s) | Best Position | Best Solution (%) | Iteration |
|---|---|---|---|---|
| PSO | 14.02 | [174, 6] | 97.399 | 5 |
| GA | 27.65 | [163, 6] | 97.399 | 6 |
| BBO | 38.5 | [120, 6] | 97.399 | 2 |
| BO | 79.47 | [129, 6] | 97.688 | 1 |
| SDSGA | 22.48 | [117, 6] | 97.688 | 3 |

B: Breast Cancer

| Algorithm | Time (s) | Best Position | Best Solution (%) | Iteration |
|---|---|---|---|---|
| PSO | 60.43 | [129, 7] | 98.246 | 3 |
| GA | 47.95 | [143, 7] | 98.246 | 7 |
| BBO | 62.92 | [118, 16] | 98.246 | 4 |
| BO | 106.4 | [68, 20] | 98.246 | 5 |
| SDSGA | 41.37 | [124, 7] | 98.246 | 2 |

C: Letter

| Algorithm | Time (s) | Best Position | Best Solution (%) | Iteration |
|---|---|---|---|---|
| PSO | 787.68 | [198, 5] | 96.45 | 3 |
| GA | 427.81 | [141, 5] | 96.45 | 5 |
| BBO | 599.96 | [91, 4] | 96.5 | 7 |
| BO | 616.72 | [141, 5] | 96.45 | 10 |
| SDSGA | 422.75 | [85, 4] | 96.5 | 2 |

D: Adult

| Algorithm | Time (s) | Best Position | Best Solution (%) | Iteration |
|---|---|---|---|---|
| PSO | 763.43 | [184, 15] | 85.86 | 4 |
| GA | 780 | [138, 15] | 85.79 | 3 |
| BBO | 953.09 | [193, 15] | 85.87 | 8 |
| BO | 942.51 | [183, 15] | 85.843 | 4 |
| SDSGA | 655.41 | [184, 15] | 85.86 | 7 |

E: Cover Type

| Algorithm | Time (s) | Best Position | Best Solution (%) | Iteration |
|---|---|---|---|---|
| PSO | 11,847.65 | [198, 19] | 93.1 | 8 |
| GA | 5481.84 | [155, 18] | 93.081 | 6 |
| BBO | 11,721.25 | [193, 18] | 93.092 | 7 |
| BO | 13,437.37 | [188, 20] | 93.133 | 3 |
| SDSGA | 10,907.01 | [194, 20] | 93.129 | 6 |
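Each table above reports the best two-element hyperparameter position found for the random forest. The sketch below shows one plausible fitness evaluation; reading the position as [number of trees, maximum depth] and scoring with 5-fold cross-validated accuracy are illustrative assumptions rather than the paper's exact configuration.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def rf_fitness(position, X, y):
    """Return the fitness (mean cross-validated accuracy) of a random forest
    built from a candidate position, assumed here to be [n_estimators, max_depth]."""
    n_estimators, max_depth = int(position[0]), int(position[1])
    clf = RandomForestClassifier(n_estimators=n_estimators,
                                 max_depth=max_depth,
                                 random_state=0)
    # Mean 5-fold accuracy serves as the value the optimizer maximizes.
    return cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
```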