Bound the Parameters of Neural Networks Using Particle Swarm Optimization
Abstract
1. Introduction
2. The Proposed Method
2.1. Preliminaries
2.2. The Bounding Algorithm
2.3. The PSO Algorithm
Algorithm 1 A function in pseudocode to calculate the bounding quantity B(w, L) for a given parameter vector w. The parameter M represents the number of patterns for the neural network.
1. Function B(w, L)
2. Define S = 0
3. For i = 1, ..., M Do
 (a) For j = 1, ..., H Do
 i. If |v_j(x_i)| > L then S = S + 1
 (b) EndFor
4. EndFor
5. Return S
6. End Function
Here v_j(x_i) denotes the activation argument of hidden node j for pattern x_i, and L the limit imposed on these arguments.
Algorithm 2 The base PSO algorithm executed in one processing unit.
1. Initialization Step.
 (a) Set k = 0, as the iteration number.
 (b) Set H, the number of hidden nodes for the neural network.
 (c) Set m as the total number of particles. Each particle x_i corresponds to a randomly selected set of parameters for the neural network.
 (d) Set k_max as the maximum number of iterations allowed.
 (e) Initialize the velocities u_i randomly.
 (f) For i = 1, ..., m do p_i = x_i. The vector p_i corresponds to the best located position of particle i.
 (g) Set p_best as the position with the lowest fitness value among the p_i.
2. If k ≥ k_max, then terminate.
3. For i = 1, ..., m Do
 (a) Compute the velocity u_i using the vectors p_i and p_best as u_i = ω u_i + r_1 c_1 (p_i − x_i) + r_2 c_2 (p_best − x_i).
 (b) Set the new position x_i = x_i + u_i.
 (c) Calculate the fitness f(x_i) for particle i using Equation (6), i.e., the training error penalized by the bounding quantity of Algorithm 1.
 (d) If f(x_i) ≤ f(p_i) then p_i = x_i.
4. End For
5. Set p_best as the position with the lowest fitness value among the p_i.
6. Set k = k + 1.
7. Goto Step 2.
- The variables r_1, r_2 are random numbers defined in [0, 1].
- The constants c_1, c_2 are defined in [1, 2].
- The variable ω is called inertia, proposed in [74]; it varies between the values ω_min and ω_max listed in the parameter table. A sketch of the full loop follows.
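The sketch below gives a minimal Python rendering of the base PSO loop of Algorithm 2. The linearly decreasing inertia schedule between ω_max and ω_min, the particle initialization range, and the function name base_pso are assumptions made for illustration; f is any fitness callback, e.g., the penalized training error of Equation (6).

```python
import numpy as np

def base_pso(f, dim, m=200, k_max=200, c=(1.0, 2.0),
             w_min=0.4, w_max=0.9, init_range=10.0, rng=None):
    """Minimal sketch of the base PSO of Algorithm 2.

    f   : fitness function of a parameter vector (e.g., Equation (6))
    dim : number of network parameters per particle
    c   : interval from which the constants c1, c2 are drawn
    The linear decay of the inertia from w_max to w_min is an assumed
    schedule, not necessarily the exact rule of the paper.
    """
    if rng is None:
        rng = np.random.default_rng()
    x = rng.uniform(-init_range, init_range, (m, dim))  # positions x_i
    u = rng.uniform(-1.0, 1.0, (m, dim))                # velocities u_i
    p = x.copy()                                        # best positions p_i
    p_fit = np.array([f(xi) for xi in x])
    best = p[p_fit.argmin()].copy()                     # global best p_best

    for k in range(k_max):
        omega = w_max - (w_max - w_min) * k / k_max     # inertia schedule
        for i in range(m):
            r1, r2 = rng.random(dim), rng.random(dim)   # r1, r2 in [0, 1]
            c1, c2 = rng.uniform(*c), rng.uniform(*c)   # c1, c2 in [1, 2]
            u[i] = omega * u[i] + r1 * c1 * (p[i] - x[i]) + r2 * c2 * (best - x[i])
            x[i] = x[i] + u[i]                          # new position
            fit = f(x[i])
            if fit <= p_fit[i]:                         # improved personal best
                p[i], p_fit[i] = x[i].copy(), fit
        best = p[p_fit.argmin()].copy()                 # update global best
    return best
```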
2.4. Application of Local Optimization
- For i = 1, ..., n Do
 (a) Set L_i = −F × |x*_i|
 (b) Set R_i = F × |x*_i|
- EndFor

where x* denotes the best particle located by the PSO algorithm, n the number of network parameters, and F the margin factor. The local optimization method is then started from x* inside the bounding intervals [L_i, R_i]; a sketch follows.
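The following is a minimal sketch of how these bounds could drive the subsequent local search, assuming the margin factor F = 5.0 of the parameter table and SciPy's bounded L-BFGS-B method as an illustrative stand-in for the BFGS variant used in the experiments.

```python
import numpy as np
from scipy.optimize import minimize

def bounded_local_search(train_error, x_best, F=5.0):
    """Refine the best PSO particle inside the constructed bounds.

    A sketch under assumptions: each parameter x*_i is confined to
    [-F * |x*_i|, F * |x*_i|], as in the loop above; SciPy's L-BFGS-B
    stands in for the BFGS variant used in the paper.
    """
    x_best = np.asarray(x_best, dtype=float)
    # Build the interval [L_i, R_i] for every parameter; a parameter
    # that is exactly zero yields a degenerate interval and stays fixed.
    bounds = [(-F * abs(xi), F * abs(xi)) for xi in x_best]
    result = minimize(train_error, x_best, method="L-BFGS-B", bounds=bounds)
    return result.x, result.fun
```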
3. Experiments
- UCI dataset repository, https://archive.ics.uci.edu/ml/index.php (accessed on 16 April 2023)
- Keel repository, https://sci2s.ugr.es/keel/datasets.php (accessed on 16 April 2023) [80].
3.1. Experimental Datasets
- Appendicitis dataset, a medical dataset found in [81].
- Australian dataset [82]. It is a dataset related to bank applications.
- Balance dataset [83], a cognitive dataset.
- Bands dataset, a dataset related to printing problems.
- Dermatology dataset [86], a medical dataset.
- Heart dataset [87], a dataset about heart diseases.
- Hayes-Roth dataset [88].
- HouseVotes dataset [89].
- Liverdisorder dataset [92], a medical dataset.
- Mammographic dataset [93], a medical dataset.
- Page Blocks dataset [94], related to documents.
- Parkinsons dataset [95], a dataset related to Parkinson's disease.
- Pima dataset [96], a medical dataset.
- Popfailures dataset [97], a dataset related to climate.
- Regions2 dataset, a medical dataset used in liver biopsy images of patients with hepatitis C [98].
- Saheart dataset [99], a medical dataset about heart disease.
- Segment dataset [100].
- Wdbc dataset [101], a dataset about breast tumors.
- Eeg datasets [17], medical datasets about EEG signals. The three distinct cases used here are named Z_F_S, ZO_NF_S and ZONF_S, respectively.
- Zoo dataset [104].
- Abalone dataset, for the prediction of the age of abalone [105].
- Airfoil dataset, a dataset provided by NASA [106].
- Baseball dataset, a dataset used to predict the salary of baseball players.
- BK dataset [107], used for prediction of points in a basketball game.
- BL dataset, used in machine problems.
- MB dataset [107].
- Concrete dataset [108], a civil engineering dataset.
- Dee dataset, used to estimate the price of electricity.
- Diabetes dataset, a medical dataset.
- Housing dataset [109].
- FA dataset, used to fit body fat to other measurements.
- MORTGAGE dataset, holding economic data from the USA.
- PY dataset, (Pyrimidines problem) [110].
- Quake dataset, used to approximate the strength of an earthquake.
- Treasury dataset, which contains economic data from the USA.
- Wankara dataset, a weather dataset.
3.2. Experimental Results
- A simple genetic algorithm with m chromosomes, denoted GENETIC in the experimental tables. To further improve the obtained solution, the BFGS local optimization method is applied to the best chromosome of the population when the genetic algorithm terminates.
- The Radial Basis Function (RBF) neural network [111], where the number of weights was set to 10.
- The optimization method Adam [112], as provided by the OptimLib library. This library can be downloaded freely from https://github.com/kthohr/optim (accessed on 4 April 2023).
- The RPROP optimization method [24], using the implementation provided by the FCNN library [113].
- The NEAT method (NeuroEvolution of Augmenting Topologies) [114]. The method is implemented in the EvolutionNet programming package downloaded from https://github.com/BiagioFesta/EvolutionNet (accessed on 4 April 2023).
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Sample Availability
References
- Bishop, C. Neural Networks for Pattern Recognition; Oxford University Press: Oxford, UK, 1995. [Google Scholar]
- Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control. Signals Syst. 1989, 2, 303–314. [Google Scholar] [CrossRef]
- Baldi, P.; Cranmer, K.; Faucett, T.; Sadowski, P.; Whiteson, D. Parameterized neural networks for high-energy physics. Eur. Phys. J. C 2016, 76, 1–7. [Google Scholar] [CrossRef]
- Valdas, J.J.; Bonham-Carter, G. Time dependent neural network models for detecting changes of state in complex processes: Applications in earth sciences and astronomy. Neural Netw. 2006, 19, 196–207. [Google Scholar] [CrossRef]
- Carleo, G. Solving the quantum many-body problem with artificial neural networks. Science 2017, 355, 602–606. [Google Scholar] [CrossRef] [PubMed]
- Shen, L.; Wu, J.; Yang, W. Multiscale Quantum Mechanics/Molecular Mechanics Simulations with Neural Networks. J. Chem. Theory Comput. 2016, 12, 4934–4946. [Google Scholar] [CrossRef] [PubMed]
- Manzhos, S.; Dawes, R.; Carrington, T. Neural network-based approaches for building high dimensional and quantum dynamics-friendly potential energy surfaces. Int. J. Quantum Chem. 2015, 115, 1012–1020. [Google Scholar] [CrossRef]
- Wei, J.N.; Duvenaud, D.; Aspuru-Guzik, A. Neural Networks for the Prediction of Organic Chemistry Reactions. ACS Cent. Sci. 2016, 2, 725–732. [Google Scholar] [CrossRef]
- Baskin, I.I.; Winkler, D.; Tetko, I.V. A renaissance of neural networks in drug discovery. Expert Opin. Drug Discov. 2016, 11, 785–795. [Google Scholar] [CrossRef]
- Bartzatt, R. Prediction of Novel Anti-Ebola Virus Compounds Utilizing Artificial Neural Network (ANN). Chem. Fac. 2018, 49, 16–34. [Google Scholar]
- Falat, L.; Pancikova, L. Quantitative Modelling in Economics with Advanced Artificial Neural Networks. Procedia Econ. Financ. 2015, 34, 194–201. [Google Scholar] [CrossRef]
- Namazi, M.; Shokrolahi, A.; Maharluie, M.S. Detecting and ranking cash flow risk factors via artificial neural networks technique. J. Bus. Res. 2016, 69, 1801–1806. [Google Scholar] [CrossRef]
- Tkacz, G. Neural network forecasting of Canadian GDP growth. Int. J. Forecast. 2001, 17, 57–69. [Google Scholar] [CrossRef]
- Shirvany, Y.; Hayati, M.; Moradian, R. Multilayer perceptron neural networks with novel unsupervised training method for numerical solution of the partial differential equations. Appl. Soft Comput. 2009, 9, 20–29. [Google Scholar] [CrossRef]
- Malek, A.; Beidokhti, R.S. Numerical solution for high order differential equations using a hybrid neural network—Optimization method. Appl. Math. Comput. 2006, 183, 260–271. [Google Scholar] [CrossRef]
- Topuz, A. Predicting moisture content of agricultural products using artificial neural networks. Adv. Eng. Softw. 2010, 41, 464–470. [Google Scholar] [CrossRef]
- Escamilla-García, A.; Soto-Zarazúa, G.M.; Toledano-Ayala, M.; Rivas-Araiza, E.; Gastélum-Barrios, A. Applications of Artificial Neural Networks in Greenhouse Technology and Overview for Smart Agriculture Development. Appl. Sci. 2020, 10, 3835. [Google Scholar] [CrossRef]
- Boughrara, H.; Chtourou, M.; Amar, C.B.; Chen, L. Facial expression recognition based on a mlp neural network using constructive training algorithm. Multimed Tools Appl. 2016, 75, 709–731. [Google Scholar] [CrossRef]
- Liu, H.; Tian, H.Q.; Li, Y.F.; Zhang, L. Comparison of four Adaboost algorithm based artificial neural networks in wind speed predictions. Energy Convers. Manag. 2015, 92, 67–81. [Google Scholar] [CrossRef]
- Szoplik, J. Forecasting of natural gas consumption with artificial neural networks. Energy 2015, 85, 208–220. [Google Scholar] [CrossRef]
- Hajimirzaei, B.; Navimipour, N.J. Intrusion detection for cloud computing using neural networks and artificial bee colony optimization algorithm. ICT Express 2019, 5, 56–59. [Google Scholar]
- Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536. [Google Scholar] [CrossRef]
- Chen, T.; Zhong, S. Privacy-Preserving Backpropagation Neural Network Learning. IEEE Trans. Neural Netw. 2009, 20, 1554–1564. [Google Scholar] [CrossRef] [PubMed]
- Riedmiller, M.; Braun, H. A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP algorithm. In Proceedings of the IEEE International Conference on Neural Networks, San Francisco, CA, USA, 28 March–1 April 1993; pp. 586–591. [Google Scholar]
- Pajchrowski, T.; Zawirski, K.; Nowopolski, K. Neural Speed Controller Trained Online by Means of Modified RPROP Algorithm. IEEE Trans. Ind. Inform. 2015, 11, 560–568. [Google Scholar] [CrossRef]
- Hermanto, R.P.S.; Nugroho, A. Waiting-Time Estimation in Bank Customer Queues using RPROP Neural Networks. Procedia Comput. Sci. 2018, 135, 35–42. [Google Scholar] [CrossRef]
- Robitaille, B.; Marcos, B.; Veillette, M.; Payre, G. Modified quasi-Newton methods for training neural networks. Comput. Chem. Eng. 1996, 20, 1133–1140. [Google Scholar] [CrossRef]
- Liu, Q.; Liu, J.; Sang, R.; Li, J.; Zhang, T.; Zhang, Q. Fast Neural Network Training on FPGA Using Quasi-Newton Optimization Method. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2018, 26, 1575–1579. [Google Scholar] [CrossRef]
- Yamazaki, A.; de Souto, M.C.P. Optimization of neural network weights and architectures for odor recognition using simulated annealing. In Proceedings of the 2002 International Joint Conference on Neural Networks (IJCNN'02), Honolulu, HI, USA, 12–17 May 2002; Volume 1, pp. 547–552. [Google Scholar]
- Da, Y.; Xiurun, G. An improved PSO-based ANN with simulated annealing technique. Neurocomputing 2005, 63, 527–533. [Google Scholar] [CrossRef]
- Leung, F.H.F.; Lam, H.K.; Ling, S.H.; Tam, P.K.S. Tuning of the structure and parameters of a neural network using an improved genetic algorithm. IEEE Trans. Neural Netw. 2003, 14, 79–88. [Google Scholar] [CrossRef]
- Yao, X. Evolving artificial neural networks. Proc. IEEE 1999, 87, 1423–1447. [Google Scholar]
- Zhang, C.; Shao, H.; Li, Y. Particle swarm optimisation for evolving artificial neural network. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, Nashville, TN, USA, 8–11 October 2000; pp. 2487–2490. [Google Scholar]
- Yu, J.; Wang, S.; Xi, L. Evolving artificial neural networks using an improved PSO and DPSO. Neurocomputing 2008, 71, 1054–1060. [Google Scholar] [CrossRef]
- Ilonen, J.; Kamarainen, J.K.; Lampinen, J. Differential Evolution Training Algorithm for Feed-Forward Neural Networks. Neural Process. Lett. 2003, 17, 93–105. [Google Scholar] [CrossRef]
- Rocha, M.; Cortez, P.; Neves, J. Evolution of neural networks for classification and regression. Neurocomputing 2007, 70, 2809–2816. [Google Scholar] [CrossRef]
- Aljarah, I.; Faris, H.; Mirjalili, S. Optimizing connection weights in neural networks using the whale optimization algorithm. Soft Comput. 2018, 22, 1–15. [Google Scholar] [CrossRef]
- Jalali, S.M.J.; Ahmadian, S.; Kebria, P.M.; Khosravi, A.; Lim, C.P.; Nahavandi, S. Evolving Artificial Neural Networks Using Butterfly Optimization Algorithm for Data Classification. In Neural Information Processing. ICONIP 2019; Lecture Notes in Computer Science; Gedeon, T., Wong, K., Lee, M., Eds.; Springer: Cham, Switzerland, 2019; Volume 11953. [Google Scholar]
- Ivanova, I.; Kubat, M. Initialization of neural networks by means of decision trees. Knowl. Based Syst. 1995, 8, 333–344. [Google Scholar] [CrossRef]
- Yam, J.Y.F.; Chow, T.W.S. A weight initialization method for improving training speed in feedforward neural network. Neurocomputing 2000, 30, 219–232. [Google Scholar] [CrossRef]
- Itano, F.; de Abreu de Sousa, M.A.; Del-Moral-Hernandez, E. Extending MLP ANN hyper-parameters Optimization by using Genetic Algorithm. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018; Volume 2018, pp. 1–8. [Google Scholar]
- Chumachenko, K.; Iosifidis, A.; Gabbouj, M. Feedforward neural networks initialization based on discriminant learning. Neural Netw. 2022, 146, 220–229. [Google Scholar] [CrossRef] [PubMed]
- Setiono, R. Feedforward Neural Network Construction Using Cross Validation. Neural Comput. 2001, 13, 2865–2877. [Google Scholar] [CrossRef]
- O’Neill, M.; Ryan, C. Grammatical evolution. IEEE Trans. Evol. Comput. 2001, 5, 349–358. [Google Scholar] [CrossRef]
- Tsoulos, I.G.; Gavrilis, D.; Glavas, E. Neural network construction and training using grammatical evolution. Neurocomputing 2008, 72, 269–277. [Google Scholar] [CrossRef]
- Kim, K.J.; Cho, S.B. Evolved neural networks based on cellular automata for sensory-motor controller. Neurocomputing 2006, 69, 2193–2207. [Google Scholar] [CrossRef]
- Martínez-Zarzuela, M.; Pernas, F.J.D.; Higuera, J.F.D.; Rodríguez, M.A. Fuzzy ART Neural Network Parallel Computing on the GPU. In Computational and Ambient Intelligence. IWANN 2007; Lecture Notes in Computer Science; Sandoval, F., Prieto, A., Cabestany, J., Graña, M., Eds.; Springer: Berlin/Heidelberg, Germany, 2007; Volume 4507. [Google Scholar]
- Huqqani, A.A.; Schikuta, E.; Chen, S.Y.P. Multicore and GPU Parallelization of Neural Networks for Face Recognition. Procedia Comput. Sci. 2013, 18, 349–358. [Google Scholar] [CrossRef]
- Nowlan, S.J.; Hinton, G.E. Simplifying neural networks by soft weight sharing. Neural Comput. 1992, 4, 473–493. [Google Scholar] [CrossRef]
- Kim, J.K.; Lee, M.Y.; Kim, J.Y.; Kim, B.J.; Lee, J.H. An efficient pruning and weight sharing method for neural network. In Proceedings of the 2016 IEEE International Conference on Consumer Electronics-Asia (ICCE-Asia), Seoul, Republic of Korea, 26–28 October 2016; pp. 1–2. [Google Scholar]
- Hanson, S.J.; Pratt, L.Y. Comparing biases for minimal network construction with back propagation, In Advances in Neural Information Processing Systems; Touretzky, D.S., Ed.; Morgan Kaufmann: San Mateo, CA, USA, 1989; Volume 1, pp. 177–185. [Google Scholar]
- Mozer, M.C.; Smolensky, P. Skeletonization: A technique for trimming the fat from a network via relevance assessment. In Advances in Neural Information Processing Systems; Touretzky, D.S., Ed.; Morgan Kaufmann: San Mateo, CA, USA, 1989; Volume 1, pp. 107–115. [Google Scholar]
- Augasta, M.; Kathirvalavakumar, T. Pruning algorithms of neural networks—A comparative study. Cent. Eur. J. Comput. Sci. 2013, 3, 105–115. [Google Scholar] [CrossRef]
- Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958. [Google Scholar]
- Iosifidis, A.; Tefas, A.; Pitas, I. DropELM: Fast neural network regularization with Dropout and DropConnect. Neurocomputing 2015, 162, 57–66. [Google Scholar] [CrossRef]
- Gupta, A.; Lam, S.M. Weight decay backpropagation for noisy data. Neural Netw. 1998, 11, 1127–1138. [Google Scholar] [CrossRef] [PubMed]
- Carvalho, M.; Ludermir, T.B. Particle Swarm Optimization of Feed-Forward Neural Networks with Weight Decay. In Proceedings of the 2006 Sixth International Conference on Hybrid Intelligent Systems (HIS’06), Rio de Janeiro, Brazil, 13–15 December 2006; p. 5. [Google Scholar]
- Treadgold, N.K.; Gedeon, T.D. Simulated annealing and weight decay in adaptive learning: The SARPROP algorithm. IEEE Trans. Neural Netw. 1998, 9, 662–668. [Google Scholar] [CrossRef]
- Shahjahan, M.; Murase, K. Neural network training algorithm with positive correlation. IEICE Trans. Inf. Syst. 2005, 88, 2399–2409. [Google Scholar] [CrossRef]
- Geman, S.; Bienenstock, E.; Doursat, R. Neural networks and the bias/variance dilemma. Neural Comput. 1992, 4, 1–58. [Google Scholar] [CrossRef]
- Hawkins, D.M. The Problem of Overfitting. J. Chem. Inf. Comput. Sci. 2004, 44, 1–12. [Google Scholar] [CrossRef]
- Marini, F.; Walczak, B. Particle swarm optimization (PSO). A tutorial. Chemom. Intell. Lab. Syst. 2015, 149, 153–165. [Google Scholar] [CrossRef]
- de Moura Meneses, A.A.; Machado, M.D.; Schirru, R. Particle Swarm Optimization applied to the nuclear reload problem of a Pressurized Water Reactor. Prog. Nucl. Energy 2009, 51, 319–326. [Google Scholar] [CrossRef]
- Shaw, R.; Srivastava, S. Particle swarm optimization: A new tool to invert geophysical data. Geophysics 2007, 72, F75–F83. [Google Scholar] [CrossRef]
- Ourique, C.O.; Biscaia, E.C.; Pinto, J.C. The use of particle swarm optimization for dynamical analysis in chemical processes. Comput. Chem. Eng. 2002, 26, 1783–1793. [Google Scholar] [CrossRef]
- Fang, H.; Zhou, J.; Wang, Z.; Qiu, Z.; Sun, Y.; Lin, Y.; Chen, K.; Zhou, X.; Pan, M. Hybrid method integrating machine learning and particle swarm optimization for smart chemical process operations. Front. Chem. Sci. Eng. 2022, 16, 274–287. [Google Scholar] [CrossRef]
- Wachowiak, M.P.; Smolikova, R.; Zheng, Y.; Zurada, J.M.; Elmaghraby, A.S. An approach to multimodal biomedical image registration utilizing particle swarm optimization. IEEE Trans. Evol. Comput. 2004, 8, 289–301. [Google Scholar] [CrossRef]
- Marinakis, Y.; Marinaki, M.; Dounias, G. Particle swarm optimization for pap-smear diagnosis. Expert Syst. Appl. 2008, 35, 1645–1656. [Google Scholar] [CrossRef]
- Park, J.-B.; Jeong, Y.-W.; Shin, J.-R.; Lee, K.Y. An Improved Particle Swarm Optimization for Nonconvex Economic Dispatch Problems. IEEE Trans. Power Syst. 2010, 25, 156–162. [Google Scholar] [CrossRef]
- Hwang, J.T.G.; Ding, A.A. Prediction Intervals for Artificial Neural Networks. J. Am. Stat. Assoc. 1997, 92, 748–757. [Google Scholar] [CrossRef]
- Kasiviswanathan, K.S.; Cibin, R.; Sudheer, K.P.; Chaubey, I. Constructing prediction interval for artificial neural network rainfall runoff models based on ensemble simulations. J. Hydrol. 2013, 499, 275–288. [Google Scholar] [CrossRef]
- Sodhi, S.S.; Chandra, P. Interval based Weight Initialization Method for Sigmoidal Feedforward Artificial Neural Networks. AASRI Procedia 2014, 6, 19–25. [Google Scholar] [CrossRef]
- Narkhede, M.V.; Bartakke, P.P.; Sutaone, M.S. A review on weight initialization strategies for neural networks. Artif. Intell. Rev. 2022, 55, 291–322. [Google Scholar] [CrossRef]
- Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95—International Conference on Neural Networks, Perth, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948. [Google Scholar] [CrossRef]
- Eberhart, R.C.; Shi, Y.H. Tracking and optimizing dynamic systems with particle swarms. In Proceedings of the 2001 Congress on Evolutionary Computation (IEEE Cat. No.01TH8546), Seoul, Republic of Korea, 27–30 May 2001. [Google Scholar]
- Shi, Y.H.; Eberhart, R.C. Empirical study of particle swarm optimization. In Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406), Washington, DC, USA, 6–9 July 1999. [Google Scholar]
- Shi, Y.H.; Eberhart, R.C. Experimental study of particle swarm optimization. In Proceedings of the SCI2000 Conference, Orlando, FL, USA, 30 April–2 May 2000. [Google Scholar]
- Powell, M.J.D. A Tolerant Algorithm for Linearly Constrained Optimization Calculations. Math. Program. 1989, 45, 547–566. [Google Scholar] [CrossRef]
- Fletcher, R. A new approach to variable metric algorithms. Comput. J. 1970, 13, 317–322. [Google Scholar] [CrossRef]
- Alcalá-Fdez, J.; Fernandez, A.; Luengo, J.; Derrac, J.; García, S.; Sánchez, L.; Herrera, F. KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework. J. Mult. Valued Log. Soft Comput. 2011, 17, 255–287. [Google Scholar]
- Weiss, S.M.; Kulikowski, C.A. Computer Systems That Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems; Morgan Kaufmann Publishers Inc.: San Mateo CA, USA, 1991. [Google Scholar]
- Quinlan, J.R. Simplifying Decision Trees. Int. J. Man Mach. Stud. 1987, 27, 221–234. [Google Scholar] [CrossRef]
- Shultz, T.; Mareschal, D.; Schmidt, W. Modeling Cognitive Development on Balance Scale Phenomena. Mach. Learn. 1994, 16, 59–88. [Google Scholar] [CrossRef]
- Zhou, Z.H. NeC4.5: Neural ensemble based C4.5. IEEE Trans. Knowl. Data Eng. 2004, 16, 770–773. [Google Scholar] [CrossRef]
- Setiono, R.; Leow, W.K. FERNN: An Algorithm for Fast Extraction of Rules from Neural Networks. Appl. Intell. 2000, 12, 15–25. [Google Scholar] [CrossRef]
- Demiroz, G.; Govenir, H.A.; Ilter, N. Learning Differential Diagnosis of Erythemato-Squamous Diseases using Voting Feature Intervals. Artif. Intell. Med. 1998, 13, 147–165. [Google Scholar]
- Kononenko, I.; Šimec, E.; Robnik-Šikonja, M. Overcoming the Myopia of Inductive Learning Algorithms with RELIEFF. Appl. Intell. 1997, 7, 39–55. [Google Scholar] [CrossRef]
- Hayes-Roth, B.; Hayes-Roth, F. Concept learning and the recognition and classification of exemplars. J. Verbal Learning Verbal Behav. 1977, 16, 321–338. [Google Scholar] [CrossRef]
- French, R.M.; Chater, N. Using noise to compute error surfaces in connectionist networks: A novel means of reducing catastrophic forgetting. Neural Comput. 2002, 14, 1755–1769. [Google Scholar] [CrossRef]
- Dy, J.G.; Brodley, C.E. Feature Selection for Unsupervised Learning. J. Mach. Learn. Res. 2004, 5, 845–889. [Google Scholar]
- Perantonis, S.J.; Virvilis, V. Input Feature Extraction for Multilayered Perceptrons Using Supervised Principal Component Analysis. Neural Process. Lett. 1999, 10, 243–252. [Google Scholar] [CrossRef]
- Garcke, J.; Griebel, M. Classification with sparse grids using simplicial basis functions. Intell. Data Anal. 2002, 6, 483–502. [Google Scholar] [CrossRef]
- Elter, M.; Schulz-Wendtland, R.; Wittenberg, T. The prediction of breast cancer biopsy outcomes using two CAD approaches that both emphasize an intelligible decision process. Med. Phys. 2007, 34, 4164–4172. [Google Scholar] [CrossRef]
- Esposito, F.; Malerba, D.; Semeraro, G. Multistrategy Learning for Document Recognition. Appl. Artif. Intell. 1994, 8, 33–84. [Google Scholar] [CrossRef]
- Little, M.A.; McSharry, P.E.; Hunter, E.J.; Spielman, J.; Ramig, L.O. Suitability of dysphonia measurements for telemonitoring of Parkinson’s disease. IEEE Trans. Biomed. Eng. 2009, 56, 1015–1022. [Google Scholar] [CrossRef] [PubMed]
- Smith, J.W.; Everhart, J.E.; Dickson, W.C.; Knowler, W.C.; Johannes, R.S. Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Symposium on Computer Applications and Medical Care IEEE Computer Society Press, Minneapolis, MN, USA, 8–10 June 1988; pp. 261–265. [Google Scholar]
- Lucas, D.D.; Klein, R.; Tannahill, J.; Ivanova, D.; Brandon, S.; Domyancic, D.; Zhang, Y. Failure analysis of parameter-induced simulation crashes in climate models. Geosci. Model Dev. 2013, 6, 1157–1171. [Google Scholar] [CrossRef]
- Giannakeas, N.; Tsipouras, M.G.; Tzallas, A.T.; Kyriakidi, K.; Tsianou, Z.E.; Manousou, P.; Hall, A.; Karvounis, E.C.; Tsianos, V.; Tsianos, E. A clustering based method for collagen proportional area extraction in liver biopsy images. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, Milan, Italy, 25–29 August 2015; pp. 3097–3100. [Google Scholar]
- Hastie, T.; Tibshirani, R. Non-parametric logistic and proportional odds regression. JRSS-C (Appl. Stat.) 1987, 36, 260–276. [Google Scholar] [CrossRef]
- Dash, M.; Liu, H.; Scheuermann, P.; Tan, K.L. Fast hierarchical clustering and its validation. Data Knowl. Eng. 2003, 44, 109–138. [Google Scholar] [CrossRef]
- Wolberg, W.H.; Mangasarian, O.L. Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proc. Natl. Acad. Sci. USA 1990, 87, 9193–9196. [Google Scholar] [CrossRef] [PubMed]
- Raymer, M.; Doom, T.E.; Kuhn, L.A.; Punch, W.F. Knowledge discovery in medical and biological datasets using a hybrid Bayes classifier/evolutionary algorithm. IEEE Trans. Syst. Man Cybern. Part B 2003, 33, 802–813. [Google Scholar] [CrossRef]
- Zhong, P.; Fukushima, M. Regularized nonsmooth Newton method for multi-class support vector machines. Optim. Methods Softw. 2007, 22, 225–236. [Google Scholar] [CrossRef]
- Koivisto, M.; Sood, K. Exact Bayesian Structure Discovery in Bayesian Networks. J. Mach. Learn. Res. 2004, 5, 549–573. [Google Scholar]
- Nash, W.J.; Sellers, T.L.; Talbot, S.R.; Cawthor, A.J.; Ford, W.B. The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait, Sea Fisheries Division; Technical Report No. 48 (ISSN 1034-3288); Department of Primary Industry and Fisheries, Tasmania: Hobart, Australia, 1994. [Google Scholar]
- Brooks, T.F.; Pope, D.S.; Marcolini, A.M. Airfoil Self-Noise and Prediction; Technical Report, NASA RP-1218; National Aeronautics and Space Administration: Washington, DC, USA, 1989. [Google Scholar]
- Simonoff, J.S. Smoothing Methods in Statistics; Springer: Berlin/Heidelberg, Germany, 1996. [Google Scholar]
- Yeh, I.C. Modeling of strength of high performance concrete using artificial neural networks. Cem. Concr. Res. 1998, 28, 1797–1808. [Google Scholar] [CrossRef]
- Harrison, D.; Rubinfeld, D.L. Hedonic prices and the demand for clean air. J. Environ. Econ. Manag. 1978, 5, 81–102. [Google Scholar] [CrossRef]
- King, R.D.; Muggleton, S.; Lewis, R.A.; Sternberg, M.J. Drug design by machine learning: The use of inductive logic programming to model the structure-activity relationships of trimethoprim analogues binding to dihydrofolate reductase. Proc. Nat. Acad. Sci. USA 1992, 89, 11322–11326. [Google Scholar] [CrossRef] [PubMed]
- Park, J.; Sandberg, I.W. Universal Approximation Using Radial-Basis-Function Networks. Neural Comput. 1991, 3, 246–257. [Google Scholar] [CrossRef]
- Kingma, D.P.; Ba, J.L. ADAM: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA, 7–9 May 2015; pp. 1–15. [Google Scholar]
- Klima, G. Fast Compressed Neural Networks. Available online: http://fcnn.sourceforge.net/ (accessed on 14 April 2023).
- Stanley, K.O.; Miikkulainen, R. Evolving Neural Networks through Augmenting Topologies. Evol. Comput. 2002, 10, 99–127. [Google Scholar] [CrossRef] [PubMed]
Parameter | Meaning | Value
---|---|---
m | Number of particles or chromosomes | 200
k_max | Maximum number of iterations allowed | 200
ω_min | Minimum value for inertia | 0.4
ω_max | Maximum value for inertia | 0.9
H | Number of hidden nodes (weights) | 10
λ | Penalty factor | 100.0
L | The limit for the bounding function B(w, L) | 10.0
F | The margin factor | 5.0
Dataset | Nclass | Genetic | RBF | ADAM | RPROP | NEAT | Proposed |
---|---|---|---|---|---|---|---|
Appendicitis | 2 | 18.10% | 12.23% | 16.50% | 16.30% | 17.20% | 16.97% |
Australian | 2 | 32.21% | 34.89% | 35.65% | 36.12% | 31.98% | 26.96% |
Balance | 3 | 8.97% | 33.42% | 7.87% | 8.81% | 23.14% | 7.52% |
Bands | 2 | 35.75% | 37.22% | 36.25% | 36.32% | 34.30% | 35.77% |
Cleveland | 5 | 51.60% | 67.10% | 67.55% | 61.41% | 53.44% | 48.40% |
Dermatology | 6 | 30.58% | 62.34% | 26.14% | 15.12% | 32.43% | 14.30% |
Hayes Roth | 3 | 56.18% | 64.36% | 59.70% | 37.46% | 50.15% | 36.33% |
Heart | 2 | 28.34% | 31.20% | 38.53% | 30.51% | 39.27% | 18.99% |
HouseVotes | 2 | 6.62% | 6.13% | 7.48% | 6.04% | 10.89% | 7.10% |
Ionosphere | 2 | 15.14% | 16.22% | 16.64% | 13.65% | 19.67% | 13.15% |
Liverdisorder | 2 | 31.11% | 30.84% | 41.53% | 40.26% | 30.67% | 32.07% |
Lymography | 4 | 23.26% | 25.31% | 29.26% | 24.67% | 33.70% | 27.05% |
Mammographic | 2 | 19.88% | 21.38% | 46.25% | 18.46% | 22.85% | 17.37% |
PageBlocks | 5 | 8.06% | 10.09% | 7.93% | 7.82% | 10.22% | 6.47% |
Parkinsons | 2 | 18.05% | 17.42% | 24.06% | 22.28% | 18.56% | 14.60% |
Pima | 2 | 32.19% | 25.78% | 34.85% | 34.27% | 34.51% | 26.34% |
Popfailures | 2 | 5.94% | 7.04% | 5.18% | 4.81% | 7.05% | 5.27% |
Regions2 | 5 | 29.39% | 38.29% | 29.85% | 27.53% | 33.23% | 26.29% |
Saheart | 2 | 34.86% | 32.19% | 34.04% | 34.90% | 34.51% | 32.49% |
Segment | 7 | 57.72% | 59.68% | 49.75% | 52.14% | 66.72% | 18.99% |
Wdbc | 2 | 8.56% | 7.27% | 35.35% | 21.57% | 12.88% | 6.01% |
Wine | 3 | 19.20% | 31.41% | 29.40% | 30.73% | 25.43% | 10.92% |
Z_F_S | 3 | 10.73% | 13.16% | 47.81% | 29.28% | 38.41% | 8.55% |
ZO_NF_S | 3 | 8.41% | 9.02% | 47.43% | 6.43% | 43.75% | 7.11% |
ZONF_S | 2 | 2.60% | 4.03% | 11.99% | 27.27% | 5.44% | 2.61% |
ZOO | 7 | 16.67% | 21.93% | 14.13% | 15.47% | 20.27% | 5.80% |
AVERAGE | | 23.47% | 27.69% | 30.81% | 25.37% | 28.87% | 18.21%
Dataset | Range | Genetic | RBF | ADAM | RPROP | NEAT | Proposed |
---|---|---|---|---|---|---|---|
ABALONE | 19.0 | 7.17 | 7.37 | 4.30 | 4.55 | 9.88 | 4.34 |
AIRFOIL | 0.41 | 0.003 | 0.27 | 0.005 | 0.002 | 0.067 | 0.002 |
BASEBALL | 42.50 | 103.60 | 93.02 | 77.90 | 92.05 | 100.39 | 58.78 |
BK | 0.32 | 0.027 | 0.02 | 0.03 | 1.599 | 0.15 | 0.03 |
BL | 0.68 | 5.74 | 0.01 | 0.28 | 4.38 | 0.05 | 0.02 |
CONCRETE | 0.77 | 0.0099 | 0.011 | 0.078 | 0.0086 | 0.081 | 0.003 |
DEE | 4.12 | 1.013 | 0.17 | 0.63 | 0.608 | 1.512 | 0.23 |
DIABETES | 0.90 | 19.86 | 0.49 | 3.03 | 1.11 | 4.25 | 0.65 |
HOUSING | 43.0 | 43.26 | 57.68 | 80.20 | 74.38 | 56.49 | 21.85 |
FA | 0.51 | 1.95 | 0.02 | 0.11 | 0.14 | 0.19 | 0.02 |
MB | 0.10 | 3.39 | 2.16 | 0.06 | 0.055 | 0.061 | 0.051 |
MORTGAGE | 12.74 | 2.41 | 1.45 | 9.24 | 9.19 | 14.11 | 0.31 |
PY | 0.84 | 105.41 | 0.02 | 0.09 | 0.039 | 0.075 | 0.08 |
QUAKE | 0.60 | 0.04 | 0.071 | 0.06 | 0.041 | 0.298 | 0.044 |
TREASURY | 15.54 | 2.929 | 2.02 | 11.16 | 10.88 | 15.52 | 0.34 |
WANKARA | 0.605 | 0.012 | 0.001 | 0.02 | 0.0003 | 0.005 | 0.0002 |
AVERAGE | | 18.55 | 10.30 | 11.70 | 12.44 | 12.70 | 5.42
Dataset | Precision (Genetic) | Precision (Proposed) | Recall (Genetic) | Recall (Proposed)
---|---|---|---|---
PARKINSONS | 0.77 | 0.82 | 0.68 | 0.77 |
WINE | 0.75 | 0.90 | 0.79 | 0.89 |
HEART | 0.73 | 0.81 | 0.72 | 0.80 |
Dataset | ADAM | PROPOSED_BFGS | PROPOSED_ADAM |
---|---|---|---|
Appendicitis | 16.50% | 16.97% | 16.03% |
Australian | 35.65% | 26.96% | 32.45% |
Balance | 7.87% | 7.52% | 7.59% |
Bands | 36.25% | 35.77% | 35.72% |
Cleveland | 67.55% | 48.40% | 47.62% |
Dermatology | 26.14% | 14.30% | 19.78% |
Hayes Roth | 59.70% | 36.33% | 36.87% |
Heart | 38.53% | 18.99% | 18.86% |
HouseVotes | 7.48% | 7.10% | 4.20% |
Ionosphere | 16.64% | 13.15% | 9.32% |
Liverdisorder | 41.53% | 32.07% | 32.24% |
Lymography | 29.26% | 27.05% | 27.64% |
Mammographic | 46.25% | 17.37% | 21.38% |
PageBlocks | 7.93% | 6.47% | 7.33% |
Parkinsons | 24.06% | 14.60% | 16.77% |
Pima | 34.85% | 26.34% | 30.19% |
Popfailures | 5.18% | 5.27% | 4.56% |
Regions2 | 29.85% | 26.29% | 26.14% |
Saheart | 34.04% | 32.49% | 32.55% |
Segment | 49.75% | 18.99% | 37.51% |
Wdbc | 35.35% | 6.01% | 7.40% |
Wine | 29.40% | 10.92% | 12.80% |
Z_F_S | 47.81% | 8.55% | 9.76% |
ZO_NF_S | 47.43% | 7.11% | 8.87% |
ZONF_S | 11.99% | 2.61% | 2.68% |
ZOO | 14.13% | 5.80% | 6.47% |
AVERAGE | 30.81% | 18.21% | 19.72% |
Dataset | ADAM | PROPOSED_BFGS | PROPOSED_ADAM |
---|---|---|---|
ABALONE | 4.30 | 4.34 | 4.49 |
AIRFOIL | 0.005 | 0.002 | 0.003 |
BASEBALL | 77.90 | 58.78 | 72.43 |
BK | 0.03 | 0.03 | 0.02 |
BL | 0.28 | 0.02 | 0.01 |
CONCRETE | 0.078 | 0.003 | 0.004 |
DEE | 0.63 | 0.23 | 0.25 |
DIABETES | 3.03 | 0.65 | 0.44 |
HOUSING | 80.20 | 21.85 | 34.22 |
FA | 0.11 | 0.02 | 0.02 |
MB | 0.06 | 0.051 | 0.047 |
MORTGAGE | 9.24 | 0.31 | 1.83 |
PY | 0.09 | 0.08 | 0.02 |
QUAKE | 0.06 | 0.044 | 0.039 |
TREASURY | 11.16 | 0.34 | 2.36 |
WANKARA | 0.02 | 0.0002 | 0.0002 |
AVERAGE | 11.70 | 5.42 | 7.26 |