Constructing the Bounds for Neural Network Training Using Grammatical Evolution
Abstract
1. Introduction
2. Method Description
2.1. Grammatical Evolution
- N is the set of the non-terminal symbols. Every symbol in N has a series of production rules, used to produce terminal symbols.
- T is the set of terminal symbols.
- S denotes the start symbol of the grammar, with $S \in N$.
- P is the set of production rules, used to create terminal symbols from non-terminal symbols. These rules are in the form $A \rightarrow a$ or $A \rightarrow aB$, where $A, B \in N$ and $a \in T$.
- Obtain the next element V from the current chromosome.
- Select the next production rule according to Rule = V mod R, where R is the total number of production rules for the current non-terminal symbol. A short sketch of this mapping is given below.
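The mapping can be illustrated with a short Python sketch. The grammar used here is reconstructed from the derivation example of Table 1, since Figure 1 is not reproduced in this text; the exact ordering of the production rules is therefore an assumption.

```python
import re

# Grammar reconstructed from the derivation example of Table 1
# (Figure 1 is not reproduced here); the rule ordering is an assumption.
GRAMMAR = {
    "<expr>":     ["(<xlist>,<lcommand>,<rcommand>)", "<expr>,<expr>"],
    "<xlist>":    [f"x{i}" for i in range(1, 9)],  # x1 ... x8
    "<lcommand>": ["NOTHING", "EXPAND", "DIVIDE"],
    "<rcommand>": ["NOTHING", "EXPAND", "DIVIDE"],
}
NONTERMINAL = re.compile(r"<[a-z]+>")

def decode(chromosome):
    """Map a list of integers to a program string via Rule = V mod R."""
    text = "<expr>"
    for v in chromosome:
        match = NONTERMINAL.search(text)      # leftmost non-terminal
        if match is None:
            break                             # fully expanded program
        rules = GRAMMAR[match.group(0)]
        chosen = rules[v % len(rules)]        # Rule = V mod R
        text = text[:match.start()] + chosen + text[match.end():]
    return text

print(decode([8, 6, 4, 15]))  # -> (x7,EXPAND,NOTHING)
```

Note that, as in standard grammatical evolution, a full implementation would wrap around to the start of the chromosome if its genes are exhausted before all non-terminal symbols have been expanded.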
1. NOTHING. This command means that no action takes place.
2. EXPAND. With this command, the corresponding end of the value field is extended by 50% of the width of the field.
3. DIVIDE. With this command, the corresponding end of the value field is shrunk by 50% of the width of the field. A sketch of these operations on an interval is given below.
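These commands admit a direct implementation on an interval of values. The following sketch assumes the semantics stated above; the function name and the representation of a bound as a (left, right) pair are illustrative choices, not taken from the paper.

```python
# A sketch of the three commands applied to one end of a value field
# [left, right]. Assumed semantics: EXPAND moves the chosen end outward
# by 50% of the field width, DIVIDE moves it inward by the same amount,
# and NOTHING leaves the field unchanged.
def apply_command(left, right, command, end):
    width = right - left
    if command == "EXPAND":
        if end == "left":
            left -= 0.5 * width
        else:
            right += 0.5 * width
    elif command == "DIVIDE":
        if end == "left":
            left += 0.5 * width
        else:
            right -= 0.5 * width
    return left, right  # "NOTHING" falls through unchanged

# Example: (x7, EXPAND, NOTHING) widens the left end of the bound of x7.
print(apply_command(-10.0, 10.0, "EXPAND", "left"))  # (-20.0, 10.0)
```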
2.2. The First Phase of the Proposed Method
1. Set $N_c$ as the number of chromosomes for the grammatical evolution.
2. Set H as the number of weights for the neural network.
3. Set $N_g$ as the maximum number of allowed generations.
4. Set $p_s$ as the selection rate, with $p_s \le 1$.
5. Set $p_m$ as the mutation rate, with $p_m \le 1$.
6. Set $N_T$ as the number of randomly created neural networks, which will be used in the fitness calculation.
7. Initialize randomly the chromosomes. Every chromosome is a set of integer numbers used to produce valid programs through grammatical evolution and the associated grammar of Figure 1.
8. Set $f^{*} = \infty$ as the best discovered fitness. For this algorithm, we consider the fitness $f_g$ of any given chromosome g as an interval $f_g = \left[ f_{g,\min}, f_{g,\max} \right]$.
9. Set iter = 0.
10. For $i = 1, \ldots, N_c$ do
    (a) Create for the chromosome $g_i$ the corresponding program using the grammar of Figure 1.
    (b) Apply the program in order to produce the bounds $\left[ L_i, R_i \right]$ for the parameters of the neural network.
    (c) Set $f_{i,\min} = \infty$ and $f_{i,\max} = -\infty$.
    (d) For $j = 1, \ldots, N_T$ do
        i. Create randomly $w_j \in \left[ L_i, R_i \right]$ as a set of values for the parameters of the neural network.
        ii. Calculate the associated training error $E_j$.
        iii. If $E_j < f_{i,\min}$ then $f_{i,\min} = E_j$.
        iv. If $E_j > f_{i,\max}$ then $f_{i,\max} = E_j$.
    (e) EndFor
    (f) Set $f_i = \left[ f_{i,\min}, f_{i,\max} \right]$ as the fitness value for the chromosome $g_i$.
11. EndFor
12. Apply the selection procedure. Firstly, the chromosomes are sorted according to their fitness values. Since fitness is considered an interval, a fitness comparison function is required. For this reason, the operator $L^{*}$ is used to compare two fitness values $f_a = \left[ a_{\min}, a_{\max} \right]$ and $f_b = \left[ b_{\min}, b_{\max} \right]$. In practice this means that the fitness value $f_a$ is considered smaller than $f_b$ if $a_{\min} < b_{\min}$, or if $a_{\min} = b_{\min}$ and $a_{\max} < b_{\max}$. The chromosomes with the lowest fitness values, their number determined by the selection rate $p_s$, are copied intact to the next generation. The remaining chromosomes are substituted by chromosomes produced by the crossover procedure. During the selection process, for every new offspring, two chromosomes are selected as parents from the population using the well-known procedure of tournament selection.
13. Apply the crossover procedure. For each pair of selected parents, two new chromosomes are created using one-point crossover, graphically shown in Figure 2.
14. Apply the mutation procedure. For each element of every chromosome, alter the corresponding element with probability $p_m$.
15. Set iter = iter + 1.
16. If $iter \le N_g$, go to step 10; otherwise, terminate. A sketch of the interval fitness evaluation and of the comparison operator is given after this algorithm.
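The core of the first phase is the interval fitness of step 10 and the comparison of step 12. The following Python sketch restates both under the reconstruction given above; train_error is a hypothetical placeholder for the sum-of-squares error of the network on the training set, and the lexicographic form of the comparison is an assumption consistent with the description in step 12.

```python
import numpy as np

# Interval fitness of a candidate bound [L, R]: sample N_T random
# parameter vectors inside the bound and keep the smallest and largest
# training errors, giving the interval f = [f_min, f_max].
def interval_fitness(L, R, N_T, train_error, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    errors = [train_error(rng.uniform(L, R)) for _ in range(N_T)]
    return min(errors), max(errors)

# Assumed lexicographic comparison for step 12: f_a is "smaller" than
# f_b when its lower end is smaller, with the upper end breaking ties.
def interval_less(fa, fb):
    return fa[0] < fb[0] or (fa[0] == fb[0] and fa[1] < fb[1])
```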
2.3. The Second Phase of the Proposed Method
1. Initialization Step
    (a) Set $N_c$ as the number of chromosomes that participate in the genetic algorithm.
    (b) Set $N_g$ as the maximum number of allowed iterations.
    (c) Set H as the number of weights for the neural network.
    (d) Obtain the best interval S from the first phase of the method (Section 2.2).
    (e) Initialize the chromosomes inside the interval S using a uniform distribution.
    (f) Set $p_s$ as the selection rate, with $p_s \le 1$.
    (g) Set $p_m$ as the mutation rate, with $p_m \le 1$.
    (h) Set iter = 0.
2. Fitness Calculation Step
    (a) For $i = 1, \ldots, N_c$ do
        i. Calculate the fitness $f_i$ of chromosome $g_i$ as the training error $f_i = \sum_{j=1}^{M} \left( N\left( x_j, g_i \right) - y_j \right)^2$, where $\left( x_j, y_j \right),\ j = 1, \ldots, M$ are the patterns of the training set and $N\left( x, g \right)$ denotes the output of the neural network with weight vector g for the input pattern x.
    (b) EndFor
3. Genetic Operations Step
    (a) Selection procedure: Initially, the chromosomes are sorted according to their fitness values. The chromosomes with the lowest fitness values, their number determined by the selection rate $p_s$, are copied to the next generation. The remaining chromosomes are substituted by chromosomes produced by the crossover procedure. During the selection process, for every new offspring, two chromosomes are selected as parents from the population using the well-known procedure of tournament selection.
    (b) Crossover procedure: For each pair of selected parents $\left( z, w \right)$, two new chromosomes $\tilde{z}$ and $\tilde{w}$ are constructed using the integrated crossover rule of Kaelo and Ali [55]: $\tilde{z}_i = a_i z_i + \left( 1 - a_i \right) w_i$ and $\tilde{w}_i = a_i w_i + \left( 1 - a_i \right) z_i$, where $a_i$ is a random number. A sketch of this operation is given after the list.
    (c) Mutation procedure: For each element of every chromosome, alter the corresponding element with probability $p_m$.
4. Termination Check Step
    (a) Set iter = iter + 1.
    (b) If $iter \le N_g$, go to step 2; otherwise, apply a local search procedure to the best chromosome of the population. In the current work, the BFGS variant of Powell [56] was used.
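Step 3(b) can be made concrete with a short sketch. The equations above follow the integrated crossover rule of Kaelo and Ali [55]; the range [-0.5, 1.5] for the random factors and the clipping of the offspring back into the interval S are assumptions of this sketch rather than details taken verbatim from the paper.

```python
import numpy as np

# Phase-two crossover: each offspring element is a random affine
# combination of the parent elements, clipped back into the best
# interval S = [S_low, S_high] found by the first phase.
def crossover(z, w, S_low, S_high, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    a = rng.uniform(-0.5, 1.5, size=len(z))   # random mixing factors
    child1 = np.clip(a * z + (1.0 - a) * w, S_low, S_high)
    child2 = np.clip(a * w + (1.0 - a) * z, S_low, S_high)
    return child1, child2
```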
3. Experiments
1. UCI dataset repository, https://archive.ics.uci.edu/ml/index.php [57] (accessed on 4 November 2023).
2. Keel repository, https://sci2s.ugr.es/keel/datasets.php [58] (accessed on 4 November 2023).
3. StatLib repository, ftp://lib.stat.cmu.edu/datasets/index.html (accessed on 4 November 2023).
3.1. Experimental Datasets
1. Appendicitis dataset, a medical purpose dataset, suggested in [59].
2. Australian dataset [60], a dataset related to credit card transactions.
3. Balance dataset [61], related to psychological states.
4. Cleveland dataset, a medical dataset related to heart disease [62,63].
5. Dermatology dataset [64], a medical dataset related to erythemato-squamous diseases.
6. Heart dataset [65], a medical dataset related to heart diseases.
7. Hayes-Roth dataset [66].
8. HouseVotes dataset [67], related to the congressional voting records of the U.S. House of Representatives.
9. Ionosphere dataset, used to classify radar returns from the ionosphere [68,69].
10. Liverdisorder dataset [70], a medical dataset related to liver disorders.
11. Mammographic dataset [71], used to identify breast tumors.
12. Parkinsons dataset, a medical dataset related to Parkinson's disease (PD) [72].
13. Pima dataset [73], used to detect the presence of diabetes.
14. Popfailures dataset [74], a dataset related to climate measurements.
15. Regions2 dataset, a medical dataset related to hepatitis C [75].
16. Saheart dataset [76], a medical dataset related to heart diseases.
17. Segment dataset [77], an image processing dataset.
18. Wdbc dataset [78], a medical dataset related to breast tumors.
19. Wine dataset, related to the chemical analysis of wines [79,80].
20. EEG dataset, a medical dataset with EEG measurements [81]. Three different cases from this dataset are used here, denoted as Z_F_S, ZO_NF_S, and ZONF_S.
21. Zoo dataset [82], used to classify animals.
1. Abalone dataset [83], used to predict the age of abalone from physical measurements.
2. Airfoil dataset, derived from NASA [84].
3. Baseball dataset, used to estimate the salary of baseball players.
4. BK dataset [85], used to predict the points scored in a basketball game.
5. BL dataset, an electrical engineering dataset.
6. Concrete dataset [86].
7. Dee dataset, used to predict the price of electricity.
8. Diabetes dataset, a medical dataset.
9. Housing dataset, provided in [87].
10. FA dataset, used to predict body fat.
11. MB dataset, available from Smoothing Methods in Statistics [85].
12. MORTGAGE dataset, related to economic data from the USA.
13. PY dataset (Pyrimidines problem) [88].
14. Quake dataset, used to predict the strength of earthquakes.
15. Treasury dataset, related to economic data from the USA.
16. Wankara dataset, a dataset related to weather.
3.2. Experimental Results
1. A genetic algorithm, where the parameters have the values of Table 2, used to train a neural network with H hidden nodes. The results for this method are denoted by the label GENETIC in the experimental tables.
2. The Adam optimization method, used to train a neural network with H hidden nodes. The column ADAM denotes the results for this method.
3. The RPROP method, used to train a neural network with H hidden nodes. The corresponding results are denoted by RPROP in the relevant tables.
4. The NEAT method (NeuroEvolution of Augmenting Topologies) [89], where the maximum number of allowed generations is the same as in the case of the genetic algorithm.
5. The proposed method (denoted as PROPOSED), used with the experimental settings shown in Table 2.
6. An extra line was also added to the experimental tables under the title AVERAGE. This line holds the average classification or regression error over all datasets.
4. Conclusions
1. There is a need for more efficient techniques for initializing the value space for artificial neural network parameters. In the present work, the optimal result from the execution of a limited number of steps of a genetic algorithm was used as an initial estimate of the value interval.
2. In the present work, the same techniques as in any whole-chromosome genetic algorithm were used to perform the crossover and mutation operations. Research could be conducted at this point to find crossover and mutation techniques more focused on this particular problem.
3. The present technique consists of two phases, in each of which a problem-adapted genetic algorithm is executed. This means that significant computational time is required to complete the algorithm. However, since genetic algorithms are inherently parallelizable, modern parallel programming techniques could be employed here.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Bishop, C. Neural Networks for Pattern Recognition; Oxford University Press: Oxford, UK, 1995. [Google Scholar]
- Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control. Signals Syst. 1989, 2, 303–314. [Google Scholar] [CrossRef]
- Baldi, P.; Cranmer, K.; Faucett, T.; Sadowski, P.; Whiteson, D. Parameterized neural networks for high-energy physics. Eur. Phys. J. C 2016, 76, 235. [Google Scholar] [CrossRef]
- Valdas, J.J.; Bonham-Carter, G. Time dependent neural network models for detecting changes of state in complex processes: Applications in earth sciences and astronomy. Neural Netw. 2006, 19, 196–207. [Google Scholar] [CrossRef] [PubMed]
- Carleo, G.; Troyer, M. Solving the quantum many-body problem with artificial neural networks. Science 2017, 355, 602–606. [Google Scholar] [CrossRef]
- Shirvany, Y.; Hayati, M.; Moradian, R. Multilayer perceptron neural networks with novel unsupervised training method for numerical solution of the partial differential equations. Appl. Soft Comput. 2009, 9, 20–29. [Google Scholar] [CrossRef]
- Malek, A.; Beidokhti, R.S. Numerical solution for high order differential equations using a hybrid neural network—Optimization method. Appl. Math. Comput. 2006, 183, 260–271. [Google Scholar] [CrossRef]
- Topuz, A. Predicting moisture content of agricultural products using artificial neural networks. Adv. Eng. Softw. 2010, 41, 464–470. [Google Scholar] [CrossRef]
- Escamilla-García, A.; Soto-Zarazúa, G.M.; Toledano-Ayala, M.; Rivas-Araiza, E.; Gastélum-Barrios, A. Applications of Artificial Neural Networks in Greenhouse Technology and Overview for Smart Agriculture Development. Appl. Sci. 2020, 10, 3835. [Google Scholar] [CrossRef]
- Shen, L.; Wu, J.; Yang, W. Multiscale Quantum Mechanics/Molecular Mechanics Simulations with Neural Networks. J. Chem. Theory Comput. 2016, 12, 4934–4946. [Google Scholar] [CrossRef]
- Manzhos, S.; Dawes, R.; Carrington, T. Neural network-based approaches for building high dimensional and quantum dynamics-friendly potential energy surfaces. Int. J. Quantum Chem. 2015, 115, 1012–1020. [Google Scholar] [CrossRef]
- Wei, J.N.; Duvenaud, D.; Aspuru-Guzik, A. Neural Networks for the Prediction of Organic Chemistry Reactions. ACS Cent. Sci. 2016, 2, 725–732. [Google Scholar] [CrossRef] [PubMed]
- Falat, L.; Pancikova, L. Quantitative Modelling in Economics with Advanced Artificial Neural Networks. Proc. Econ. Financ. 2015, 34, 194–201. [Google Scholar] [CrossRef]
- Namazi, M.; Shokrolahi, A.; Sadeghzadeh Maharluie, M. Detecting and ranking cash flow risk factors via artificial neural networks technique. J. Bus. Res. 2016, 69, 1801–1806. [Google Scholar] [CrossRef]
- Tkacz, G. Neural network forecasting of Canadian GDP growth. Int. J. Forecast. 2001, 17, 57–69. [Google Scholar] [CrossRef]
- Baskin, I.I.; Winkler, D.; Tetko, I.V. A renaissance of neural networks in drug discovery. Expert Opin. Drug Discov. 2016, 11, 785–795. [Google Scholar] [CrossRef]
- Bartzatt, R. Prediction of Novel Anti-Ebola Virus Compounds Utilizing Artificial Neural Network (ANN). World J. Pharm. Res. 2018, 7, 16. [Google Scholar]
- Tsoulos, I.G.; Gavrilis, D.; Glavas, E. Neural network construction and training using grammatical evolution. Neurocomputing 2008, 72, 269–277. [Google Scholar] [CrossRef]
- Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536. [Google Scholar] [CrossRef]
- Riedmiller, M.; Braun, H. A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP algorithm. In Proceedings of the IEEE International Conference on Neural Networks, San Francisco, CA, USA, 28 March–1 April 1993; pp. 586–591. [Google Scholar]
- Kingma, D.P.; Ba, J.L. ADAM: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA, 7–9 May 2015; pp. 1–15. [Google Scholar]
- Robitaille, B.; Marcos, B.; Veillette, M.; Payre, G. Modified quasi-Newton methods for training neural networks. Comput. Chem. Eng. 1996, 20, 1133–1140. [Google Scholar] [CrossRef]
- Sexton, R.S.; Alidaee, B.; Dorsey, R.E.; Johnson, J.D. Global optimization for artificial neural networks: A tabu search application. Eur. J. Oper. Res. 1998, 106, 570–584. [Google Scholar] [CrossRef]
- Yamazaki, A.; de Souto, M.C.P.; Ludermir, T.B. Optimization of neural network weights and architectures for odor recognition using simulated annealing. In Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN’02, Honolulu, HI, USA, 12–17 May 2002; Volume 1, pp. 547–552. [Google Scholar]
- Leung, F.H.F.; Lam, H.K.; Ling, S.H.; Tam, P.K. Tuning of the structure and parameters of a neural network using an improved genetic algorithm. IEEE Trans. Neural Netw. 2003, 14, 79–88. [Google Scholar] [CrossRef]
- Zhang, C.; Shao, H.; Li, Y. Particle swarm optimisation for evolving artificial neural network. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, Nashville, TN, USA, 8–11 October 2000; IEEE: Toulouse, France, 2000; pp. 2487–2490. [Google Scholar]
- Ilonen, J.; Kamarainen, J.K.; Lampinen, J. Differential Evolution Training Algorithm for Feed-Forward Neural Networks. Neural Process. Lett. 2003, 17, 93–105. [Google Scholar]
- Salama, K.M.; Abdelbar, A.M. Learning neural network structures with ant colony algorithms. Swarm Intell. 2015, 9, 229–265. [Google Scholar] [CrossRef]
- Zhang, J.R.; Zhang, J.; Lok, T.M.; Lyu, M.R. A hybrid particle swarm optimization—Back-propagation algorithm for feedforward neural network training. Appl. Math. Comput. 2007, 185, 1026–1037. [Google Scholar] [CrossRef]
- Mishra, S.; Patra, S.K. Short Term Load Forecasting Using Neural Network Trained with Genetic Algorithm & Particle Swarm Optimization. In Proceedings of the 2008 First International Conference on Emerging Trends in Engineering and Technology, Nagpur, India; 2008; pp. 606–611. [Google Scholar] [CrossRef]
- Mirjalili, S.; Hashim, S.Z.M.; Sardroudi, H.M. Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm. Appl. Math. Comput. 2012, 218, 11125–11137. [Google Scholar] [CrossRef]
- Kobrunov, A.; Priezzhev, I. Hybrid combination genetic algorithm and controlled gradient method to train a neural network. Geophysics 2016, 81, 35–43. [Google Scholar] [CrossRef]
- Ivanova, I.; Kubat, M. Initialization of neural networks by means of decision trees. Knowl.-Based Syst. 1995, 8, 333–344. [Google Scholar] [CrossRef]
- Yam, J.Y.F.; Chow, T.W.S. A weight initialization method for improving training speed in feedforward neural network. Neurocomputing 2000, 30, 219–232. [Google Scholar] [CrossRef]
- Chumachenko, K.; Iosifidis, A.; Gabbouj, M. Feedforward neural networks initialization based on discriminant learning. Neural Netw. 2022, 146, 220–229. [Google Scholar] [CrossRef] [PubMed]
- Shahjahan, M.; Murase, K. Neural network training algorithm with positive correlation. IEICE Trans. Inf. Syst. 2005, E88-D, 2399–2409. [Google Scholar] [CrossRef]
- Treadgold, N.K.; Gedeon, T.D. Simulated annealing and weight decay in adaptive learning: The SARPROP algorithm. IEEE Trans. Neural Netw. 1998, 9, 662–668. [Google Scholar] [CrossRef]
- Leung, C.S.; Wong, K.W.; Sum, P.F.; Chan, L.W. A pruning method for the recursive least squared algorithm. Neural Netw. 2001, 14, 147–174. [Google Scholar] [CrossRef]
- O’Neill, M.; Ryan, C. Grammatical evolution. IEEE Trans. Evol. Comput. 2001, 5, 349–358. [Google Scholar] [CrossRef]
- Backus, J.W. The Syntax and Semantics of the Proposed International Algebraic Language of the Zurich ACM-GAMM Conference. In Proceedings of the International Conference on Information Processing, UNESCO, Paris, France, 15–20 June 1959; pp. 125–132. [Google Scholar]
- Ryan, C.; Collins, J.; O’Neill, M. Grammatical evolution: Evolving programs for an arbitrary language. In Genetic Programming. EuroGP 1998; Lecture Notes in Computer Science; Banzhaf, W., Poli, R., Schoenauer, M., Fogarty, T.C., Eds.; Springer: Berlin/Heidelberg, Germany, 1998; Volume 1391. [Google Scholar]
- O’Neill, M.; Ryan, M.C. Evolving Multi-line Compilable C Programs. In Genetic Programming. EuroGP 1999; Lecture Notes in Computer Science; Poli, R., Nordin, P., Langdon, W.B., Fogarty, T.C., Eds.; Springer: Berlin/Heidelberg, Germany, 1999; Volume 1598. [Google Scholar]
- Ryan, C.; O’Neill, M.; Collins, J.J. Grammatical Evolution: Solving Trigonometric Identities. In Proceedings of the Mendel ’98: 4th International Conference on Genetic Algorithms, Optimization Problems, Fuzzy Logic, Neural Networks and Rough Sets, Anchorage, AK, USA, 4–9 May 1998; Volume 98. [Google Scholar]
- Puente, A.O.; Alfonso, R.S.; Moreno, M.A. Automatic composition of music by means of grammatical evolution. In Proceedings of the APL ’02: 2002 Conference on APL: Array Processing Languages: Lore, Problems, and Applications, Madrid, Spain, 22–25 July 2002; pp. 148–155. [Google Scholar]
- Campo, L.M.L.; Oliveira, R.C.L.; Roisenberg, M. Optimization of neural networks through grammatical evolution and a genetic algorithm. Expert Syst. Appl. 2016, 56, 368–384. [Google Scholar] [CrossRef]
- Soltanian, K.; Ebnenasir, A.; Afsharchi, M. Modular Grammatical Evolution for the Generation of Artificial Neural Networks. Evol. Comput. 2022, 30, 291–327. [Google Scholar] [CrossRef]
- Dempsey, I.; O’Neill, M.; Brabazon, A. Constant creation in grammatical evolution. Int. J. Innov. Comput. Appl. 2007, 1, 23–38. [Google Scholar] [CrossRef]
- Galván-López, E.; Swafford, J.M.; O’Neill, M.; Brabazon, A. Evolving a Ms. PacMan Controller Using Grammatical Evolution. In Applications of Evolutionary Computation. EvoApplications 2010; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2010; Volume 6024. [Google Scholar]
- Shaker, N.; Nicolau, M.; Yannakakis, G.N.; Togelius, J.; O’Neill, M. Evolving levels for Super Mario Bros using grammatical evolution. In Proceedings of the 2012 IEEE Conference on Computational Intelligence and Games (CIG), Granada, Spain, 11–14 September 2012; pp. 304–311. [Google Scholar]
- Martínez-Rodríguez, D.; Colmenar, J.M.; Hidalgo, J.I.; Micó, R.J.V.; Salcedo-Sanz, S. Particle swarm grammatical evolution for energy demand estimation. Energy Sci. Eng. 2020, 8, 1068–1079. [Google Scholar] [CrossRef]
- Sabar, N.R.; Ayob, M.; Kendall, G.; Qu, R. Grammatical Evolution Hyper-Heuristic for Combinatorial Optimization Problems. IEEE Trans. Evol. Comput. 2013, 17, 840–861. [Google Scholar] [CrossRef]
- Ryan, C.; Kshirsagar, M.; Vaidya, G.; Cunningham, A.; Sivaraman, R. Design of a cryptographically secure pseudo random number generator with grammatical evolution. Sci. Rep. 2022, 12, 8602. [Google Scholar] [CrossRef] [PubMed]
- Pereira, P.J.; Cortez, P.; Mendes, R. Multi-objective Grammatical Evolution of Decision Trees for Mobile Marketing user conversion prediction. Expert Syst. Appl. 2021, 168, 114287. [Google Scholar] [CrossRef]
- Castejón, F.; Carmona, E.J. Automatic design of analog electronic circuits using grammatical evolution. Appl. Soft Comput. 2018, 62, 1003–1018. [Google Scholar] [CrossRef]
- Kaelo, P.; Ali, M.M. Integrated crossover rules in real coded genetic algorithms. Eur. J. Oper. Res. 2007, 176, 60–76. [Google Scholar] [CrossRef]
- Powell, M.J.D. A Tolerant Algorithm for Linearly Constrained Optimization Calculations. Math. Program. 1989, 45, 547–566. [Google Scholar] [CrossRef]
- Kelly, M.; Longjohn, R.; Nottingham, K. The UCI Machine Learning Repository. 2023. Available online: https://archive.ics.uci.edu (accessed on 20 September 2023).
- Alcalá-Fdez, J.; Fernandez, A.; Luengo, J.; Derrac, J.; García, S.; Sánchez, L.; Herrera, F. KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework. J. Mult.-Valued Log. Soft Comput. 2011, 17, 255–287. [Google Scholar]
- Weiss, S.M.; Kulikowski, C.A. Computer Systems That Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems; Morgan Kaufmann Publishers Inc.: Burlington, MA, USA, 1991. [Google Scholar]
- Quinlan, J.R. Simplifying Decision Trees. Int. J. Man-Mach. Stud. 1987, 27, 221–234. [Google Scholar] [CrossRef]
- Shultz, T.; Mareschal, D.; Schmidt, W. Modeling Cognitive Development on Balance Scale Phenomena. Mach. Learn. 1994, 16, 59–88. [Google Scholar] [CrossRef]
- Zhou, Z.H.; Jiang, Y. NeC4.5: Neural ensemble based C4.5. IEEE Trans. Knowl. Data Eng. 2004, 16, 770–773. [Google Scholar] [CrossRef]
- Setiono, R.; Leow, W.K. FERNN: An Algorithm for Fast Extraction of Rules from Neural Networks. Appl. Intell. 2000, 12, 15–25. [Google Scholar] [CrossRef]
- Demiroz, G.; Govenir, H.A.; Ilter, N. Learning Differential Diagnosis of Eryhemato-Squamous Diseases using Voting Feature Intervals. Artif. Intell. Med. 1998, 13, 147–165. [Google Scholar]
- Kononenko, I.; Šimec, E.; Robnik-Šikonja, M. Overcoming the Myopia of Inductive Learning Algorithms with RELIEFF. Appl. Intell. 1997, 7, 39–55. [Google Scholar] [CrossRef]
- Hayes-Roth, B.; Hayes-Roth, B.F. Concept learning and the recognition and classification of exemplars. J. Verbal Learn. Verbal Behav. 1977, 16, 321–338. [Google Scholar] [CrossRef]
- French, R.M.; Chater, N. Using noise to compute error surfaces in connectionist networks: A novel means of reducing catastrophic forgetting. Neural Comput. 2002, 14, 1755–1769. [Google Scholar] [CrossRef]
- Dy, J.G.; Brodley, C.E. Feature Selection for Unsupervised Learning. J. Mach. Learn. Res. 2004, 5, 845–889. [Google Scholar]
- Perantonis, S.J.; Virvilis, V. Input Feature Extraction for Multilayered Perceptrons Using Supervised Principal Component Analysis. Neural Process. Lett. 1999, 10, 243–252. [Google Scholar] [CrossRef]
- Garcke, J.; Griebel, M. Classification with sparse grids using simplicial basis functions. Intell. Data Anal. 2002, 6, 483–502. [Google Scholar] [CrossRef]
- Elter, M.; Schulz-Wendtland, R.; Wittenberg, T. The prediction of breast cancer biopsy outcomes using two CAD approaches that both emphasize an intelligible decision process. Med Phys. 2007, 34, 4164–4172. [Google Scholar] [CrossRef]
- Little, M.A.; McSharry, P.E.; Hunter, E.J.; Spielman, J.; Ramig, L.O. Suitability of dysphonia measurements for telemonitoring of Parkinson’s disease. IEEE Trans. Biomed. Eng. 2009, 56, 1015–1022. [Google Scholar] [CrossRef] [PubMed]
- Smith, J.W.; Everhart, J.E.; Dickson, W.C.; Knowler, W.C.; Johannes, R.S. Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Symposium on Computer Applications and Medical Care; IEEE Computer Society Press: Piscataway, NJ, USA; American Medical Informatics Association: Bethesda, MD, USA, 1988; pp. 261–265. [Google Scholar]
- Lucas, D.D.; Klein, R.; Tannahill, J.; Ivanova, D.; Brandon, S.; Domyancic, D.; Zhang, Y. Failure analysis of parameter-induced simulation crashes in climate models. Geosci. Model Dev. 2013, 6, 1157–1171. [Google Scholar] [CrossRef]
- Giannakeas, N.; Tsipouras, M.G.; Tzallas, A.T.; Kyriakidi, K.; Tsianou, Z.E.; Manousou, P.; Hall, A.; Karvounis, E.C.; Tsianos, V.; Tsianos, E. A clustering based method for collagen proportional area extraction in liver biopsy images. In Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Milan, Italy, 25–29 August 2015; pp. 3097–3100. [Google Scholar]
- Hastie, T.; Tibshirani, R. Non-parametric logistic and proportional odds regression. J. R. Stat. Soc. Ser. C (Appl. Stat.) 1987, 36, 260–276. [Google Scholar] [CrossRef]
- Dash, M.; Liu, H.; Scheuermann, P.; Tan, K.L. Fast hierarchical clustering and its validation. Data Knowl. Eng. 2003, 44, 109–138. [Google Scholar] [CrossRef]
- Wolberg, W.H.; Mangasarian, O.L. Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proc. Natl. Acad. Sci. USA 1990, 87, 9193–9196. [Google Scholar] [CrossRef] [PubMed]
- Raymer, M.; Doom, T.E.; Kuhn, L.A.; Punch, W.F. Knowledge discovery in medical and biological datasets using a hybrid Bayes classifier/evolutionary algorithm. IEEE Trans. Syst. Man Cybern. Part B Cybern. Publ. IEEE Syst. Cybern. Soc. 2003, 33, 802–813. [Google Scholar] [CrossRef]
- Zhong, P.; Fukushima, M. Regularized nonsmooth Newton method for multi-class support vector machines. Optim. Methods Softw. 2007, 22, 225–236. [Google Scholar] [CrossRef]
- Andrzejak, R.G.; Lehnertz, K.; Mormann, F.; Rieke, C.; David, P.; Elger, C.E. Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state. Phys. Rev. E 2001, 64, 061907. [Google Scholar] [CrossRef] [PubMed]
- Koivisto, M.; Sood, K. Exact Bayesian Structure Discovery in Bayesian Networks. J. Mach. Learn. Res. 2004, 5, 549–573. [Google Scholar]
- Nash, W.J.; Sellers, T.L.; Talbot, S.R.; Cawthorn, A.J.; Ford, W.B. The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait; Technical Report; Sea Fisheries Division: Tasmania, Australia, 1994. [Google Scholar]
- Brooks, T.F.; Pope, D.S.; Marcolini, A.M. Airfoil Self-Noise and Prediction; Technical Report, NASA RP-1218; NASA: Washington, DC, USA, 1989.
- Simonoff, J.S. Smoothing Methods in Statistics; Springer: Berlin/Heidelberg, Germany, 1996. [Google Scholar]
- Yeh, I.C. Modeling of strength of high performance concrete using artificial neural networks. Cem. Concr. Res. 1998, 28, 1797–1808. [Google Scholar] [CrossRef]
- Harrison, D.; Rubinfeld, D.L. Hedonic prices and the demand for clean air. J. Environ. Econ. Manag. 1978, 5, 81–102. [Google Scholar] [CrossRef]
- King, R.D.; Muggleton, S.; Lewis, R.; Sternberg, M.J.E. Drug design by machine learning: The use of inductive logic programming to model the structure-activity relationships of trimethoprim analogues binding to dihydrofolate reductase. Proc. Nat. Acad. Sci. USA 1992, 89, 11322–11326. [Google Scholar] [CrossRef] [PubMed]
- Stanley, K.O.; Miikkulainen, R. Evolving Neural Networks through Augmenting Topologies. Evol. Comput. 2002, 10, 99–127. [Google Scholar] [CrossRef]
- Cantu-Paz, E.; Goldberg, D.E. Efficient parallel genetic algorithms: Theory and practice. Comput. Methods Appl. Mech. Eng. 2000, 186, 221–238. [Google Scholar] [CrossRef]
- Harada, T.; Alba, E. Parallel genetic algorithms: A useful survey. ACM Comput. Surv. (CSUR) 2022, 53, 1–39. [Google Scholar] [CrossRef]
- Gropp, W.; Lusk, E.; Doss, N.; Skjellum, A. A high-performance, portable implementation of the MPI message passing interface standard. Parallel Comput. 1996, 22, 789–828. [Google Scholar] [CrossRef]
- Chandra, R.; Dagum, L.; Kohr, D.; Maydan, D.; McDonald, J.; Menon, R. Parallel Programming in OpenMP; Morgan Kaufmann Publishers Inc.: San Diego, CA, USA, 2001. [Google Scholar]
Expression | Chromosome | Operation
---|---|---
&lt;expr&gt; | 9,8,6,4,15,9,16,23,8 | 9 mod 2 = 1
&lt;expr&gt;,&lt;expr&gt; | 8,6,4,15,9,16,23,8 | 8 mod 2 = 0
(&lt;xlist&gt;,&lt;lcommand&gt;,&lt;rcommand&gt;),&lt;expr&gt; | 6,4,15,9,16,23,8 | 6 mod 8 = 6
(x7,&lt;lcommand&gt;,&lt;rcommand&gt;),&lt;expr&gt; | 4,15,9,16,23,8 | 4 mod 3 = 1
(x7,EXPAND,&lt;rcommand&gt;),&lt;expr&gt; | 15,9,16,23,8 | 15 mod 3 = 0
(x7,EXPAND,NOTHING),&lt;expr&gt; | 9,16,23,8 | 9 mod 2 = 1
(x7,EXPAND,NOTHING),(&lt;xlist&gt;,&lt;lcommand&gt;,&lt;rcommand&gt;) | 16,23,8 | 16 mod 8 = 0
(x7,EXPAND,NOTHING),(x1,&lt;lcommand&gt;,&lt;rcommand&gt;) | 23,8 | 23 mod 3 = 2
(x7,EXPAND,NOTHING),(x1,DIVIDE,&lt;rcommand&gt;) | 8 | 8 mod 3 = 2
(x7,EXPAND,NOTHING),(x1,DIVIDE,EXPAND) | |
Parameter | Value
---|---
H | 10
$N_c$ | 200
$N_T$ | 50
$N_g$ | 200
$p_s$ | 0.10
$p_m$ | 0.01
Dataset | Genetic | Adam | Rprop | Neat | Proposed |
---|---|---|---|---|---|
Appendicitis | 18.10% (6.32) | 16.50% (7.73) | 16.30% (5.27) | 17.20% (4.12) | 17.00% (6.23) |
Australian | 32.21% (5.99) | 35.65% (5.83) | 36.12% (5.52) | 31.98% (6.03) | 24.55% (4.64) |
Balance | 8.97% (2.64) | 7.87% (3.09) | 8.81% (2.36) | 23.14% (4.16) | 16.71% (3.98) |
Cleveland | 51.60% (6.39) | 67.55% (6.98) | 61.41% (9.10) | 53.44% (7.26) | 47.91% (4.78) |
Dermatology | 30.58% (4.75) | 26.14% (3.11) | 15.12% (2.40) | 32.43% (4.74) | 8.93% (2.36) |
Hayes Roth | 56.18% (6.97) | 59.70% (5.41) | 37.46% (4.41) | 50.15% (4.43) | 32.21% (2.58) |
Heart | 28.34% (4.78) | 38.53% (4.45) | 30.51% (3.63) | 39.27% (4.14) | 17.40% (2.52) |
HouseVotes | 6.62% (2.11) | 7.48% (1.81) | 6.04% (1.17) | 10.89% (2.30) | 3.48% (1.43) |
Ionosphere | 15.14% (2.57) | 16.64% (3.20) | 13.65% (2.45) | 19.67% (4.28) | 7.14% (1.10) |
Liverdisorder | 31.11% (4.59) | 41.53% (4.74) | 40.26% (3.99) | 30.67% (3.12) | 28.90% (2.91) |
Lymography | 23.26% (3.84) | 29.26% (4.72) | 24.67% (3.48) | 33.70% (4.17) | 17.86% (2.42) |
Mammographic | 19.88% (2.79) | 46.25% (2.66) | 18.46% (2.34) | 22.85% (3.27) | 17.32% (1.79) |
Parkinsons | 18.05% (3.16) | 24.06% (3.28) | 22.28% (2.79) | 18.56% (1.87) | 14.35% (1.79) |
Pima | 32.19% (4.82) | 34.85% (4.26) | 34.27% (4.24) | 34.51% (4.67) | 25.58% (2.55) |
Popfailures | 5.94% (1.71) | 5.18% (1.79) | 4.81% (1.81) | 7.05% (2.87) | 4.58% (1.32) |
Regions2 | 29.39% (3.88) | 29.85% (3.95) | 27.53% (3.23) | 33.23% (4.41) | 28.32% (3.59) |
Saheart | 34.86% (4.90) | 34.04% (4.74) | 34.90% (4.75) | 34.51% (5.57) | 27.43% (3.88) |
Segment | 57.72% (2.71) | 49.75% (3.01) | 52.14% (4.85) | 66.72% (4.74) | 20.68% (2.17) |
Wdbc | 8.56% (2.90) | 35.35% (5.06) | 21.57% (4.55) | 12.88% (3.48) | 5.23% (1.66) |
Wine | 19.20% (2.66) | 29.40% (3.37) | 30.73% (3.78) | 25.43% (3.19) | 5.35% (1.74)
Z_F_S | 10.73% (2.80) | 47.81% (5.75) | 29.28% (4.81) | 38.41% (6.18) | 6.56% (1.45) |
ZO_NF_S | 8.41% (2.35) | 47.43% (5.79) | 6.43% (2.35) | 43.75% (6.98) | 3.60% (1.05) |
ZONF_S | 2.60% (0.33) | 11.99% (1.19) | 27.27% (1.58) | 5.44% (1.11) | 2.21% (0.54) |
ZOO | 16.67% (3.28) | 14.13% (2.52) | 15.47% (3.10) | 20.27% (6.47) | 6.10% (1.67) |
AVERAGE | 23.60% | 31.54% | 25.65% | 29.42% | 16.23% |
Dataset | Genetic | Adam | Rprop | Neat | Proposed |
---|---|---|---|---|---|
ABALONE | 7.17 (1.11) | 4.30 (0.55) | 4.55 (0.75) | 9.88 (1.61) | 4.48 (0.52) |
AIRFOIL | 0.003 (0.002) | 0.005 (0.003) | 0.002 (0.001) | 0.067 (0.002) | 0.002 (0.001) |
BASEBALL | 103.60 (15.85) | 77.90 (16.59) | 92.05 (23.51) | 100.39 (22.54) | 51.39 (10.13) |
BK | 0.027 (0.009) | 0.03 (0.004) | 1.599 (0.15) | 0.15 (0.02) | 0.02 (0.005) |
BL | 5.74 (1.64) | 0.28 (0.11) | 4.38 (0.19) | 0.05 (0.02) | 0.002 (0.001) |
CONCRETE | 0.0099 (0.001) | 0.078 (0.013) | 0.0086 (0.001) | 0.081 (0.004) | 0.004 (0.0006) |
DEE | 1.013 (0.21) | 0.63 (0.08) | 0.608 (0.07) | 1.512 (0.79) | 0.23 (0.05) |
DIABETES | 19.86 (5.57) | 3.03 (0.35) | 1.11 (0.57) | 4.25 (1.92) | 0.41 (0.08) |
HOUSING | 43.26 (5.84) | 80.20 (8.82) | 74.38 (6.85) | 56.49 (5.65) | 24.55 (4.48) |
FA | 1.95 (0.44) | 0.11 (0.021) | 0.14 (0.01) | 0.19 (0.014) | 0.01 (0.005) |
MB | 3.39 (0.40) | 0.06 (0.03) | 0.055 (0.03) | 0.061 (0.03) | 0.048 (0.03) |
MORTGAGE | 2.41 (0.26) | 9.24 (1.24) | 9.19 (1.62) | 14.11 (3.34) | 0.65 (0.16) |
PY | 5.41 (0.63) | 0.09 (0.013) | 0.039 (0.019) | 0.075 (0.022) | 0.025 (0.012) |
QUAKE | 0.040 (0.004) | 0.06 (0.017) | 0.041 (0.006) | 0.298 (0.13) | 0.038 (0.005) |
TREASURY | 2.929 (0.69) | 11.16 (1.37) | 10.88 (1.26) | 15.52 (2.52) | 0.84 (0.29) |
WANKARA | 0.012 (0.005) | 0.02 (0.005) | 0.0003 (0.001) | 0.005 (0.001) | 0.0002 (0.0001) |
AVERAGE | 12.30 | 11.70 | 12.44 | 12.70 | 5.17 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).