
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

This paper presents a comparison of training algorithms for radial basis function (RBF) neural networks used for classification purposes. RBF networks provide effective solutions in many science and engineering fields. They are especially popular in the pattern classification and signal processing areas. Several algorithms have been proposed for training RBF networks. The Artificial Bee Colony (ABC) algorithm is a new, very simple and robust population-based optimization algorithm inspired by the intelligent foraging behavior of honey bee swarms. The training performance of the ABC algorithm is compared with those of the genetic algorithm, the Kalman filtering algorithm and the gradient descent algorithm. In the experiments, well-known classification problems from the UCI repository, such as the Iris, Wine and Glass datasets, were used; in addition, an experimental setup was designed and inertial sensor based terrain classification for autonomous ground vehicles was performed. Experimental results show that the ABC algorithm leads to better learning than the other algorithms.

Agricultural applications, search/rescue missions, surveillance, supply and logistics are some of the operational fields of autonomous ground vehicles. These operations may necessitate traversing off-road or indoor terrains that can affect vehicle performance [

A radial basis function (RBF) network is a special type of neural network that uses a radial basis function as its activation function [

In RBF networks, determination of the number of neurons in the hidden layer is very important because it affects the network complexity and the generalization capability of the network. If the number of neurons in the hidden layer is insufficient, the RBF network cannot learn the data adequately; on the other hand, if the neuron number is too high, poor generalization or an overlearning situation may occur [

In the literature, various algorithms are proposed for training RBF networks, such as the gradient descent (GD) algorithm [

The ABC algorithm is a population-based evolutionary optimization algorithm that can be applied to various types of problems. It has been used for training feed forward multi-layer perceptron neural networks on test problems such as the XOR, 3-bit parity and 4-bit encoder/decoder problems [

The rest of the paper is organized as follows: Section 2 presents

Neural networks are non-linear statistical data modeling tools and can be used to model complex relationships between inputs and outputs or to find patterns in a dataset. An RBF network is a type of feed forward neural network composed of three layers, namely the input layer, the hidden layer and the output layer. Each of these layers has different tasks [

In RBF networks, the outputs of the input layer are determined by calculating the distance between the network inputs and hidden layer centers. The second layer is the linear hidden layer and outputs of this layer are weighted forms of the input layer outputs. Each neuron of the hidden layer has a parameter vector called center. Therefore, a general expression of the network can be given as [

The norm is usually taken to be the Euclidean distance and the radial basis function is taken to be a Gaussian function, so that the network output can be written as

$$y_j = \sum_{i=1}^{m} w_{ij}\,\phi\left(\lVert x - c_i \rVert\right) + b_j, \qquad \phi\left(\lVert x - c_i \rVert\right) = \exp\left(-\frac{\lVert x - c_i \rVert^2}{\sigma_i^2}\right)$$

where:

$m$: number of neurons in the hidden layer
$n$: number of neurons in the output layer
$w_{ij}$: weight between the $i$-th hidden neuron and the $j$-th output
$\phi$: radial basis function
$\sigma_i$: spread parameter of the $i$-th neuron
$x$: input data vector
$c_i$: center vector of the $i$-th neuron
$b_j$: bias value of the $j$-th output neuron
$y_j$: network output of the $j$-th output neuron


The design procedure of the RBF neural network includes determining the number of neurons in the hidden layer. Then, in order to obtain the desired output of the RBF neural network, the sum of squared errors (SSE) between the desired and actual outputs is minimized:

$$E = \sum_{j=1}^{n} (d_j - y_j)^2$$

Here $d_j$ and $y_j$ are the desired and actual outputs of the $j$-th output neuron, respectively.
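As a concrete illustration of the forward computation above, a minimal Python sketch follows. It assumes the Gaussian basis and the symbols defined earlier; the function name and array layout are our own, not the authors' implementation.

```python
import numpy as np

def rbf_forward(x, centers, spreads, weights, biases):
    """Forward pass of a Gaussian RBF network.

    x       : input vector, shape (d,)
    centers : hidden-layer centers c_i, shape (m, d)
    spreads : spread parameters sigma_i, shape (m,)
    weights : hidden-to-output weights w_ij, shape (m, n)
    biases  : output biases b_j, shape (n,)
    """
    # phi_i = exp(-||x - c_i||^2 / sigma_i^2)
    dists2 = np.sum((centers - x) ** 2, axis=1)
    phi = np.exp(-dists2 / spreads ** 2)
    # y_j = sum_i w_ij * phi_i + b_j
    return phi @ weights + biases
```

For an input lying exactly on a center, the corresponding basis activation equals 1, so the output reduces to that neuron's weight plus the bias.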

This section gives brief descriptions of training algorithms of RBF networks which were used in this paper for comparison purposes. The artificial bee colony (ABC) algorithm, which is newly applied to RBF training, is explained in detail in Section 4.

GD is a first-order derivative based optimization algorithm used for finding a local minimum of a function. The algorithm takes steps proportional to the negative of the gradient of the function at the current point. In [
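A minimal sketch of GD training along these lines is shown below. For brevity only the hidden-to-output weights and biases are adapted while centers and spreads stay fixed; the learning rate and epoch count are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def gd_train_weights(X, D, centers, spreads, lr=0.01, epochs=200):
    """Train the output weights of a Gaussian RBF network by gradient
    descent on the sum of squared errors (SSE).

    X : training inputs, shape (N, d);  D : desired outputs, shape (N, n)
    """
    m, n = centers.shape[0], D.shape[1]
    W = np.zeros((m, n))
    b = np.zeros(n)
    for _ in range(epochs):
        for x, d in zip(X, D):
            phi = np.exp(-np.sum((centers - x) ** 2, axis=1) / spreads ** 2)
            y = phi @ W + b
            err = d - y
            # step proportional to the negative gradient of the SSE
            W += lr * np.outer(phi, err)
            b += lr * err
    return W, b
```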

The state of a linear dynamic system can be efficiently estimated by the Kalman filter (KF) from a series of noisy measurements. It is used in a wide range of engineering applications, from INS/GPS integration [. In the standard formulation, the state $x_k$ evolves as $x_{k+1} = A x_k + w_k$ and is observed through noisy measurements $z_k = H x_k + v_k$, where $w_k$ and $v_k$ denote the process and measurement noise, respectively.

KF can be used to optimize the weight matrix and center vectors of an RBF network by treating training as a least squares minimization problem. In an RBF network, the parameter vector at step $k$

The vector

For a detailed explanation of the use of GD and KF for RBF training see [
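Because the network output is linear in the output weights, the least squares view can be sketched with a standard KF whose state is the weight vector. The single-output restriction and the noise levels q and r below are our own assumptions, not the paper's implementation.

```python
import numpy as np

def kf_train_weights(X, d, centers, spreads, q=1e-6, r=0.1):
    """Estimate RBF output weights with a Kalman filter.

    The weight vector (plus bias) is the state, assumed constant; each
    training sample provides a scalar measurement d_k = [phi, 1] @ w + noise,
    so the measurement model is linear and a standard KF update applies.
    """
    m = centers.shape[0]
    w = np.zeros(m + 1)                 # weights + bias as the state
    P = np.eye(m + 1)                   # state covariance
    Q = q * np.eye(m + 1)               # small process noise (random walk)
    for x, dk in zip(X, d):
        phi = np.exp(-np.sum((centers - x) ** 2, axis=1) / spreads ** 2)
        H = np.append(phi, 1.0)         # measurement row: [phi, 1]
        P = P + Q                       # predict step (state assumed constant)
        S = H @ P @ H + r               # innovation covariance
        K = P @ H / S                   # Kalman gain
        w = w + K * (dk - H @ w)        # update with the innovation
        P = P - np.outer(K, H @ P)
    return w[:-1], w[-1]
```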

GA is an optimization method used in many research areas to find exact or approximate solutions to optimization and search problems. Inheritance, mutation, selection and crossover are the main aspects of GA, inspired by evolutionary biology. The population refers to the set of candidate solutions. The evolution starts from a randomly generated population. In every generation, the fitness (typically a cost function) of each individual in the population is evaluated and the genetic operators are applied to obtain a new population, which is then used in the next iteration. Frequently, GA terminates when either a maximum number of generations has been reached or a predefined fitness value has been achieved. In RBF training, the individuals consist of the RBF parameters, such as the weights, spread parameters, center vectors and bias parameters. The fitness value of an individual can be evaluated using an error function, such as the MSE or SSE between the desired output and the actual output of the network.
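The generational loop just described (roulette-wheel selection, single-point crossover and uniform mutation, the operator choices used for GA in this paper) can be sketched as a generic cost minimizer. The elitism step and all names are our own choices, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def ga_minimize(cost, dim, pop_size=20, gens=100, pm=0.05, pc=0.8, lo=-1.0, hi=1.0):
    """Toy real-coded GA with roulette-wheel selection, single-point
    crossover and uniform mutation."""
    pop = rng.uniform(lo, hi, (pop_size, dim))
    costs = np.array([cost(ind) for ind in pop])
    for _ in range(gens):
        fit = 1.0 / (1.0 + costs)               # map cost to fitness
        probs = fit / fit.sum()                 # roulette-wheel probabilities
        new = [pop[np.argmin(costs)].copy()]    # keep the elite individual
        while len(new) < pop_size:
            i, j = rng.choice(pop_size, size=2, p=probs)
            child = pop[i].copy()
            if rng.random() < pc:               # single-point crossover
                cut = rng.integers(1, dim) if dim > 1 else 1
                child[cut:] = pop[j][cut:]
            mask = rng.random(dim) < pm         # uniform mutation
            child[mask] = rng.uniform(lo, hi, mask.sum())
            new.append(child)
        pop = np.array(new)
        costs = np.array([cost(ind) for ind in pop])
    b = np.argmin(costs)
    return pop[b], costs[b]
```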

The Artificial Bee Colony algorithm is a heuristic optimization algorithm proposed by Karaboga in 2005 [

In the ABC algorithm, a food source represents a possible solution of the optimization problem, and the nectar amount of a food source corresponds to the fitness value of that solution. The number of worker bees equals the number of food sources. First, a randomly distributed initial population is generated. After initialization, search cycles of the worker, onlooker and scout bees are repeated. A worker bee modifies its food source and discovers a new one. If the nectar amount of the new source is higher than that of the old one, the worker bee memorizes the new position; otherwise it keeps the old one. After all the worker bees complete the search process, they share the position information with the onlooker bees. Onlooker bees evaluate the nectar amounts and choose a food source with a probability calculated by:
$$p_i = \frac{fit_i}{\sum_{n=1}^{SN} fit_n}$$

where $fit_i$ is the fitness value of the $i$-th food source, proportional to its nectar amount, and $SN$ is the number of food sources.

The ABC algorithm produces a candidate food position from the old one using

$$v_{ij} = x_{ij} + \phi_{ij}\,(x_{ij} - x_{kj})$$

where $k$ is a randomly chosen source index different from $i$, $j$ is a randomly chosen dimension and $\phi_{ij}$ is a random number in $[-1, 1]$. As can be seen from this expression, as the difference between $x_{ij}$ and $x_{kj}$ decreases, the step size decreases accordingly. Therefore, the step size is adaptively reduced as the algorithm approaches the optimal solution in the search area. A food source that does not improve for a certain number of cycles is abandoned. This cycle number is very important for the ABC algorithm and is called the "limit". The control parameters of the ABC algorithm are the number of food sources ($SN$), the limit value and the maximum cycle number.
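The worker, onlooker and scout phases can be sketched as below. The candidate update v_ij = x_ij + phi_ij (x_ij - x_kj) and the probability-based onlooker selection follow the description above, while the fitness mapping fit = 1/(1 + cost) is a common convention assumed here rather than stated in the text.

```python
import numpy as np

rng = np.random.default_rng(1)

def abc_minimize(cost, dim, sn=10, limit=20, cycles=200, lo=-1.0, hi=1.0):
    """Minimal ABC sketch with worker, onlooker and scout phases."""
    foods = rng.uniform(lo, hi, (sn, dim))      # one food source per worker bee
    costs = np.array([cost(f) for f in foods])
    trials = np.zeros(sn, dtype=int)

    def try_neighbor(i):
        k = rng.choice([a for a in range(sn) if a != i])  # random partner source
        j = rng.integers(dim)                             # random dimension
        v = foods[i].copy()
        v[j] += rng.uniform(-1, 1) * (foods[i][j] - foods[k][j])
        cv = cost(v)
        if cv < costs[i]:                                 # greedy selection
            foods[i], costs[i], trials[i] = v, cv, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        for i in range(sn):                               # worker phase
            try_neighbor(i)
        fit = 1.0 / (1.0 + costs)                         # p_i = fit_i / sum(fit_n)
        probs = fit / fit.sum()
        for _ in range(sn):                               # onlooker phase
            try_neighbor(rng.choice(sn, p=probs))
        worst = np.argmax(trials)                         # scout phase
        if trials[worst] > limit:                         # abandon stalled source
            foods[worst] = rng.uniform(lo, hi, dim)
            costs[worst] = cost(foods[worst])
            trials[worst] = 0
    best = np.argmin(costs)
    return foods[best], costs[best]
```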

As mentioned before, training of an RBF neural network can be achieved by selecting optimal values for the following parameters:

weights between the hidden layer and the output layer ($w_{ij}$)

spread parameters of the hidden layer basis functions ($\sigma_i$)

center vectors of the hidden layer ($c_i$)

bias parameters of the neurons of the output layer ($b_j$)

The number of neurons in the hidden layer is very important in neural networks. Using more neurons than needed produces an overlearned network and, moreover, increases the complexity. Therefore, how the number of neurons affects the network's performance has to be investigated.

The individuals of the ABC population consist of the RBF parameters: the weights ($w_{ij}$), spread parameters ($\sigma_i$), center vectors ($c_i$) and bias values ($b_j$).

The quality of the individuals (possible solutions) can be calculated using an appropriate cost function. In the implementation, the SSE between the actual output of the RBF network and the desired output over the training set is adopted as the fitness function:

$$E = \sum_{k=1}^{N} \sum_{j=1}^{n} \left(d_j^{(k)} - y_j^{(k)}\right)^2$$
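Decoding a flat individual into the four trained parameter groups and scoring it with the SSE fitness can be sketched as follows. The parameter ordering and the small constant that keeps the spreads positive are our own assumptions.

```python
import numpy as np

def decode(params, m, d, n):
    """Split a flat individual into the RBF parameters being trained:
    centers c_i, spreads sigma_i, weights w_ij and biases b_j."""
    i = 0
    centers = params[i:i + m * d].reshape(m, d); i += m * d
    spreads = np.abs(params[i:i + m]) + 1e-9;    i += m
    weights = params[i:i + m * n].reshape(m, n); i += m * n
    biases = params[i:i + n]
    return centers, spreads, weights, biases

def sse_fitness(params, X, D, m):
    """SSE between desired outputs D and actual RBF outputs over the
    training set: the cost assigned to an individual."""
    centers, spreads, weights, biases = decode(params, m, X.shape[1], D.shape[1])
    total = 0.0
    for x, t in zip(X, D):
        phi = np.exp(-np.sum((centers - x) ** 2, axis=1) / spreads ** 2)
        total += float(np.sum((t - (phi @ weights + biases)) ** 2))
    return total
```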

The experiments conducted in this study are divided into two sub-sections. First, comparisons on well-known classification problems obtained from UCI repository are given. The second part of this section deals with terrain classification problem using inertial sensors in which the data was obtained experimentally.

In the experiments on the test problems, the performance of the RBF network trained using ABC is compared with those of the GA, KF and GD algorithms. The well-known classification problems—Iris, Wine and Glass—obtained from the UCI repository [

For all datasets, the experiments are repeated 30 times. For each run, the dataset is randomly divided into training and test subsets: 60% of the data is randomly selected as the training data and the remaining data is used as the test data. Afterwards, the average and standard deviation over the 30 independent runs are calculated. General characteristics of the test and training datasets are illustrated in

One of the most important design issues of an RBF network is the number of neurons in the hidden layer. Therefore, the experiments are conducted on RBF networks with one to eight neurons in the hidden layer.

In the experiments, the learning parameter of GD is selected as

In the experiments, the percentage of correctly classified samples (PCCS) is used as the performance measure:

$$\mathrm{PCCS} = 100 \times \frac{\text{number of correctly classified samples}}{\text{total number of samples}}$$
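PCCS can be computed by taking the most active output neuron as the predicted class; this winner-takes-all reading of the network outputs is an assumption on our part.

```python
import numpy as np

def pccs(outputs, targets):
    """Percentage of correctly classified samples.

    outputs : network outputs, shape (N, n_classes)
    targets : one-hot desired outputs, shape (N, n_classes)
    """
    pred = np.argmax(outputs, axis=1)   # winner-takes-all predicted class
    true = np.argmax(targets, axis=1)
    return 100.0 * np.mean(pred == true)
```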

The statistical results of the 30 runs are given in

The gradient descent algorithm is a traditional derivative based method used for training RBF networks [

Kalman filtering is another derivative based method, and several studies have been performed on training RBF networks with Kalman filtering [

GA is a population-based evolutionary optimization algorithm and has been used for training RBF networks in several studies [. The population size and generation/cycle count of GA are kept the same as the ABC settings so that the training performances can be compared.

ABC is an evolutionary optimization algorithm inspired by the foraging behavior of honey bees. It is a very simple and robust optimization technique. The results show that the ABC algorithm is more robust than GA, as indicated by the standard deviations. The average training results show that the ABC algorithm performs better than the other methods.

As can be seen from

In addition, the number of neurons affects the network performance. As can be seen from the figures, the performance of the network does not necessarily increase as the number of neurons increases. However, the experiments show that using three neurons gives acceptable results for the ABC algorithm, whereas the other algorithms need more than three neurons to reach the same performance. Since the number of neurons directly influences the time complexity of the algorithm, the minimum required number of neurons should be used in applications. In this context, the ABC algorithm is better than the others.

In this section, a terrain classification experiment is presented. The goal of this experiment is to identify the type of terrain being traversed from among a list of candidate terrains. Our proposed terrain classification system uses a typically available inertial measurement unit (IMU), the XSens MTi-9 [

The experimental platform is shown in

In this study, we try to identify which of four candidate terrains the vehicle travelled on: pavement, asphalt, grass or tile. Our hypothesis is that the vibrations induced by different terrains influence the output of the IMU sensor. The data was sampled at 100 Hz for an 80 second duration for each terrain type. Afterwards, the data was preprocessed before classification using the proposed RBF scheme. The outdoor terrain types analyzed in this study are pavement, asphalt and grass; for indoor applications, a tile floor is used. The terrain types are shown in

Previous experiments have shown that RBF networks are efficient classifiers. In this experiment, the proposed RBF scheme is also used for terrain classification. The RBF network structure for this problem has four outputs that identify the terrain type. Each output can vary from zero to one, in proportion to the likelihood that a given signal presented at the input of the RBF network belongs to one of the four subject terrains: pavement, asphalt, grass or tile. The RBF has


A discrete Fourier transform (DFT) is performed on the inertial data obtained from the XSens IMU. The sensor acquires data at 100 Hz; therefore, the input signal is taken at fixed intervals of 100 samples (1 second duration) and the DFT of each segment is computed. In the experiments, we use the first
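The windowed DFT preprocessing can be sketched as follows. Keeping the magnitudes of the first five coefficients is an assumption on our part, consistent with the five network inputs of the terrain dataset.

```python
import numpy as np

def dft_features(signal, win=100, n_coeffs=5):
    """Split an inertial signal into fixed windows (100 samples = 1 s at
    100 Hz) and keep the magnitudes of the first few DFT coefficients of
    each window as classifier inputs."""
    n_windows = len(signal) // win
    feats = []
    for w in range(n_windows):
        seg = signal[w * win:(w + 1) * win]
        spectrum = np.fft.fft(seg)          # DFT of the 1 s segment
        feats.append(np.abs(spectrum[:n_coeffs]))
    return np.array(feats)
```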

The terrain classification dataset contains five inputs and four outputs, with 320 samples in total. As in the previous experiments, this experiment is also repeated 30 times. For each run, 60% of the dataset is randomly selected for training and the rest is used for testing. The average and standard deviation of the PCCS results over the 30 independent runs are given in

As can be seen from

In this study, the Artificial Bee Colony (ABC) algorithm, a new, simple and robust optimization algorithm, has been used to train radial basis function (RBF) neural networks for classification purposes. First, well-known classification problems obtained from the UCI repository were used for comparison. Then, an experimental setup was designed for inertial sensor based terrain classification. The training procedure involves selecting optimal values for parameters such as the weights between the hidden layer and the output layer, the spread parameters of the hidden layer basis functions, the center vectors of the hidden layer and the bias parameters of the output layer neurons. Additionally, the number of neurons in the hidden layer strongly affects the complexity of the network structure. The performance of the proposed algorithm is compared with the traditional GD method and the more recent KF and GA methods. GD and KF are derivative based algorithms, and getting trapped in a local minimum is a disadvantage of such algorithms. The GA and ABC algorithms are population-based evolutionary heuristic optimization algorithms. These algorithms show better performance than the derivative based methods both for well-known classification problems such as Iris, Glass and Wine and for experimental inertial sensor based terrain classification; however, they have the disadvantage of a slow convergence rate. When the classification performances are compared, the experimental results show that the ABC algorithm performs better than the others. The classification results on the test problems are superior and also correlate with the results of many papers in the literature. In real-time applications, the number of neurons may affect the time complexity of the system. For the terrain classification problem, it is shown that inertial measurement units can be used to identify the terrain type of a mobile vehicle.
The results of the terrain classification problem are reasonable and may help the planning algorithms of autonomous ground vehicles.

The main contributions of this study can be summarized as:

Training algorithms of RBF networks significantly affect the performance of the classifier.

The complexity of the RBF network increases with the number of neurons in the hidden layer.

GD and KF offer faster training but only tolerable classification performance; GA and ABC show better classification performance.

ABC is applied for the first time to RBF training for classification problems in this study.

ABC is more robust and requires fewer control parameters than the other training algorithms.

ABC reaches the best scores of GD and KF using only two neurons, while GD and KF use eight neurons. Therefore, the complexity of the RBF-ABC scheme in real-time use after training is much lower than those of the others.

The proposed RBF structure includes the training of spread parameters for each hidden neuron and bias parameters for the output layer, which is also newly applied to RBF training.

Terrain classification using an inertial sensor and an RBF network is achieved with an 80% success rate.

This work was supported by the Scientific and Technological Research Council of Turkey (TUBITAK) under Grant No. 107Y159. The authors would like to thank Prof. Dr. Dervis Karaboga and the anonymous reviewers for their useful comments and suggestions that improved the quality of this paper.

Block diagram of a RBF network.

Network architecture of the RBF.

Basic flowchart of the ABC algorithm.

Experimental setup.

Terrain types: (a) pavement, (b) asphalt, (c) grass and (d) tile.

Preprocessed input data for terrain classification.

Evaluation CPU time of the RBF network.

Characteristics of the UCI datasets.

Dataset | Attributes | Classes | Total samples | Training samples | Test samples
Iris | 4 | 3 | 150 | 90 | 60
Wine | 13 | 3 | 178 | 106 | 72
Glass | 9 | 2 | 214 | 128 | 86

Control parameters of GA.

Parameter | Value
Number of generations | 4,000
Selection type | Roulette
Mutation type | Uniform
Mutation rate | 0.05
Crossover type | Single point
Crossover ratio | 0.8

Control parameters of ABC.

Parameter | Value
Number of generations/cycles | 4,000
Limit | 400

Statistical PCCS results of the Iris dataset: mean (standard deviation) over 30 runs for 1 to 8 hidden neurons.

Neurons | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
Train | 65.9 (2.7) | 85.3 (6.6) | 93.4 (3.4) | 95.3 (3.4) | 95.2 (2.7) | 95.2 (7.2) | 97.0 (3.0) | 97.7 (1.3)
Train | 60.7 (9.2) | 66.4 (10.8) | 81.6 (8.4) | 84.5 (12.8) | 88.3 (10.7) | 91.6 (8.6) | 94.2 (4.6) | 95.6 (4.2)
Train | 63.5 (12.3) | 89.9 (8.6) | 94.1 (3.8) | 96.1 (2.0) | 96.0 (1.7) | 96.6 (1.9) | 97.4 (1.4) | 97.1 (1.1)
Train | | | | | | | |

Statistical PCCS results of the Wine dataset: mean (standard deviation) over 30 runs for 1 to 8 hidden neurons.

Neurons | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
54.5 (13.0) | 74.5 (18.1) | 83.6 (13.5) | 89.9 (7.0) | 92.1 (10.1) | 95.1 (4.4) | 97.0 (2.6) | 97.1 (2.0)
52.9 (8.4) | 56.1 (14.6) | 72.5 (17.9) | 85.6 (14.5) | 83.4 (15.1) | 90.5 (13.3) | 91.3 (15.4) | 97.1 (3.0)
68.7 (7.8) | 96.9 (2.8) | 98.8 (1.0) | 98.8 (1.3) | 99.1 (1.0) | 99.5 (0.8) | 99.6 (0.6) | 99.8 (0.5)

Statistical PCCS results of the Glass dataset: mean (standard deviation) over 30 runs for 1 to 8 hidden neurons.

Neurons | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
84.8 (5.4) | 88.3 (5.4) | 90.6 (2.1) | 90.8 (2.2) | 92.8 (2.4) | 93.0 (2.5) | 91.7 (3.1) | 91.8 (2.8)
76.6 (3.6) | 84.8 (6.7) | 89.6 (5.9) | 91.0 (4.2) | 92.0 (3.6) | 94.1 (2.7) | 93.8 (2.2) | 94.7 (3.1)
82.6 (6.8) | 92.6 (2.3) | 94.0 (1.5) | 94.7 (1.6) | 95.4 (1.5) | 96.2 (1.2) | 95.9 (1.9) | 96.5 (1.8)

Statistical PCCS results of the Terrain Classification dataset: mean (standard deviation) over 30 runs for 1 to 8 hidden neurons.

Neurons | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
42.2 (6.1) | 50.1 (6.6) | 59.2 (8.3) | 58.6 (7.3) | 65.8 (5.0) | 65.5 (5.8) | 70.3 (5.2) | 70.9 (6.0)
36.8 (7.1) | 46.7 (6.9) | 45.1 (11.8) | 58.0 (10.0) | 62.4 (12.3) | 63.2 (11.7) | 63.7 (11.7) | 68.4 (9.4)
42.0 (5.5) | 55.7 (6.9) | 68.6 (4.3) | 71.4 (3.2) | 70.2 (3.5) | 74.1 (3.1) | 77.0 (3.3) | 74.2 (2.7)