The Prediction of Swell Percent and Swell Pressure by Using Neural Networks

Expansive soils exhibit large volumetric deformations and therefore pose a serious threat to the stability of structures and foundations. Determining their swelling properties (i.e., swelling potential and swell pressure) is thus essential. However, measuring these properties is time-consuming and requires special, expensive equipment. With this in view, artificial neural network (ANN) and multiple regression analysis (MRA) models were developed for estimating swell percent and swell pressure. To this end, the results of free swell tests performed on statically compacted specimens of kaolinite-bentonite clay mixtures with varying soil properties were used. Two ANN models (ANN-1 and ANN-2) and two MRA models (MRA-1 and MRA-2) were developed: ANN-1 and MRA-1 for predicting swell percent, and ANN-2 and MRA-2 for predicting swell pressure. The results obtained from the ANN and MRA models were compared with those obtained from the experiments. The values predicted by the ANN models match the experimental values much better than those predicted by the MRA models. Moreover, several performance indices, namely the coefficient of determination (R²), variance accounted for (VAF), mean absolute error (MAE), and root mean square error (RMSE), were calculated to check the prediction capacity of the models. These indices show that the ANN models have higher prediction performance than the MRA models. The ANN models can therefore be used satisfactorily to predict swell percent and swell pressure as a rapid, inexpensive substitute for laboratory techniques.


INTRODUCTION
Expansive soils are those clay soils that exhibit significant volume changes because of variations in soil moisture. They are a worldwide problem that poses several challenges to civil engineers. Foundations constructed on these clays are subjected to large uplift forces caused by swelling, which induce heave, cracking, and breakup of both building foundations and slab-on-grade members. Heave problems account for more economic loss than all other soil problems. The cost of damage arising from expansive soil problems in the United States alone amounts to $2.3 billion annually [1].
The swelling of soils is, in general, due to the presence of expansive clay minerals, hydration of cations on clay surfaces, and release of intrinsic stresses caused by overconsolidation or desiccation of soils [1]. Many investigations have been carried out to analyze the factors affecting the swelling of clayey soils [2][3][4][5][6][7]. The major factors concern the physical properties of the particles and of the soil mass, such as initial water content, type of clay mineral, initial dry density, clay content, and coarse-grained fraction [2].
In the past few years, interest in neural network modeling has increased steadily in different fields of engineering science. In particular, artificial neural networks (ANNs) have been applied successfully to many geotechnical engineering problems. Shahin et al. [8] gave a general overview of most ANN applications in the geotechnical engineering literature.
In this study, artificial neural network (ANN) and multiple regression analysis (MRA) models were developed for estimating swell percent and swell pressure. To this end, the results of free swell tests performed on statically compacted specimens of kaolinite-bentonite clay mixtures with varying soil properties [9] were used. Two ANN models (ANN-1 and ANN-2) and two MRA models (MRA-1 and MRA-2) were developed: ANN-1 and MRA-1 for predicting swell percent, and ANN-2 and MRA-2 for predicting swell pressure. The results obtained from the ANN and MRA models were compared with those obtained from the experiments. The values predicted by the ANN models match the experimental values much better than those predicted by the MRA models. Moreover, several performance indices, namely the coefficient of determination (R²), variance accounted for (VAF), mean absolute error (MAE), and root mean square error (RMSE), were calculated to check the prediction capacity of the models. Based on these indices, both ANN models have shown higher prediction performance than the MRA models.

ARTIFICIAL NEURAL NETWORKS
Artificial neural networks (ANNs) are computational models based on the information processing system of the human brain [10]. The current interest in ANNs is largely due to their ability to mimic natural intelligence by learning from experience [11]. Many authors have described the structure and operation of ANNs [12][13][14]. ANN architectures are formed by three or more layers: an input layer, one or more hidden layers, and an output layer. Each layer consists of a number of interconnected processing elements (PEs), commonly referred to as neurons. The neurons interact with each other via weighted connections, and each neuron is connected to all the neurons in the next layer. In the input layer, data are presented to the network. The output layer holds the response of the network to the input. The hidden layers enable these networks to represent and compute complicated associations between inputs and outputs. This architecture is commonly referred to as a fully interconnected feed-forward multi-layer perceptron (MLP). In addition, there is a bias, which is connected only to the neurons in the hidden and output layers, with modifiable weighted connections.
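The layered structure described above can be illustrated with a minimal forward pass. This is only a sketch in plain Python: the function names, the toy weights and the choice of a log-sigmoid in every layer are illustrative assumptions and are not taken from the models developed in this study.

```python
import math

def sigmoid(z):
    """Log-sigmoid transfer function, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """One fully connected layer: each neuron sums its weighted inputs plus a bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]

def mlp_forward(inputs, layers):
    """Feed the signal forward through every (weights, biases) pair in turn."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_forward(signal, weights, biases)
    return signal

# Hypothetical toy network: 2 inputs -> 2 hidden neurons -> 1 output.
toy_layers = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.1, -0.2]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                    # output layer
]
output = mlp_forward([0.6, 0.2], toy_layers)
```

The input layer holds no weights of its own; it only passes the data to the first hidden layer, which is why the list of layers starts with the hidden layer.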
The number of hidden layers used depends on the complexity of the problem. ANNs with one or two hidden layers and an adequate number of hidden neurons are found to be quite useful for most problems [15]. The number of neurons in the hidden layers depends on the nature of the problem. There are various methods for determining the number of neurons in the hidden layer [16][17][18]. However, these methods offer only general guidelines for selecting an adequate number of neurons.
The neural network "learns" by modifying the weights of the neurons in response to the errors between the actual output values and the target output values. Several learning algorithms have been developed. The back-propagation learning algorithm is the most commonly used neural network algorithm [19]. The back-propagation neural network has been applied with great success to model many phenomena in the field of geotechnical engineering [8, 20]. In the back-propagation neural network, learning is carried out through gradient descent on the sum of the squares of the errors for all the training patterns [20]. Each neuron in a layer receives and processes weighted inputs from neurons in the previous layer and transmits its output to neurons in the following layer through links. Each link is assigned a weight, which is a numerical estimate of the connection strength. The weighted summation of inputs to a neuron is converted to an output according to a nonlinear transfer function. The transfer function most widely used in the literature is the sigmoid function. The changes in the weights are proportional to the negative of the derivative of the error term. One pass through the set of training patterns, together with the associated updating of the weights, is called a cycle or an epoch. Training is carried out by repeatedly presenting the entire set of training patterns (updating the weights at the end of each epoch) until the average sum squared error over all the training patterns is minimal and within the tolerance specified for the problem.
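In symbols, the learning rule described above can be summarized as follows. This is a sketch of the generic gradient-descent form only; the learning rate \eta, the factor 1/2 and the index conventions (pattern p, output neuron k, link from neuron i to neuron j) are notational assumptions and are not taken from the source.

```latex
% Sum of squared errors over training patterns p and output neurons k
% (t = target output, o = network output)
E = \frac{1}{2} \sum_{p} \sum_{k} \left( t_{pk} - o_{pk} \right)^{2}

% Output of neuron j: sigmoid of the weighted sum of its inputs plus a bias
o_{j} = f\!\left( \sum_{i} w_{ij}\, o_{i} + b_{j} \right), \qquad
f(z) = \frac{1}{1 + e^{-z}}, \qquad
f'(z) = f(z)\,\bigl( 1 - f(z) \bigr)

% Weight change: proportional to the negative derivative of the error
\Delta w_{ij} = -\eta \, \frac{\partial E}{\partial w_{ij}}
```

The simple form of f'(z) is one reason the sigmoid is convenient here: the derivative needed for the weight update is obtained directly from the neuron output already computed in the forward pass.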
At the end of the training phase, the neural network should correctly reproduce the target output values for the training data, provided the errors are minimal (i.e., convergence occurs). The associated trained weights of the neurons are then stored in the neural network memory. In the next phase, the trained neural network is fed a separate set of data. In this testing phase, the neural network predictions obtained with the trained weights are compared to the target output values. The performance of the overall ANN model can be assessed by several criteria [10, 21]. These criteria include the coefficient of determination R², mean squared error, mean absolute error, minimal absolute error, and maximum absolute error. A well-trained model should result in an R² close to 1 and small values of the error terms.
In this study, the determination of swell percent and swell pressure was modeled using ANNs. Network training was carried out with the neural network toolbox of MATLAB 7.0 (The MathWorks Inc., 2006), and the Levenberg-Marquardt back-propagation learning algorithm [22] was used in the training stage. Details of the experimental investigations that yielded the data for these models are presented in the following section.
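For orientation, the Levenberg-Marquardt weight update is usually written in the textbook form below. The symbols are assumptions introduced here for illustration: \mathbf{e} is the vector of errors over all training patterns, \mathbf{J} is its Jacobian with respect to the weights, and \mu is a damping parameter. The exact variant implemented in the toolbox may differ.

```latex
% Levenberg-Marquardt step for the weight vector w
\Delta \mathbf{w} = -\left( \mathbf{J}^{\mathsf{T}} \mathbf{J} + \mu \mathbf{I} \right)^{-1} \mathbf{J}^{\mathsf{T}} \mathbf{e},
\qquad
\mathbf{J} = \frac{\partial \mathbf{e}}{\partial \mathbf{w}}
```

For large \mu the step approaches a small gradient-descent step, and for small \mu it approaches the Gauss-Newton step, which is why the method typically converges in far fewer epochs than plain gradient descent.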

EXPERIMENTAL INVESTIGATIONS
To obtain clays with a wide range of plasticity indices, commercially processed kaolinite and bentonite clays were mixed in preselected proportions.
The properties of the kaolinite and bentonite used are given in Table 1, where G_s is the specific gravity, C is the clay percent, LL is the liquid limit, PL is the plastic limit, I_p is the plasticity index and CEC is the cation exchange capacity. The composition and the properties of the four clay mixtures obtained are shown in Table 2. Free swell tests (ASTM D-4546-85) [23] were performed in conventional oedometer cells on statically compacted samples of the clay mixtures, with initial water contents w ranging from 10% to 27% and initial dry unit weights γ_dry ranging from 14.0 to 17.9 kN/m³. The swell percent, S, and the swell pressure, S_p, of each specimen were determined for its particular plasticity index, initial water content and initial dry unit weight. Tables 3 and 4 present the free swell test results of clay mixtures 1 and 2 and clay mixtures 3 and 4, respectively; these results form the database of the neural network models developed in this study.

DEVELOPMENT OF ANN MODEL FOR PREDICTION OF SWELL PERCENT
An ANN model (ANN-1) was designed to predict swell percent (S) from the soil properties. The inputs of the model are clay percent (C), cation exchange capacity (CEC), plasticity index (I_p), dry unit weight (γ_dry), and water content (w); the output is the swell percent (S) of the specimen. The boundaries for the input and output parameters of the ANN-1 model are listed in Table 5. The input and output data were scaled to lie between 0 and 1 by using Eq. 1:

$$ x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1} $$

where x_norm is the normalized value, x is the actual value, x_max is the maximum value and x_min is the minimum value.

It is common practice to divide the available data into two subsets: a training set, to construct the neural network model, and an independent validation set, to estimate model performance in the deployed environment [24]. However, dividing the data into only two subsets may lead to model overfitting. Overfitting makes multi-layer perceptrons (MLPs) memorize training patterns in such a way that they cannot generalize well to new data [10]. For this reason, the cross-validation technique [25] was used as the stopping criterion in this study. In this technique, the database is divided into three subsets: training, validation and testing. The training set is used to update the network weights. During this process the error on the validation set is monitored. When the error on the validation set begins to increase, training is stopped, because this is considered to be the point of best generalization. Finally, the testing data are fed into the network to evaluate its performance. In total, 56% of the data were used for training, 24% for testing, and 20% for validation. The neural network toolbox of MATLAB 7.0, a popular numerical computation and visualization software [26], was used for training, validation, and testing of the MLPs. First, one hidden layer was chosen.
Then, the optimum number of neurons in the hidden layer of the model was determined by starting with a minimum of 1 neuron and increasing the network size in steps of 1 neuron up to (2I + 1), where I is the number of input variables. It should be noted that (2I + 1) is the upper limit for the number of hidden layer neurons needed to map any continuous function for a network with I inputs, as discussed by Caudill (1988) [27]. Different transfer functions (such as log-sigmoid [28] and tan-sigmoid [15]) were investigated to achieve the best performance in training as well as in testing. Two momentum factors (0.01 and 0.001) were selected for the training process to search for the most efficient ANN architecture. The coefficient of determination, R², and the mean absolute error, MAE, were used to evaluate the performance of the developed ANN models. The performance of the network during the training and testing processes was examined for each network size until no significant improvement occurred. The optimal ANN performance was obtained with the model having 8 neurons in the hidden layer, 14 epochs, a 0.001 momentum factor, and a log-sigmoid transfer (activation) function in the neurons of the hidden layer and in the neuron of the output layer.
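The data preparation steps described in this section (scaling to the range 0 to 1, dividing the data into three subsets, and listing the hidden layer sizes to try) can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the sample values are invented, and the random shuffling is an assumption, since the source does not state how specimens were assigned to the subsets.

```python
import random

def min_max_scale(values):
    """Scale values to [0, 1] with x_norm = (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("cannot scale a constant column")
    return [(x - x_min) / (x_max - x_min) for x in values]

def split_indices(n_samples, train_frac=0.56, test_frac=0.24, seed=0):
    """Shuffle sample indices and cut them into training, testing and validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed: random assignment
    n_train = round(train_frac * n_samples)
    n_test = round(test_frac * n_samples)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]  # the remaining ~20%
    return train, test, validation

def candidate_hidden_sizes(n_inputs):
    """Hidden layer sizes tried in turn: 1, 2, ..., 2I + 1."""
    return list(range(1, 2 * n_inputs + 2))

# Invented water contents (%), spanning the 10-27% range of the tests.
w_scaled = min_max_scale([10.0, 15.0, 20.0, 27.0])
train, test, validation = split_indices(100)
sizes = candidate_hidden_sizes(5)  # ANN-1 has five inputs
```

With five inputs the search covers hidden layer sizes from 1 to 11, so the reported optimum of 8 neurons lies inside the range examined.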

DEVELOPMENT OF ANN MODEL FOR PREDICTION OF SWELL PRESSURE
An ANN model (ANN-2) was designed to predict swell pressure (S_p) from clay percent (C), cation exchange capacity (CEC), plasticity index (I_p), dry unit weight (γ_dry), water content (w), and swell percent (S). For this purpose, an ANN architecture with six inputs and one output was constructed. The boundaries for the input and output parameters of the ANN-2 model are listed in Table 6. The input and output data were scaled to lie between 0 and 1 by using Eq. 1.
The cross-validation technique [25] was used as the stopping criterion, as in the modeling of swell percent (see Section 4). In total, 56% of the data were used for training, 24% for testing, and 20% for validation. First, one hidden layer was chosen; then, two hidden layers were chosen for better performance. During the design of the optimal ANN, the trials were similar to those made in the modeling of swell percent. The optimal ANN performance was obtained with the model having 2 hidden layers, 8 neurons in the hidden layers, 16 epochs, a 0.001 momentum factor, and a log-sigmoid transfer function in the neurons of the hidden layers and in the neuron of the output layer.
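The stopping criterion used for both models, in which training ends once the validation error starts to rise, can be sketched as a small helper. The function name, the `patience` parameter and the error values are hypothetical and serve only to illustrate the rule; they do not reproduce the toolbox behavior.

```python
def early_stopping_epoch(validation_errors, patience=1):
    """Return the index of the epoch with the lowest validation error seen
    before the error failed to improve for `patience` consecutive epochs."""
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            break  # validation error has stopped improving
    return best_epoch

# Invented validation errors: the error rises after the third epoch.
errors = [0.9, 0.5, 0.3, 0.35, 0.2]
stop_strict = early_stopping_epoch(errors, patience=1)   # stops at the first rise
stop_lenient = early_stopping_epoch(errors, patience=2)  # tolerates one bad epoch
```

The two calls show the design trade-off: a strict rule keeps the weights from the third epoch, while a more lenient rule continues past a temporary rise and finds the lower error in the last epoch.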

DEVELOPMENT OF MULTIPLE REGRESSION ANALYSIS MODELS FOR THE PREDICTION OF SWELL PERCENT AND SWELL PRESSURE
Multiple regression analysis (MRA) was performed to predict swell percent, S, and swell pressure, S_p. To this end, two MRA models (MRA-1 and MRA-2) were developed by using SPSS 8.0.0. The experimental data (refer to Tables 1 and 2) were used in the development of these models. The C, CEC, I_p, w, and γ_dry values were included in the MRA-1 model, which yields Eq. 2. The C, CEC, I_p, w, γ_dry and S values were included in the MRA-2 model, which yields Eq. 3.
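As an illustration of how a multiple regression model of this kind is fitted, the sketch below solves an ordinary least-squares problem in plain Python. It assumes a linear model with an intercept, which the text does not confirm since Eqs. 2 and 3 are not reproduced here; the function names and the data are invented, and the study itself used SPSS rather than this code.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares: solve (X^T X) b = X^T y with an intercept column."""
    X = [[1.0] + list(r) for r in rows]
    n = len(X[0])
    # Build the augmented normal-equation matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def predict(coefficients, row):
    """Evaluate b0 + b1*x1 + ... + bk*xk for one sample."""
    return coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))

# Invented data generated from y = 1 + 2*x1 - 3*x2, so the fit is exact.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
targets = [1.0, 3.0, -2.0, 0.0, 2.0]
coefficients = fit_linear_regression(rows, targets)
```

In the setting of this study, each row would hold the five (MRA-1) or six (MRA-2) predictor values of one specimen, and the target would be the measured swell percent or swell pressure.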

RESULTS AND DISCUSSION
A comparison of the experimental results with the results obtained from the ANN-1 model is depicted in Fig. 1 for the training, validation, and testing samples. It can be noted from the figure that the S values obtained from the ANN model are quite close to the experimentally obtained S values, as their R² values are very close to unity. Therefore, it is concluded that the swell percent of the clay soils included in this study could be predicted from easily determined soil properties using trained ANNs. Similar ANN models could also be developed for other materials using the same input parameters.

A comparison of the experimental results with the results obtained from the ANN-2 model is depicted in Fig. 2 for the training, validation, and testing samples. It can be noted from the figure that the S_p values obtained from the ANN model are in good agreement with the experimentally obtained S_p values, with R² values of 0.9855, 0.9846 and 0.9222 for the training, validation and testing samples, respectively. Therefore, it is concluded that the swell pressure of the clay soils included in this study could be predicted using trained ANNs. If swell percent values are not available, they could be predicted by using the trained ANN-1 model. Similar ANN models could also be developed for other materials using the same input parameters.

A comparison of the experimental results with the results obtained from the MRA models is depicted in Fig. 3 for all samples. It can be noted from Fig. 3(a) that the S values obtained from the MRA-1 model are in good agreement with the experimentally obtained S values, with an R² of 0.9511. Thus, it is concluded that the swell percent of the clay soils included in this study could also be predicted from easily determined soil properties by using Eq. (2). Similar MRA models could also be developed for other materials using the same input parameters. It can be noted from Fig. 3(b) that the MRA-2 model yields poor predictions, with an R² of 0.6687. Thus, Eq. (3) is not recommended for routine engineering applications.
The coefficient of correlation between the measured and predicted values is a good indicator of the prediction performance of a model. In this study, the variance accounted for (VAF), given by Eq. (4), and the root mean square error (RMSE), given by Eq. (5), were also computed to assess the performance of the developed models [29][30][31][32]:

$$ \mathrm{VAF} = \left[ 1 - \frac{\operatorname{var}(y - \hat{y})}{\operatorname{var}(y)} \right] \times 100 \tag{4} $$

$$ \mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^{2} } \tag{5} $$

where var denotes the variance, y is the measured value, ŷ is the predicted value, and N is the number of samples. If VAF is 100% and RMSE is 0, the model is treated as excellent. The performance indices obtained are given in Table 7. Based on these indices, both ANN models have exhibited higher prediction performance than the MRA models. The values of the indices indicate that the predictive ANN models constructed are quite powerful.
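The performance indices used in this study can be computed with a few lines of plain Python. This is a sketch under stated assumptions: VAF and RMSE follow their usual definitions, VAF = [1 - var(y - ŷ)/var(y)] × 100 and RMSE = sqrt(Σ(y - ŷ)²/N), and R² is taken here as 1 - SS_res/SS_tot, which is one common definition and may differ from the one used by the authors. The function names and the sample values are invented.

```python
import math

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def vaf(measured, predicted):
    """Variance accounted for, in percent."""
    residuals = [y - p for y, p in zip(measured, predicted)]
    return (1.0 - variance(residuals) / variance(measured)) * 100.0

def rmse(measured, predicted):
    """Root mean square error."""
    squared = [(y - p) ** 2 for y, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(measured, predicted):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(measured, predicted)) / len(measured)

def r_squared(measured, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot

# Invented example: every prediction is too high by a constant 0.5.
measured = [1.0, 2.0, 3.0, 4.0]
biased = [1.5, 2.5, 3.5, 4.5]
scores = {
    "VAF": vaf(measured, biased),
    "RMSE": rmse(measured, biased),
    "MAE": mae(measured, biased),
    "R2": r_squared(measured, biased),
}
```

The example shows why several indices are reported together: a constant offset leaves VAF at 100% because the residuals have zero variance, while RMSE, MAE and R² all register the error.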

CONCLUDING REMARKS
In this study, efforts were made to develop artificial neural network (ANN) and multiple regression analysis (MRA) models that can be employed for estimating swell percent and swell pressure. For this purpose, the results of free swell tests performed on statically compacted specimens of Kaolinite-Bentonite clay mixtures with varying soil properties were used.
Two ANN models (ANN-1 and ANN-2) and two MRA models (MRA-1 and MRA-2) were developed: ANN-1 and MRA-1 for predicting swell percent, and ANN-2 and MRA-2 for predicting swell pressure. The input parameters of the ANN-1 and MRA-1 models are clay percent, cation exchange capacity, plasticity index, water content and dry unit weight, while the ANN-2 and MRA-2 models use the same soil properties together with swell percent. The results obtained from the ANN and MRA models were compared with those obtained from the experiments. The values predicted by the ANN models match the experimental values much better than those predicted by the MRA models. Therefore, the swell percent and swell pressure of the clay soils included in this study could be predicted easily and efficiently using trained ANN structures as an inexpensive substitute for laboratory testing. Similar ANN models could also be developed for other materials by using the same input parameters.
To check the prediction performance of the ANN and MRA models developed, several performance indices (R², VAF, MAE, and RMSE) were calculated. Based on these indices, both ANN models have shown higher prediction performance than the MRA models. The performance level attained by the ANN models also shows that the neural network is a useful tool for reducing the uncertainties encountered in soil engineering projects. For this reason, the use of neural networks may provide new approaches and methodologies.