1. Introduction
Soil unit weight is a geotechnical parameter that is necessary to determine, among other things, the design bearing resistance of soils in drained conditions and the vertical component of geostatic stress. Nowadays, the basic and most reliable method to estimate this parameter is by direct tests conducted in laboratory conditions. However, these tests always require the highest quality samples with undisturbed structures, collection of which is very difficult, time-consuming and expensive. Alternative methods to determine the unit weight of soil in a direct way, in situ, using various types of penetrometers, mainly based on cone penetration test (CPTM) static probes, have been sought for a long time [1,2,3,4,5,6]. Despite the widespread use of various types of penetrometers in engineering practice and the development of appropriate calculation formulas, their results have not always been strictly verified by laboratory methods; therefore, they cannot be treated as universal in relation to all types of soil, especially organic soil. Preliminary analyses in this area have shown that, at present, in relation to selected, locally-occurring, Polish organic soils, existing global relationships are of little use, and it is advisable to search for other, more accurate solutions [7].
Alternatively, from a practical point of view, it seems worthwhile to develop methods that determine soil unit weight from tests on samples with a disturbed structure but natural moisture, which are cheaper and relatively easier to collect. This is particularly important in the case of organic soils, which are extremely heterogeneous and locally diverse. In addition, layers of organic soil very often lie below the groundwater table, which makes it difficult, or even impossible, to collect undisturbed samples of sufficient quality. The basic parameters obtained from testing disturbed samples of organic soils from local sites, namely natural water content and organic matter content, can be used to estimate the value of unit weight, while also regionalizing the parameters of a given type of organic soil.
This paper presents a proposal to estimate the unit weight of various, local organic soils based on laboratory-determined leading parameters, as well as the results of alternative prognoses using artificial neural networks (ANN). In recent years, the application of ANNs in structural mechanics and civil engineering, including geotechnics, has increased. Examples of the use of ANNs in geotechnics in Poland can be found in the recent work presented by Sulewska [8,9,10,11,12], Lechowicz [13], Ochmański and Bzówka [14,15,16] and others [17,18,19,20,21]. This paper shows the application of ANNs in comparison to standard regression.
The choice of the ANN analysis tool was determined by the fact that it is a very universal tool that is increasingly finding effective application in the field of geotechnics [22,23]. Artificial intelligence is used, among other things, for soil classification [24], prediction of geotechnical properties [25], prediction of slope performance [26], calculations of the settlements of buildings and constructions [27], indirect estimation of rock parameters [28], the design and analysis of deep foundations [29], empirical design in geotechnics [30], geomaterial modeling [31] and many other applications.
Unfortunately, available source materials on the use of ANNs to predict the geotechnical parameters of organic soils for the practical design of foundations are rare and difficult to access, which was an additional argument to apply and verify the tool in this work.
3. Results
Based on the exploratory investigations carried out in the study area at the Rzeszów site, with the CPTM probe used to a maximum depth of 15.4 m below ground level, frequently occurring and varied organic soils were identified in the subsoil. The test results show the values of the cone resistance qc and sleeve friction fs, in addition to the friction ratio Rf, which was used for the classification of the organic soil type using known criteria (Section 2.1), as presented in the diagrams in Figure 3, Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8.
Laboratory research has shown that a full spectrum of organic soils is present in the subsoil, from low to high organic soil, which, due to ambiguous and simplified extant classification criteria [53,54], in comparison to the withdrawn national standard guidelines [55], is currently difficult to define exactly [56,57]. The organic content of the investigated soils ranged from ca. 5% to ca. 85%, with a natural water content ranging from ca. 24% to ca. 418% and bulk density from 1.17 kN/m³ to 2.25 kN/m³. The complete results of the index properties of organic soils in the study area at the Rzeszów site are summarized in Table 1.
4. Evaluation of the Soil Unit Weight Based on Laboratory Test Results
The next step after performing laboratory tests was to search for the relationships between the dependent (soil unit weight γt) and independent variables (natural water content w; organic matter content LOIT).
The one-factor relationship between the soil unit weight and natural water content is presented, together with its formula, in Figure 9a, and that for the organic matter content in Figure 9b. There is a very strong relationship between the variables, as evidenced by the high coefficient of determination R², which was 0.978 for natural water content and only slightly lower, 0.946, for organic matter content. The differences between the soil unit weight calculated separately from each variable and the value determined in the laboratory were comparable. For the first variable, natural water content, the maximum difference was 21.76%, and for the second variable, organic matter content, it was 20.46%, relative to the expected value of the soil unit weight.
Additionally, based on the regression analysis, the relationship between empirically determined values of soil unit weight and leading parameters (natural water content and organic content) was determined and is described by the following formula:
where γt is the soil unit weight, LOIT is the organic content and w is the natural water content.
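A two-variable regression of this kind can be sketched as an ordinary least-squares fit. The sketch below assumes a plane of the form γt = a + b·w + c·LOIT, which is only an illustrative choice and not necessarily the form of the relationship used in this paper. The function names `solve_linear_system` and `fit_plane`, the coefficient values and the data are all hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_plane(w, loi, gamma):
    """Least-squares fit of gamma = a + b*w + c*loi via the normal equations."""
    rows = [[1.0, wi, li] for wi, li in zip(w, loi)]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * g for r, g in zip(rows, gamma)) for i in range(3)]
    return solve_linear_system(ata, atb)


# Synthetic data generated from an assumed plane (not measurements from Table 1).
w = [30.0, 80.0, 150.0, 250.0, 400.0]   # natural water content, %
loi = [5.0, 30.0, 20.0, 60.0, 85.0]     # organic content, %
gamma = [20.0 - 0.02 * wi - 0.03 * li for wi, li in zip(w, loi)]  # kN/m³

a, b, c = fit_plane(w, loi, gamma)
print(f"gamma = {a:.3f} + ({b:.4f})*w + ({c:.4f})*LOI")
```

Because the synthetic data lie exactly on the assumed plane, the fit recovers the generating coefficients; with real laboratory data the residuals would be non-zero and would feed into the difference and R² values discussed in the text.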
The differences between the values determined on the basis of laboratory tests and those given by the two-variable regression model did not exceed 2.95 kN/m³ (22.44%), with a coefficient of determination R² = 0.797. Significance tests of the variables (w, LOIT) showed that the variable representing the organic content was not significant. After eliminating the insignificant variable, the differences between the determined value and the expected value of the soil unit weight were much higher, reaching 3.96 kN/m³ (38.00%), than for the model with two variables. Due to the larger differences and the fact that the model with one explanatory variable was not very representative, the model with two variables was adopted for further analysis. The comparison of the results is presented in Figure 10.
For the analytical estimation of the soil unit weight, the one- and two-factor relationships determined by the authors were used; therefore, it was necessary to determine the relative error of these relationships with respect to the values determined empirically in the laboratory.
The maximum and minimum values of relative error were calculated according to Formula (5):
where P is the number of cases, p = {1, …, P}, dp is the measured value and yp is the calculated value.
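The calculation of the extreme relative errors can be illustrated with a short sketch. It assumes the common definition of relative error as the absolute difference between the measured and calculated value divided by the measured value, expressed in percent; the exact form of Formula (5) may differ. The function name `relative_errors` and the numbers are hypothetical.

```python
def relative_errors(measured, calculated):
    """Relative error of each case, in percent of the measured value (assumed form)."""
    return [abs(d - y) / abs(d) * 100.0 for d, y in zip(measured, calculated)]


# Hypothetical unit weights in kN/m³ (illustrative only, not taken from Table 2).
measured = [10.0, 12.5, 16.0, 20.0]     # d_p, laboratory values
calculated = [11.0, 12.5, 15.2, 19.0]   # y_p, values from a fitted relationship

errors = relative_errors(measured, calculated)
re_max = max(errors)   # maximum relative error over all P cases
re_min = min(errors)   # minimum relative error over all P cases
print(f"RE max = {re_max:.2f}%, RE min = {re_min:.2f}%")
```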
The comparison of the relationships described earlier shows that the relative error of the soil unit weight was much larger for the one-factor relationship based on organic content (LOIT), while the lowest, comparable values were obtained for the one-factor (w) and two-factor (w, LOIT) relationships. A summary of the calculated values of relative error (RE) of the soil unit weight is shown in Table 2.
The results show that evaluating the soil unit weight from organic content alone is not recommended, whereas two-factor relationships, based on organic content and natural water content, give better accuracy and describe the model better.
4.1. Standard Regression
The most important task of our analyses is to compare the results of standard regression with the neural network approach in the problem of soil unit weight identification. For this reason, the entire set of 135 measurements was divided into two groups (70% and 30%). Ninety-four measurements were used as basic data to calculate the parameters of the regression fit function. The remaining 41 were used to test the predictions of the soil unit weight from the regression model against measured values. To allow a statistical evaluation, the basic data set was randomly selected 250 times.
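The repeated random division described above can be sketched as follows. This is a minimal illustration assuming a simple shuffle-based split with a fixed seed; the function name `repeated_splits` and the seed are hypothetical, and the actual sampling procedure used in the analyses is not specified in the text.

```python
import random


def repeated_splits(n_cases, n_base, n_repeats, seed=0):
    """Return n_repeats random (base, test) index partitions of range(n_cases)."""
    rng = random.Random(seed)  # fixed seed so the divisions are reproducible
    indices = list(range(n_cases))
    splits = []
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        base = sorted(shuffled[:n_base])   # used to fit the model parameters
        test = sorted(shuffled[n_base:])   # held out to test the predictions
        splits.append((base, test))
    return splits


# 135 measurements, 94 for fitting and 41 for testing, drawn 250 times.
splits = repeated_splits(n_cases=135, n_base=94, n_repeats=250)
base_idx, test_idx = splits[0]
print(len(splits), len(base_idx), len(test_idx))
```

Storing the index partitions, rather than the data themselves, makes it straightforward to reuse exactly the same divisions for several models.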
Four regression models of one variable were used: the two-parameter linear model (6), three-parameter polynomial model (7), two-parameter power model (8) and four-parameter power model (9).
Additionally, four regression models of two variables were used: the three-parameter surface model (10), the five-parameter surface models (11) and (12), and the eight-parameter surface model (13).
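As an illustration of how two-parameter models of one variable can be fitted, the sketch below assumes a linear model y = a + b·x and a power model y = a·x^b. These forms are assumptions made for the example and are not necessarily identical to models (6) and (8). The power model is fitted after a logarithmic transformation, which minimizes the error in log space rather than in the original units. The function names and the synthetic data are hypothetical.

```python
import math


def fit_linear(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def fit_power(x, y):
    """Fit y = a * x**b through the linearized form ln y = ln a + b ln x."""
    ln_a, b = fit_linear([math.log(v) for v in x], [math.log(v) for v in y])
    return math.exp(ln_a), b


# Synthetic data generated from an assumed power law (not laboratory results).
w = [30.0, 60.0, 120.0, 240.0, 400.0]    # natural water content, %
gamma = [45.0 * wi ** -0.25 for wi in w]  # soil unit weight, kN/m³

a_lin, b_lin = fit_linear(w, gamma)
a_pow, b_pow = fit_power(w, gamma)
print(f"linear: gamma = {a_lin:.3f} + ({b_lin:.4f})*w")
print(f"power:  gamma = {a_pow:.3f} * w^({b_pow:.4f})")
```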
The goodness of fit was checked using the coefficient of determination (14), mean relative error (15) and mean squared error (16):
where n is the number of cases and the remaining symbols denote, in turn, the measured values, the fitted (predicted) values and the mean of the measured values.
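The three goodness-of-fit measures can be computed as in the sketch below. It assumes the standard textbook definitions (coefficient of determination as one minus the ratio of the residual to the total sum of squares, mean relative error in percent of the measured value, and mean squared error), which may differ in detail from Formulas (14)–(16). The function names and the numbers are hypothetical.

```python
def r_squared(measured, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - f) ** 2 for m, f in zip(measured, fitted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def mean_relative_error(measured, fitted):
    """Mean relative error (MRE), in percent of the measured values."""
    n = len(measured)
    return 100.0 * sum(abs(m - f) / abs(m) for m, f in zip(measured, fitted)) / n


def mean_squared_error(measured, fitted):
    """Mean squared error (MSE), in squared units of the measured quantity."""
    n = len(measured)
    return sum((m - f) ** 2 for m, f in zip(measured, fitted)) / n


# Hypothetical unit weights in kN/m³ (illustrative only).
measured = [10.0, 12.0, 14.0, 16.0]
fitted = [11.0, 12.0, 13.0, 16.0]

r2 = r_squared(measured, fitted)
mre = mean_relative_error(measured, fitted)
mse = mean_squared_error(measured, fitted)
print(f"R2 = {r2:.3f}, MRE = {mre:.2f}%, MSE = {mse:.3f}")
```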
The medians of the goodness-of-fit measures obtained for the one-variable regression models are compared in Table 3. The median is a highly resistant statistic and is of central importance in robust statistics. The best model obtained in the analyses was the power model F4, using natural water content, with a median mean relative error of 1.83% on the test data.
The MRE values of the models included in Table 3 for laboratory data are compared using box-and-whisker diagrams in Figure 11. On each box, the central mark is the median, the edges of the box are the 25th and 75th percentiles, the whiskers extend to the most extreme data points that are not considered outliers, and outliers are plotted individually.
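The quantities drawn in such a diagram can be computed as in the following sketch. It assumes the widely used convention that points lying more than 1.5 times the interquartile range beyond the box edges are treated as outliers; the text does not state which outlier criterion was applied, and quartile definitions also vary between software packages. The function name `box_plot_summary` and the sample values are hypothetical.

```python
import statistics


def box_plot_summary(values, whisker_factor=1.5):
    """Median, quartiles, whisker ends and outliers of a sample."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),    # most extreme non-outlier below the box
        "whisker_high": max(inside),   # most extreme non-outlier above the box
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical MRE values in percent, one per random data division.
mre_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 30.0]
summary = box_plot_summary(mre_values)
print(summary)
```

In this example the single large value does not shift the median, which illustrates why the median is preferred when summarizing results over many random divisions.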
Generally, we obtained better results using natural water content.
Table 4 presents a comparison of the medians of the goodness-of-fit measures obtained for the two-variable regression models (F5–F8). The best result was obtained for the F6 regression model.
The test MRE values of the models included in Table 4 for two laboratory parameters (LOIT, w) are compared in Figure 12. Model F6 had the fewest outliers.
Figure 13 shows detailed results from the F6 model with two variables (LOIT, w). The left-hand plot (Figure 13a), with the blue line, is related to the base data, and the right-hand plot (Figure 13b), with the green line, is related to the test data. The coefficients of determination are very similar.
4.2. Artificial Neural Networks Analysis
Next, we applied a neural regression model. In all the examples, a standard multi-layer perceptron with one hidden layer was applied. In this case, the nets have only one element in the output vector (soil unit weight). The number of hidden neurons was obtained from a cross-validation procedure. In the calculations, five to eight neurons were used in the hidden layer. Exactly the same pattern divisions as for the standard regression were used to train and test the networks. The medians of the goodness-of-fit measures obtained for a few architectures are compared in Table 5.
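A network of this type, with one hidden layer and a single output, can be sketched in a few lines. The example below is only a minimal illustration: the tanh activation, the linear output neuron, the plain stochastic gradient descent training, the learning rate and the scaled synthetic patterns are all assumptions, since the activation functions and the training algorithm are not stated in the text. The class name `OneHiddenLayerMLP` is hypothetical.

```python
import math
import random


class OneHiddenLayerMLP:
    """Inputs -> tanh hidden layer -> one linear output neuron."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hi for w, hi in zip(self.w2, h)) + self.b2

    def mse(self, xs, ys):
        return sum((self.predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

    def train(self, xs, ys, epochs=500, lr=0.1):
        """Stochastic gradient descent on the squared error, one pattern at a time."""
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                h = self._hidden(x)
                err = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2 - y
                for j, hj in enumerate(h):
                    # Back-propagated gradient, computed before w2[j] is updated.
                    grad_h = err * self.w2[j] * (1.0 - hj * hj)
                    self.w2[j] -= lr * err * hj
                    self.b1[j] -= lr * grad_h
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * grad_h * xi
                self.b2 -= lr * err


# Synthetic patterns scaled to [0, 1]: (w, LOI) -> unit weight. Not laboratory data.
xs = [[0.0, 0.0], [0.2, 0.1], [0.4, 0.3], [0.6, 0.5], [0.8, 0.7], [1.0, 1.0]]
ys = [1.0, 0.8, 0.55, 0.4, 0.25, 0.1]

net = OneHiddenLayerMLP(n_inputs=2, n_hidden=6)
mse_before = net.mse(xs, ys)
net.train(xs, ys)
mse_after = net.mse(xs, ys)
print(f"MSE before training: {mse_before:.4f}, after: {mse_after:.4f}")
```

In practice, the inputs and the target would be scaled to a comparable range before training, and the training would be repeated from different initial weights, as is done with the repetitions described in this section.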
The presented results are the medians of the parameters obtained for one element in the input vector. During the calculation of the values, 20 repetitions of the network training were taken into account for each of the 250 pattern divisions. In this way, 1000 results were finally taken into account. As in the standard regression, better prediction was obtained with natural water content in the input vector. A comparison of the MRE of the nets included in Table 5 is presented in Figure 14.
In this particular case, the network architecture does not affect the accuracy of approximation. The mean relative errors of testing using organic matter content were in the range of 2.73–2.76%, and those using natural water content were in the range of 1.65–1.67%. Table 6 compares the median results obtained for the nets with two elements in the input vector. As in the standard regression with two independent variables, this approach gave a better result. The mean relative errors of testing using two parameters were in the range of 1.40–1.44%.
The test MRE values of the models included in Table 6 for two laboratory parameters (LOIT, w) are compared in Figure 15. There is no clear difference between the results. The results obtained for two inputs are a little better than those for the nets with one input. The smallest median mean relative error of ANN testing was equal to 1.40%.
Figure 16 shows detailed results for the ANN’s prediction of the soil unit weight using two variables (LOIT, w) in the input vector. Figure 16a shows learning data and Figure 16b shows the test data. The coefficients of determination are very similar.
Generally, the use of neural networks allowed the soil unit weight values to be predicted from laboratory tests with very high accuracy. The ANN regression models are slightly better than the considered standard regression models. There is no clear difference between the network architectures used. The best neural networks were selected based on the lowest mean relative error.