1. Introduction
Seismic wave velocity is a practical, non-destructive, non-invasive, and cost-effective measurement related to the inherent mechanical properties of geomaterials [1]. However, seismic wave velocity is not used directly in most designs of engineering structures. Developing correlations between seismic wave velocity and engineering soil properties could facilitate the use of seismic information in the design of engineering structures.
Researchers have extensively studied the correlation between seismic wave velocity and different soil properties. Dikmen developed correlations between shear wave velocity ($V_s$) and uncorrected Standard Penetration Test (SPT-N) values for sandy, silty, and clayey soils [2]. SPT-N and shear wave velocity were strongly correlated, but the type of soil had no significant effect on the estimation of $V_s$. Gautam established correlations between shear wave velocity and uncorrected standard penetration resistance [3]. This study used 500 measurements on various sand and silt soils. The coefficient of determination for silty and sandy soils considered separately was lower than that obtained using all soils together. He also compared his results with existing correlations from the literature and found significant similarities. Hasancebi developed correlations between shear wave velocity and penetration resistance for sandy soils, clayey soils, and all soils combined (i.e., sand and clay) using regression analysis [4]. Correlations between shear wave velocity and SPT-N were found to be significant. The R-value was 0.65 for sandy soil, 0.75 for clayey soil, and 0.73 for the combined data. Hasancebi also concluded that the blow count was a significant parameter for the correlation, but the soil type had no significant influence. A good correlation was established between S-wave velocity and the degradation factor ($G_{PMT}/G_o$), where $G_{PMT}$ is the intermediate-strain shear modulus from the PMT and $G_o$ is the low-strain modulus from S-wave velocity. However, the correlations between SPT-N and field-measured S-wave and P-wave velocities were poor. Correlations were also developed between shear wave velocity and cone penetration resistance. Mayne and Rix worked on clay soils in the field and found empirical correlations between shear wave velocity ($V_s$) and cone penetration tip resistance ($q_c$) [5]. Shear wave velocity increased consistently with cone penetration resistance from soft to stiff to hard clay materials. Log regression analysis returned a coefficient of regression of 0.736. Inazaki established correlations between S-wave velocity and SPT-N, bulk density, solidity (the complement of porosity), and mean grain size of surficial unconsolidated sediments [6]. S-wave velocities were measured in boreholes using the PS suspension logging tool. The results showed that N-values can be expressed in terms of S-wave velocity when the N-value data have good accuracy. The correlation between S-wave velocity and solidity was good but depended on lithofacies and depositional age. The data also showed a good relationship between S-wave velocity and density but a weak relationship between S-wave velocity and mean grain size. Although most researchers found good correlations between S-wave velocity and N-values, some studies did not. A large number of researchers have worked on developing correlations between shear wave velocity and SPT-N [7,8,9,10,11,12,13,14,15,16,17,18,19,20,21].
Other researchers have attempted to develop correlations between seismic wave velocity and other geotechnical parameters. Evans worked with sand and clay soils to establish correlations between geophysical and geotechnical parameters [22]. Seismic refraction surveys were performed to collect S-wave and P-wave velocities. Pressuremeter test (PMT), SPT, Atterberg limit, and dry unit weight data were also collected from the Salt River Project (SRP) [23,24]. Heureux and Long developed correlations between S-wave velocity, cone penetration parameters, undrained shear strength, and 1-D compression parameters for Norwegian clay [25]. The data used for this research were collected from 29 sites in south-eastern and mid-Norway. Regression analyses were performed to establish the correlation between in situ S-wave velocity ($V_s$) and cone net resistance ($q_{net}$) from the cone penetration test. The coefficient of determination ($R^2$) was 0.73. The undrained shear strength values obtained from direct shear tests were correlated with $V_s$, with an $R^2$ of 0.91. Their analysis also showed a good correlation between pre-consolidation stress ($P_c'$) and $V_s$, with an $R^2$ value of 0.81. Johora developed artificial neural network (ANN) models to predict geotechnical parameters from S-wave and P-wave velocity separately using laboratory data for compacted clay and sandy clay soil [26]. The results indicated that P-wave velocity and S-wave velocity were more sensitive to dry density and void ratio than to saturation and water content. The ANN models that predicted geotechnical parameters from soil mix proportion and either P-wave or S-wave velocity performed better when multiple geotechnical parameters were predicted at a time. Imai et al. developed empirical correlations between index properties and seismic velocities [27]. Foti and Lancellotta used velocity data published by Hunter to show how S-wave and P-wave velocity depend on porosity [28,29]. Alshameri and Madun showed that a direct positive linear correlation exists between seismic wave velocity and both cohesion and shear strength for compacted sand-kaolin mixtures. They also attempted to correlate seismic wave velocity with friction angle but found the correlation to be insignificant [30]. Duan et al. developed correlations between shear wave velocity and vertical effective stress, unit weight, preconsolidation stress, and undrained shear strength for clay soils [31].
ANN is gaining popularity as a problem-solving tool in civil engineering. Researchers have used ANN to predict concrete compressive strength [32,33,34], ultrasonic pulse velocity [35], and the slump of concrete [36,37,38]. Zeh showed the application of ANN to assess the nonlinear behavior of steel structures [39]. ANN has also been used to forecast the flexure and initial stiffness of beam-column joints [40]. In geotechnical engineering, ANN has been used to study slope stability [41], to analyze piles [42,43], to develop correlations between ER and geotechnical parameters [44], and to analyze liquefaction potential [45,46]. In transportation engineering, researchers have applied ANN to develop transportation systems [47].
ANN can solve complex problems, but its performance depends on the size and accuracy of the data set. A large data set helps to train the network efficiently, but in many fields large data sets are not easily available. Researchers are therefore working on developing ANN models using small data sets. Pasini described a neural network tool capable of handling small data sets and its application to a specific case study [48]. Feng et al. used a deep neural network to predict material defects using a small data set [49].
The literature contains a good number of correlations between seismic wave velocity and blow counts. There are fewer correlations with other important geotechnical parameters, such as water content, dry density, cohesion, angle of friction, saturation, and void ratio. More study is necessary to establish correlations between seismic wave velocity and these geotechnical parameters. Many existing studies employ conventional regression methods to develop correlations between geotechnical parameters and seismic wave velocity, even though ANN has been used in many fields of civil engineering. In this study, multiple regression analysis and the ANN approach were used to develop relationships between seismic wave velocity and geotechnical parameters or, conversely, to predict geotechnical parameters from seismic wave velocity and other geotechnical parameters, using data from the literature. The performance of the ANN and regression analyses was compared. Two ANN approaches, with and without validation, are also discussed as ways to handle the small data set.
2. Seismic Wave Velocity
Soil allows for the propagation of different types of seismic waves. Waves that deform the material through shear are referred to as shear waves, and those that produce volumetric deformations are referred to as compressional waves. These are often referred to as S-waves and P-waves, respectively. Seismic wave velocity is related to the maximum shear modulus, bulk modulus, Young’s modulus, bulk density, and Poisson’s ratio of the soil [50].
The longitudinal P-wave velocity ($V_p$) and the transverse S-wave velocity ($V_s$) in an infinite elastic continuum are related to the elastic properties by

$$V_p = \sqrt{\frac{M}{\rho}} = \sqrt{\frac{B + \frac{4}{3}G}{\rho}}$$

$$V_s = \sqrt{\frac{G}{\rho}}$$

where $M$ (Pa) is the constrained modulus, $B$ (Pa) is the bulk modulus, $G$ (Pa) is the shear modulus, and $\rho$ (kg/m³) is the mass density of the medium. Hence, the propagation velocity increases with the material stiffness and decreases with its mass density (inertia). Because $M = B + \frac{4}{3}G$ is larger than $G$, the velocity of S-waves is always smaller than the velocity of P-waves [50].
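As a minimal numerical sketch of the relations above, the snippet below computes both velocities from the bulk modulus, shear modulus, and mass density, assuming the standard isotropic elastic relation $M = B + \frac{4}{3}G$. The function name and the input values are hypothetical and chosen only for illustration; they are not taken from the data used in this study.

```python
import math


def wave_velocities(bulk_modulus_pa, shear_modulus_pa, density_kg_m3):
    """Return (V_p, V_s) in m/s for an infinite isotropic elastic continuum."""
    # Constrained modulus from bulk and shear moduli: M = B + 4G/3
    constrained_modulus_pa = bulk_modulus_pa + 4.0 * shear_modulus_pa / 3.0
    v_p = math.sqrt(constrained_modulus_pa / density_kg_m3)
    v_s = math.sqrt(shear_modulus_pa / density_kg_m3)
    return v_p, v_s


# Hypothetical stiffness and density values (illustration only)
v_p, v_s = wave_velocities(
    bulk_modulus_pa=100e6, shear_modulus_pa=50e6, density_kg_m3=1800.0
)
print(f"V_p = {v_p:.1f} m/s, V_s = {v_s:.1f} m/s")
```

For any positive moduli the returned S-wave velocity is smaller than the P-wave velocity, which is the ordering stated in the text.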
For fluid-filled porous media, the effective bulk modulus $B_{eff}$ is given by Gassmann [50]:

$$B_{eff} = B_{SK} + \frac{\left(1 - \frac{B_{SK}}{B_g}\right)^2}{\frac{\phi}{B_f} + \frac{1 - \phi}{B_g} - \frac{B_{SK}}{B_g^2}}$$

where $B_{SK}$ (Pa) is the bulk modulus of the skeleton, $B_g$ (Pa) is the bulk modulus of the grains, $B_f$ (Pa) is the bulk modulus of the fluid phase, and $\phi$ is the porosity. In the Gassmann model, the shear modulus of the soil, $G_{eff}$ (Pa), remains unaffected by the presence of the fluid at low excitation frequencies:

$$G_{eff} = G_{SK}$$
For partially saturated soils, the mass density of the mixture, $\rho_{mix}$, changes due to the different densities of the saturating fluids. By ignoring granular effects, fluid substitution can be used to modify the expression for the effective bulk modulus of the soil. For soil with water saturation $S_w$, the fluid bulk modulus in Equation (5) is given by

$$B_f = \left(\frac{S_w}{B_w} + \frac{1 - S_w}{B_a}\right)^{-1}$$

where $B_w$ (Pa) is the bulk modulus of the liquid phase and $B_a$ (Pa) is the bulk modulus of the air phase. Small volumes of air produce a large decrease in the modulus of the fluid phase. However, under dynamic loading, differences in inertia, shear stiffness, and bulk compressibility can add further complexity to the analysis.
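The effect of a small air content can be illustrated numerically. The sketch below assumes the standard form of the Gassmann expression and a harmonic (volume-weighted compliance) average for the water-air mixture. The function names and the modulus values (water about 2.2 GPa, air about 0.142 MPa, quartz grains about 37 GPa, skeleton 50 MPa) are assumptions chosen for illustration and are not taken from the source.

```python
def fluid_bulk_modulus(s_w, b_water_pa, b_air_pa):
    """Bulk modulus of a water-air mixture at water saturation s_w (0 to 1)."""
    # Compliances (1/B) add in proportion to the volume fractions
    return 1.0 / (s_w / b_water_pa + (1.0 - s_w) / b_air_pa)


def gassmann_bulk_modulus(b_sk_pa, b_g_pa, b_f_pa, porosity):
    """Effective bulk modulus of a fluid-filled porous medium (Gassmann)."""
    numerator = (1.0 - b_sk_pa / b_g_pa) ** 2
    denominator = (
        porosity / b_f_pa
        + (1.0 - porosity) / b_g_pa
        - b_sk_pa / b_g_pa ** 2
    )
    return b_sk_pa + numerator / denominator


# Assumed, typical-order-of-magnitude values (illustration only)
B_WATER, B_AIR, B_GRAIN, B_SKELETON = 2.2e9, 1.42e5, 37e9, 50e6

for s_w in (1.0, 0.99, 0.90):
    b_f = fluid_bulk_modulus(s_w, B_WATER, B_AIR)
    b_eff = gassmann_bulk_modulus(B_SKELETON, B_GRAIN, b_f, porosity=0.4)
    print(f"S_w = {s_w:.2f}: B_f = {b_f:.3e} Pa, B_eff = {b_eff:.3e} Pa")
```

With these assumed inputs, replacing 1% of the pore water with air lowers the fluid modulus by more than two orders of magnitude, which is why the P-wave velocity is so sensitive to saturation near the fully saturated state.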
Seismic wave propagation in granular soil materials is more complicated due to the complex behavior of the solid skeleton and the influence of capillary forces. The skeleton moduli $B_{SK}$ and $G_{SK}$ depend on the “strength” of the grain contacts and are therefore dependent upon the applied effective stress. The concept of effective stress for soils at low saturation is still an area of active research because internal forces associated with capillarity and electrical forces at the grain surface play an important role [50]. For this reason, empirical relationships are often necessary to predict seismic wave propagation in a partially saturated particulate medium.
There are numerous methods for measuring seismic wave velocity in the field and the laboratory. In the laboratory, the “time of flight” approach is common. A seismic wave is generated using a source in contact with one end of the sample; the disturbance passes through the soil and is detected by a receiver at the opposite end of the sample. Velocity is calculated by dividing the distance (sample length) by the measured travel time. The soil samples usually consist of remolded soil or a field sample with some degree of disturbance. The frequency of the seismic wave is usually in the 10–100 kHz range. Field measurements of seismic velocity can be performed using surface surveys such as refraction surveys or surface wave analysis. These approaches mitigate the problem of soil disturbance since no samples are required. However, they are less repeatable and have larger uncertainty in the measured values. The frequency of the seismic wave in the field ranges from 10 Hz to several hundred Hz. It should be noted that field-measured seismic velocities are usually lower than laboratory-measured velocities.
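The laboratory time-of-flight calculation described above reduces to a single division. The sketch below is illustrative only: the function name is hypothetical, the 4.58 in sample length is borrowed from the mold height reported later in this paper, and the travel time is an invented value.

```python
INCH_TO_M = 0.0254


def time_of_flight_velocity(sample_length_m, travel_time_s):
    """Velocity (m/s) = sample length divided by first-arrival travel time."""
    if travel_time_s <= 0.0:
        raise ValueError("travel time must be positive")
    return sample_length_m / travel_time_s


# Hypothetical reading: 4.58 in long sample, 250 microsecond travel time
length_m = 4.58 * INCH_TO_M
velocity = time_of_flight_velocity(length_m, travel_time_s=250e-6)
print(f"V = {velocity:.0f} m/s")
```

In practice the measured travel time may also need a correction for delays in the transducers and electronics; that correction is not modeled here.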
3. Data Collections
This study uses data from a report on seismic wave velocity (P-wave) published by the Engineering Research Institute of Iowa State University, Ames, Iowa [51]. Field measurements of seismic wave velocity were conducted on highway embankments. The embankments were constructed with three types of soil, namely clay loam, silty clay (two weathering variations, gray and brown), and silty loam. For laboratory measurements, samples were collected from the side slope of the highway embankment adjacent to the field measurement location. One additional soil type, loess, was used for the laboratory measurements. The types of soils investigated and their liquid limit and plasticity index are shown in Table 1.
Micro-seismic refraction tests were conducted at 34 different locations to measure seismic velocities in the field. The equipment consisted of three components: an impact source, a receiving transducer, and a seismic timer. A model 217 Micro-Seismic Timer and a transducer were used for these tests. A tack hammer striking a 5/8-inch diameter steel ball bearing was used as the impact source to transmit energy into the ground. Seismic measurements were taken along a 2 ft line by moving the receiver in 3-in intervals. A total of 10 first-arrival measurements were collected at each station. The seismic wave velocities were calculated from the slope of the distance-time plots. At the midpoint of the seismic line, a standard rubber balloon volumetric density measurement and an in situ water content measurement were performed [52]. The range of values associated with the field measurements is listed in Table 2.
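The slope calculation can be sketched as a least-squares line fitted to first-arrival time against source-receiver distance, with the velocity taken as the inverse of the slope. This assumes a single direct wave along the surface layer. The function name and the synthetic arrival times below are hypothetical; only the 3-in spacing along a 2 ft line follows the description above.

```python
def refraction_velocity(distances_ft, arrival_times_s):
    """Velocity (ft/s) from a least-squares fit of arrival time vs distance."""
    n = len(distances_ft)
    mean_d = sum(distances_ft) / n
    mean_t = sum(arrival_times_s) / n
    s_dd = sum((d - mean_d) ** 2 for d in distances_ft)
    s_dt = sum(
        (d - mean_d) * (t - mean_t)
        for d, t in zip(distances_ft, arrival_times_s)
    )
    slope_s_per_ft = s_dt / s_dd  # travel time gained per foot of offset
    return 1.0 / slope_s_per_ft


# Receiver offsets every 3 in (0.25 ft) along a 2 ft line
offsets_ft = [0.25 * k for k in range(1, 9)]
# Synthetic first arrivals: constant trigger delay plus distance / velocity
times_s = [0.0002 + d / 1200.0 for d in offsets_ft]
print(f"V = {refraction_velocity(offsets_ft, times_s):.0f} ft/s")
```

Fitting the slope rather than dividing a single distance by a single time removes any constant delay in the timer, since a constant offset changes the intercept but not the slope.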
In the laboratory, 35 soil samples were compacted in molds 4 inches in diameter and 4.58 inches high. Standard and modified AASHO compaction test procedures were followed to prepare the samples. Water content and dry density were determined in the laboratory for all 35 samples. Liquid limit (LL) and plasticity index were determined for the different soil types (5 samples in total). Seismic velocities were measured on all 35 samples using the pulse-transmission method.
7. Comparison between ANN and Regression Models
Regression analysis is a conventional way to determine the relationship between independent and dependent variables. It determines how strongly the independent variables affect the dependent variables. Multiple linear regression analyses are performed using the same databases used for the ANN models. The performance of the models is evaluated based on RMSE, MARE, unnormalized ASE, and $R^2$. For predicting seismic wave velocity, three different data sets are used: lab data, field data, and combined lab and field data. The input parameters for the seismic wave velocity models are LL, plasticity index, water content, and dry density.
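The four performance measures named above can be sketched as follows. The definitions are assumptions, since this section does not spell them out: ASE is taken as the average squared error, RMSE as its square root, MARE as the mean absolute relative error in percent, and $R^2$ as one minus the ratio of residual to total sum of squares. The function names are hypothetical.

```python
import math


def ase(actual, predicted):
    """Average squared error (assumed definition)."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(ase(actual, predicted))


def mare(actual, predicted):
    """Mean absolute relative error in percent (assumed definition)."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed form)."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values, for illustration only
measured = [1.0, 2.0, 3.0, 4.0]
modeled = [2.0, 2.0, 3.0, 4.0]
print(ase(measured, modeled), rmse(measured, modeled))
print(mare(measured, modeled), r_squared(measured, modeled))
```

Under these definitions RMSE and ASE carry the units of the predicted quantity (and its square), so they are only comparable between models that predict the same parameter, whereas MARE and $R^2$ are dimensionless.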
For predicting water content, LL, plasticity index, dry density, and velocity are used as input parameters. Models are developed for the lab, field, and combined lab and field data sets. For predicting dry density, the same data sets and approach are used, except that the input parameters are LL, plasticity index, water content, and seismic wave velocity. The results of the regression analysis are shown in Table 7.
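A multiple linear regression of this kind can be sketched with ordinary least squares. The snippet below solves the normal equations with Gauss-Jordan elimination using only the standard library; it is a minimal illustration, not the software used in this study. The function names are hypothetical and the seven data rows (LL, plasticity index, water content, dry density, and velocity) are invented values, not entries from the data set.

```python
def fit_linear_regression(rows, targets):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    design = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(design[0])
    # Augmented matrix [X^T X | X^T y]
    aug = [
        [sum(x[i] * x[j] for x in design) for j in range(p)]
        + [sum(x[i] * y for x, y in zip(design, targets))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("inputs are collinear; coefficients are not unique")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def predict(coefficients, row):
    """Evaluate the fitted linear model for one input row."""
    return coefficients[0] + sum(c * x for c, x in zip(coefficients[1:], row))


# Invented rows: (LL, plasticity index, water content %, dry density pcf)
inputs = [
    (35, 15, 14.0, 110.0), (35, 15, 18.0, 105.5), (42, 20, 16.0, 108.0),
    (42, 20, 21.0, 102.0), (30, 8, 12.0, 114.0), (30, 8, 17.0, 108.5),
    (50, 28, 20.0, 101.0),
]
velocity_fps = [1500.0, 1350.0, 1400.0, 1200.0, 1700.0, 1500.0, 1150.0]

coefficients = fit_linear_regression(inputs, velocity_fps)
print(predict(coefficients, (38, 17, 15.0, 109.0)))
```

Because LL and plasticity index take only one value per soil type, a data set with few soil types leaves these two inputs nearly collinear with the intercept, and the fitted coefficients for them should be interpreted with care.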
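Table 7 presents the statistics for the ANN (without validation) and regression models. To compare the ANN models with the regression models, the unnormalized ASE is calculated using the unnormalized actual and predicted values. Additionally, the root mean squared error (RMSE) is calculated using

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

where $y_i$ and $\hat{y}_i$ are the actual and predicted values, respectively, and $n$ is the number of data points.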
For predicting seismic wave velocity, the regression model using laboratory data showed a higher $R^2$ value and lower MARE, but higher ASE and RMSE, than the field data model. Combining the field and lab data increased the errors and gave an $R^2$ value between those of the individual lab and field models. For predicting water content, the regression model based on lab data also showed the best performance. The performance of the field data model and the combined field and lab data model was very similar. For predicting dry density, the best performance was again observed for the lab data model. The second-best model used the combined field and lab data, and the lowest performance was observed for the field data.
In all cases, the ANN models resulted in better $R^2$ values than the regression analysis. Significant improvements are observed in three cases: $R^2$ increases by 22.35% when field data are used to predict water content, by 24.18% when field data are used to predict dry density, and by 44.87% when combined field and lab data are used to predict seismic wave velocity. Errors are lower for the ANN models than for the regression models, with some minor exceptions. The reductions are largest in the same three cases. Using field data to predict water content, RMSE decreases by 50%, ASE by 130%, and MARE by 55.84%. Using field data to predict dry density, RMSE decreases by 81.48%, ASE by 222.67%, and MARE by 83.33%. For the model built with field and lab data to predict velocity, RMSE decreases by 59.83%, ASE by 155.21%, and MARE by 66.49%. The ANN models therefore appear to predict more accurately than regression analysis.
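Several of the reported ASE decreases exceed 100%, which cannot occur if the change is expressed relative to the regression error. One reading that is roughly consistent with the reported figures is that the change is expressed relative to the ANN error, and that ASE scales with the square of RMSE. This is an inference from the numbers, not a definition stated in this section, and the function names below are hypothetical.

```python
def percent_change_relative_to_ann(regression_error, ann_error):
    """Assumed convention: (regression - ANN) / ANN, in percent."""
    return (regression_error - ann_error) / ann_error * 100.0


def implied_ase_decrease(rmse_decrease_percent):
    """ASE change implied by an RMSE change if ASE is proportional to RMSE^2."""
    ratio = 1.0 + rmse_decrease_percent / 100.0  # regression RMSE / ANN RMSE
    return (ratio ** 2 - 1.0) * 100.0


# Reported (RMSE decrease %, ASE decrease %) pairs from the text
reported = {
    "water content, field": (50.0, 130.0),
    "dry density, field": (81.48, 222.67),
    "velocity, field and lab": (59.83, 155.21),
}
for case, (rmse_pct, ase_pct) in reported.items():
    implied = implied_ase_decrease(rmse_pct)
    print(f"{case}: implied ASE {implied:.1f}% vs reported {ase_pct:.2f}%")
```

The implied and reported ASE values agree to within a few percentage points, and the remaining gap could come from rounding of the tabulated errors. If the tabulated values in Table 7 are available, the convention can be checked directly by applying `percent_change_relative_to_ann` to them.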