Implementation of Machine Learning Algorithms in Spectral Analysis of Surface Waves (SASW) Inversion

Abstract: One of the complex processes in spectral analysis of surface waves (SASW) data analysis is the inversion procedure. An initial soil profile needs to be assumed at the beginning of the inversion analysis, which involves calculating the theoretical dispersion curve. If the assumed starting soil profile model is not reasonably close, the iteration process might fail to converge or take too long to converge. Automating the inversion procedure will allow us to evaluate the soil stiffness properties conveniently and rapidly by means of the SASW method. Multilayer perceptron (MLP), random forest (RF), support vector regression (SVR), and linear regression (LR) algorithms were implemented in order to automate the inversion. For this purpose, the dispersion curves obtained from 50 field tests were used as input data for all of the algorithms. The results illustrated that SVR algorithms could potentially be used to estimate the shear wave velocity of soil.


Introduction
The spectral analysis of surface waves (SASW) method is a nondestructive technique used to determine the shear wave velocity of layered material by using the theory of stress wave propagation. This method employs the dispersive characteristics of Rayleigh waves to determine the variation of stiffness with depth. Rayleigh waves propagate along a cylindrical wavefront near the surface of a half-space, and the amplitude of particle motion decays exponentially with depth [1]. For a vertical load on a homogeneous isotropic half-space, Rayleigh waves carry two thirds of the energy [2]. Rayleigh wave phase velocity primarily depends on the material properties, specifically shear wave velocity, compression wave velocity, Poisson's ratio, and mass density, to a depth of one wavelength, as shown in Figure 1. In geotechnical engineering, the SASW method has been used extensively to determine various soil parameters, such as bearing capacity [3][4][5], small-strain stiffness [6], and pavement subgrade stiffness [7]. One of the complex processes in SASW data analysis is the inversion procedure. Studies related to the inversion procedure have gone in many directions because of the procedure's complexity, including methods for generating the theoretical dispersion curve, simplification of the procedure, problems with multiple modes of the dispersion curve, and nonuniqueness of the solution. Nazarian [9] used a modification of the Haskell-Thomson technique [10,11] to generate the theoretical dispersion curve. He employed the INVERT program, which is computer software designed for assuming the initial stiffness profile reported by Ballard [12], to compare the shear wave velocities directly. The differences between the shear wave velocity profiles from Ballard's results and those from the Nazarian study were substantial. Therefore, the simplified inversion process based upon scaling the dispersion curve, employed by many researchers, is not suitable.
Hossain and Drnevich [13] found some limitations in Nazarian's technique. The method used to find the roots of the characteristic determinant for a layered system is quite complicated. Another limitation is the trial-and-error process of matching the theoretical and experimental dispersion curves. To overcome these difficulties, Hossain and Drnevich applied two methods, implemented in FORTRAN 77, to determine the pavement moduli and thicknesses. The first was a finite difference technique for analyzing generalized Rayleigh waves in multilayered elastic media. This method is significant for determining the theoretical dispersion curve, as it leads to a quadratic eigenvalue problem for the pavement system and a linear eigenvalue problem for the geologic model. The second technique used a modification of Knopoff's [14] algorithm to determine the theoretical dispersion curve. Later on, Yuan and Nazarian [15] developed an algorithm based on the energy integral equation to overcome this problem. Moreover, Addo and Robertson [16] developed a unique SASW system requiring no spectrum analyzer. A microcomputer was used to conduct the entire procedure and to determine the in situ dispersion curve. A program called "SASWFM" was written to automate the inversion process, so that the shear wave velocity can be obtained directly from the experimental dispersion curve. The results were compared to Seismic Cone Penetration Testing (SCPT) results and showed a relatively poor match between the experimental and theoretical dispersion curves at some of the sites. Overall, the microcomputer-based system reduces the testing time and the cost of equipment. To overcome the difficulties associated with the presence of multiple modes in SASW signals, Zomorodian and Hunaidi [17] introduced a new inversion method based on the maximum vertical flexibility.
Meier and Rix [18] proposed the back-calculation neural network to replace the intensive computation of trial and error and the least-squares inversion method. Studies related to SASW inversion are summarized in Table 1.
The complexity of the SASW method in terms of data reduction has encouraged researchers to find a much more straightforward alternative without relinquishing reliability.
Another motivation for much simpler analysis is to make SASW available for all users, be they experienced or beginners. Recently, a guideline for nonexpert users in surface wave acquisition and analysis has been published by Foti et al. [19]. The guideline presented a wide range of concerns related to the surface wave technique, including the nonuniqueness problem of SASW solutions.

Table 1. Studies related to SASW inversion.
- The discrete layer stiffness matrix method initially developed by Lysmer [20] and Lysmer and Waas [21]; the nontranscendental quadratic eigenvalue problem.
- Addo and Robertson [16]: Nelder and Mead's simplex method [22]; automated using optimization techniques with a least-squares criterion; the number of iterations needs to be increased.
- Yuan and Nazarian [15]: linearized least-squares approximation.
- Meier and Rix [18] and Williams and Gucunski [23]: back-calculation neural networks; the network required a greater number of more complex mappings for training.
- Zomorodian and Hunaidi [17]: the SASW-INVERT program; the maximum vertical flexibility coefficient of the layered soil system.

In conventional SASW inversion analysis, a soil profile needs to be assumed for the inversion process, which involves calculating the theoretical dispersion curve by means of forward modeling [24]. The misfit between the experimental and theoretical dispersion curves is then iteratively and automatically minimized to a predefined small value. If the assumed starting soil profile model is not reasonably close, the iteration process might fail to converge or take too long to converge. Therefore, each step in the inversion process requires careful execution. Expert user opinions are often necessary in order to resolve uncertain interpretations of the shear wave velocity profile and to define the criteria for a final profile. To reduce the complexity of the inversion process, several machine learning (ML) algorithms are proposed in this study.
The main objective of the present study is to accelerate and simplify the inversion process by training the selected ML algorithms to 'understand' how conventional SASW inversion comes out with a shear wave velocity profile from the corresponding experimental dispersion curve, as illustrated in Figure 2.
An artificial neural network (ANN) is a popular ML model which has been adopted in recent years to automate the SASW inversion method [23,25-27]. Williams and Gucunski [23] determined the moduli and thicknesses of a four-layered pavement system through the use of an ANN. Later, Gucunski et al. [27] extended this work by developing five ANN models, with each model determining a single property of the pavement system. Shirazi et al. [25] developed a number of different ANN models to automate the SASW method in pavements. They used several points of the dispersion curve as input and the thicknesses and the elastic moduli of the layers as output. Three ANN software packages (STATISTICA, ANN toolbox by MATLAB, and NeuralSIM) were employed, and the models could generate reasonably close elastic moduli for the upper layers. Alimoradi et al. [26] used the results of SASW tests to train an ANN for the classification of shear wave velocity. They conducted nine SASW tests and downhole tests (DHT) and determined the unknown nonlinear relationships between the SASW results and those obtained by DHT, the latter of which were treated as real values. The results show that the backpropagation neural network could accurately predict the shear wave velocity between wells. Therefore, given the performance of ANNs in SASW inversion reported by previous researchers, an ANN model is adopted in this study and compared to other ML algorithms.

Inversion Analysis
The conventional inversion method used in the present study is described in this section. Determination of the Starting Model Parameter (SMP), depth resolution analysis, layer sensitivity analysis, and final inversion analysis are the major steps in the inversion procedure, as shown in Figure 2 (on the right-hand side of the flow diagram). Firstly, the SMP was determined based on the theoretical dispersion curve and the assumption of the parameters. An assumption of the number of layers, as well as of each layer's thickness, was made for the iterative procedure [15]. Subsequently, depth resolution analysis and layer sensitivity analysis refined the starting model in the first step. Several iterations are required for the steps of refining the starting model. These three steps constitute the preliminary inversion analysis. After completing the preliminary inversion analysis, the final inversion analysis, which has been denoted as Inversion Engine in the flow diagram, can be performed.
The starting model is called a preliminary shear wave velocity profile, since shear wave velocity is used as a model parameter for the inversion procedure. There are two steps involved in constructing the preliminary shear wave velocity profile. A preliminary shear wave velocity profile is determined as the first step. Based on the experimental dispersion data distribution, layer thicknesses and the number of layers are calculated for the preliminary profile. In the second step, another profile is determined from the preliminary profile based on the layering determined in the first step. The second profile is used for the inversion analysis.
The measurement of depth resolution evaluates the optimum resolvable depth for a given experimental dispersion curve [24]. The deepest resolvable layer is dependent on the penetration depth or the zone stressed by the surface wave. The penetration depth is influenced by the frequencies generated by the source on the surface and the stiffness structure of the subsurface. Inversion analysis is very reliant on the layering of the starting model. Layering of the subsurface structure is assumed initially for the inversion technique. Consequently, a measure must be provided so as to indicate how good the assumed layering is. Layer sensitivity analysis decides whether or not the assumed layering is suitable. The assumed layering could be improved to provide a proper resolution, depending on the layer sensitivity analysis [24].
The inversion engine, as illustrated in Figure 2, represents the principal flow of the inversion analysis. This engine has two components: one calculates the root mean square error (RMSE), and the other updates the model parameters. The parameters include shear wave velocity, Poisson's ratio, compression wave velocity, mass density, and the material damping ratio [28]. Wave velocities are updated in every iteration; however, parameters such as Poisson's ratio, mass density, and the material damping ratio are reasonably assumed, since their effect on phase velocity is small. Compression wave velocity is explicitly related to shear wave velocity and Poisson's ratio. Thus, it can be calculated using the assumed Poisson's ratio and the updated shear wave velocity. Therefore, only shear wave velocity was used as the model parameter for the inversion analysis. The sensitivity matrix, G, and the change in a model parameter, Δm, are determined to update the model parameters.
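The dependence of compression wave velocity on shear wave velocity and Poisson's ratio can be illustrated with the standard relation for an isotropic linear elastic medium, $V_P = V_S\sqrt{2(1-\nu)/(1-2\nu)}$. The sketch below is only an illustration of that relation; the function name and the example value of 200 m/s are hypothetical and are not taken from the inversion software used in this study.

```python
import math

def compression_wave_velocity(vs, poisson_ratio):
    """Return Vp (same units as vs) for an isotropic elastic medium.

    Uses Vp = Vs * sqrt(2 (1 - nu) / (1 - 2 nu)); valid for 0 <= nu < 0.5.
    """
    if not 0.0 <= poisson_ratio < 0.5:
        raise ValueError("Poisson's ratio must lie in [0, 0.5)")
    ratio = 2.0 * (1.0 - poisson_ratio) / (1.0 - 2.0 * poisson_ratio)
    return vs * math.sqrt(ratio)

# Hypothetical layer: Vs = 200 m/s with a Poisson's ratio of 0.3333.
vp_example = compression_wave_velocity(200.0, 0.3333)
```

For a Poisson's ratio close to one third, the ratio $V_P/V_S$ is close to 2, so only $V_S$ needs to be carried as an unknown during the iterations.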
After completing the preliminary inversion analysis, the final inversion analysis can be performed. Iteration of the inversion analysis is continued until the theoretical dispersion curve is similar enough to the experimental dispersion curve [29]. $\mathrm{RMSE}_S$ measures the misfit between the theoretical and experimental dispersion curves and evaluates the goodness of the shear wave velocity profile. The subscript 'S' is added to the abbreviation RMSE to distinguish the misfit in dispersion curves from the RMSE in ML training. A lower $\mathrm{RMSE}_S$ indicates a better match between the theoretical and experimental dispersion curves [30]. Since iteration is incorporated into the inversion analysis, there must be a measure with which to quantify the goodness of the estimated model parameters and the predicted shear wave velocity. The prediction error can be defined in several ways, but the maximum likelihood approach for the inversion analysis justifies the use of the $L_2$ norm, which is a loss function that reflects least-squares errors, as defined in Equation (1). The maximum likelihood approach is one of the procedures used to find the optimum shear wave velocity for an experimental dispersion curve:

$$\lVert e \rVert_2 = \left( \sum_{i} e_i^2 \right)^{1/2} \quad (1)$$

where $e_i$ is the difference between the observed data and the shear wave velocity. The $\mathrm{RMSE}_S$ in Equation (2) is a representation of the $L_2$ norm. The $\mathrm{RMSE}_S$ is independent of the number of datasets and provides the absolute sense of the average misfit:

$$\mathrm{RMSE}_S = \sqrt{\frac{1}{n} \sum_{i=1}^{n} e_i^2} \quad (2)$$

where $n$ is the number of datasets and $i$ represents the variables. Evaluation of a stiffness profile requires an experimentally determined phase velocity dispersion curve, from which the profile is produced via iterative forward modeling analysis or inversion analysis [15,31]. This inherent problem of inversion analysis requires either several tries of an initial guess or a reasonable initial guess.
To simplify this inversion process, the experimental dispersion curve and the final shear wave velocity were used to train the ML algorithm, as indicated in Figure 2.

Machine Learning (ML) Algorithms
Machine learning (ML) is an application of artificial intelligence which is able to perform tasks by observing previous examples and information without being explicitly programmed. To predict the shear wave velocity profile from the experimental dispersion curve, an ANN known as MLP was adopted in this study. The predictive performance of the ANN is then compared with other ML algorithms, namely RF, support vector machine (SVM), and LR. An overview of each algorithm, including the unique features and advantages that motivate its use in this study, is given in the present section. Since the dataset from the SASW inversion consists of a set of inputs and outputs, all of the chosen algorithms are supervised ML algorithms. The workflow of a supervised ML model is shown in Figure 3.

Multilayer Perceptron (MLP)
Multilayer perceptron (MLP) is a class of feed-forward ANN. The structure of an MLP includes an input layer, an output layer, and one or more hidden layers. This algorithm learns a function, $f(\cdot): R^n \rightarrow R^o$, by training on a dataset, wherein $n$ is the number of inputs and $o$ is the number of outputs. Figure 4 shows the MLP network with one hidden layer. The set of neurons $\{x_i \mid x_1, x_2, \ldots, x_n\}$ represents the input layer, which is raw data collected from the field. Each neuron in the hidden layer transforms the values from the input layer with a weighted linear summation $w_1 x_1 + w_2 x_2 + \ldots + w_n x_n$, followed by a nonlinear activation function $g(\cdot): R \rightarrow R$, such as the hyperbolic tangent function. Thereafter, the output layer receives the values from the last hidden layer.
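The forward pass described above can be sketched in a few lines. This is a minimal illustration with one hidden layer and a hyperbolic tangent activation, written under the assumptions that each neuron also carries a bias term and that the output layer is linear (a common choice for regression). The function name, weights, and inputs are hypothetical and do not reproduce the trained network of this study.

```python
import math

def mlp_forward(x, hidden_weights, hidden_bias, output_weights, output_bias):
    """Forward pass of an MLP with one tanh hidden layer and a linear output layer."""
    # Hidden layer: weighted linear summation of the inputs, then g = tanh.
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_weights, hidden_bias)
    ]
    # Output layer: linear combination of the hidden activations.
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(output_weights, output_bias)
    ]

# Hypothetical example: two scaled inputs (wavelength, phase velocity),
# three hidden neurons, one output (shear wave velocity, scaled).
example_output = mlp_forward(
    x=[0.2, 0.7],
    hidden_weights=[[0.5, -0.1], [0.3, 0.8], [-0.6, 0.4]],
    hidden_bias=[0.0, 0.1, -0.1],
    output_weights=[[1.0, 0.5, -0.3]],
    output_bias=[0.2],
)
```

Training consists of adjusting the weights and biases so that the outputs match the target values; that step is omitted here.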

Random Forest (RF)
Random forest (RF) is an effective model for predictive analysis, as it uses ensemble learning methods for classification and regression [32]. However, fine-tuning of its hyperparameters is required by optimization algorithms for excellent RF modeling. This algorithm has been employed for solving geotechnical engineering problems by, for example, assessing pile drivability [33], the undrained shear strength of soft clays [34], and liquefaction potential [35]. RF regression is based on decision trees for creating predictive models. Each tree is trained on a randomly sampled subset of the input data. If a training dataset is used as input with targets and features in the decision tree, the algorithm will formulate some set of rules. Each tree works individually, with no interaction happening between them. The RF model splits the nodes in each tree, considering a limited number of features, and combines hundreds or thousands of decision trees. Thereafter, the results are predicted based on the average predictive value of each tree. Figure 5 is a representation of the RF decision trees.
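The averaging idea behind RF regression can be sketched as follows. This is a deliberately reduced illustration, not the RF implementation used in this study: each "tree" is a single-split regression stump trained on a bootstrap sample, and, because only one feature is used, the random feature selection at each node is omitted. All function names and data values are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree on one feature by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # All feature values identical: predict the sample mean everywhere.
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean

def fit_forest(xs, ys, n_trees=50, seed=0):
    """Train each stump independently on a bootstrap sample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """The ensemble prediction is the average of the individual tree predictions."""
    return sum(predict_stump(s, x) for s in forest) / len(forest)

# Hypothetical data: phase velocity (input) against shear wave velocity (target).
phase_velocity = [90.0, 95.0, 100.0, 105.0, 260.0, 270.0, 280.0, 290.0]
shear_velocity = [100.0, 100.0, 100.0, 100.0, 300.0, 300.0, 300.0, 300.0]
forest = fit_forest(phase_velocity, shear_velocity)
low_prediction = predict_forest(forest, 92.0)
high_prediction = predict_forest(forest, 285.0)
```

A full RF grows deeper trees and considers a random subset of features at every split, but the prediction step is the same average over trees.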

Support Vector Machine (SVM)
The support vector machine (SVM) is a supervised ML algorithm that produces accurate predictions with comparatively little computational power [36]. It can be used for both regression and classification tasks [37]. SVR is derived from the SVM and is a prediction method that optimizes prediction precision while automatically guarding against overfitting the data. Rather than the conventional empirical risk minimization (ERM) principle, the SVM is trained using the structural risk minimization (SRM) principle [38]. ERM minimizes only the errors associated with the training dataset, whereas SRM simultaneously minimizes the empirical risk and the model complexity. In other words, the SRM principle implies a tradeoff between the quality of the approximation and the complexity of the approximating function. SVR thus prefers smooth models that do not overfit the training data, which is a requirement for successful generalization to unseen (testing) data.
The principles of SVR are similar to those of the SVM for classification, but there are a few differences between them. SVR requires a set of inputs and corresponding outputs to build the model. The SVM finds a hyperplane in the feature space which separates the data points into two classes [39]. Many hyperplanes could be chosen to separate the data points. The choice is made by maximizing the margin distance between the data points of both classes. Samples on the margin are called support vectors. SVR finds a function $f(x)$ that deviates by at most the margin of tolerance ($\varepsilon$) from the actual target value $y$ for all of the training data. The linear function is defined as Equation (3):

$$f(x) = \langle w, x \rangle + b \quad (3)$$

where $x$ is used to estimate the scalar $y$ by means of the n-dimensional weighting coefficient $w$, and the constant coefficient is $b$. The margin of tolerance can be minimized and calculated as:

$$\text{minimize } \frac{1}{2}\lVert w \rVert^2 \quad \text{subject to } \lvert y_i - \langle w, x_i \rangle - b \rvert \le \varepsilon \quad (4)$$

It is assumed that the function $f$ estimates all of the pairs $(x_i, y_i)$ with precision $\varepsilon$ through the use of Equation (4). If the data cannot be placed into the margin, the slack variables $\xi_i$ and $\xi_i^*$ can be used to solve the problem via Equation (5), given here in its standard form with a penalty constant $C > 0$:

$$\text{minimize } \frac{1}{2}\lVert w \rVert^2 + C \sum_{i} (\xi_i + \xi_i^*) \quad \text{subject to } y_i - \langle w, x_i \rangle - b \le \varepsilon + \xi_i,\; \langle w, x_i \rangle + b - y_i \le \varepsilon + \xi_i^*,\; \xi_i, \xi_i^* \ge 0 \quad (5)$$

Figure 6 graphically depicts the linear SVR. Generally, four types of kernels are used in SVR models, namely linear, polynomial, radial basis function (RBF), and sigmoid. The mathematical representation of each kernel is given in Equation (6) [40]:

$$K(X_i, X_j) = \begin{cases} X_i^{T} X_j & \text{linear} \\ (\gamma X_i^{T} X_j + C)^{d} & \text{polynomial} \\ \exp(-\gamma \lVert X_i - X_j \rVert^2) & \text{RBF} \\ \tanh(\gamma X_i^{T} X_j + C) & \text{sigmoid} \end{cases} \quad (6)$$

where $K(X_i, X_j)$ represents the kernel functions, and $\gamma$ and $d$ denote the kernel width and the power exponent, respectively. Here, $C$ is the kernel parameter (a constant term in the polynomial and sigmoid kernels, not the penalty constant of Equation (5)). In this study, the RBF was used. The RBF is the most popular kernel type used in SVR because of its localized and finite responses across the entire range of the real x-axis [41].
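Two ingredients of SVR with an RBF kernel can be written down directly: the kernel itself, in its usual form $\exp(-\gamma \lVert x_i - x_j \rVert^2)$, and the $\varepsilon$-insensitive loss, which ignores errors smaller than the margin of tolerance. The sketch below illustrates only these two functions under those standard definitions; it does not solve the SVR optimization problem, and the function names and numerical values are hypothetical.

```python
import math

def rbf_kernel(x_i, x_j, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance between x_i and x_j)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * sq_dist)

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the tolerance band of half-width epsilon, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Hypothetical feature vectors (scaled wavelength, scaled phase velocity).
similarity_same = rbf_kernel([0.2, 0.7], [0.2, 0.7], gamma=0.5)
similarity_far = rbf_kernel([0.2, 0.7], [2.2, 3.7], gamma=0.5)

# Hypothetical shear wave velocities in m/s with a tolerance of 10 m/s.
loss_inside = epsilon_insensitive_loss(200.0, 205.0, epsilon=10.0)
loss_outside = epsilon_insensitive_loss(200.0, 220.0, epsilon=10.0)
```

The kernel value is 1 for identical points and decays toward 0 as the points move apart, which is the localized response referred to in the text.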

Linear Regression (LR)
Linear regression (LR) is the simplest ML algorithm, and it performs only regression tasks. LR predicts the value of a dependent variable based on given independent variables. The algorithm creates a linear relationship between input and output. The model finds the line of best fit based on the minimum error between the predicted and observed values. The hypothesis function for LR is given in Equation (7):

$$Y = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \ldots + \theta_n x_n \quad (7)$$

where $Y$ is the predicted value, $\theta_0$ is the bias term, $\theta_1, \ldots, \theta_n$ are the model parameters, and $x_1, x_2, \ldots, x_n$ are the feature values. The aforementioned hypothesis can also be represented by Equations (8) and (9), in which $f_i(\theta)$ is the hypothesis function and $\epsilon_i$ is noise. An intercept term is added by appending a column of 1s to the features. Regularization is often required in order to prevent overfitting by penalizing models with extreme parameter values. The LR model supports $L_1$ and $L_2$ regularization, which are added to the loss function [42]. The $L_1$ norm is the sum of the absolute values of the vector, and the $L_2$ norm is the square root of the sum of the squared vector values. One of the main mysteries revealed by deep learning methodology is the concept of benign overfitting: deep neural networks tend to predict well, even with a perfect fit to noisy training data. LR has been used to predict shear wave velocity at 30 m by means of a 95% confidence interval [43]. Therefore, LR is considered in this study, albeit a simple ML algorithm.
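For a single feature, the line of best fit has a closed-form least-squares solution, which makes the idea easy to check by hand. The sketch below assumes ordinary least squares without regularization and one input variable; the function name and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = theta_0 + theta_1 * x with one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    theta_1 = sxy / sxx                   # slope
    theta_0 = mean_y - theta_1 * mean_x   # intercept (bias term)
    return theta_0, theta_1

# Hypothetical data lying exactly on y = 1 + 2x.
theta_0, theta_1 = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
prediction_at_4 = theta_0 + theta_1 * 4.0
```

With several features the same minimization is solved in matrix form, and an $L_1$ or $L_2$ penalty on the parameters can be added to the squared-error loss.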

K-Fold Cross-Validation
The primary goal of cross-validation is to avoid bias in the selection of training and testing data. The process estimates the performance of ML algorithms on unseen data, that is, data not used during training. In k-fold cross-validation, the parameter k refers to the number of groups (folds) into which a given data sample is split, as shown in Figure 7. Each dataset is split into 80% for training and 20% for testing purposes.
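The splitting step can be sketched as an index generator. The example uses k = 5, which is an assumption made here because five folds reproduce an 80%/20% split in every round; the function name and sample count are hypothetical.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k nearly equal groups
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# With 50 samples and k = 5, each round trains on 40 samples and tests on 10.
splits = list(k_fold_indices(50, 5))
```

Every sample appears in exactly one test fold, so the average score over the k rounds uses each observation once as unseen data.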

Methods and Materials
In the SASW method, Rayleigh waves are generated by any impact source and detected by two receivers before then being recorded by a spectrum analyzer, as shown in Figure 8. Rayleigh wave data were collected and transformed into the frequency domain through the use of a dynamic signal analyzer [44].
The seismic experiments (SASW) were carried out at 50 different locations in Peninsular Malaysia. The SASW data come from a broad collection of geological and geotechnical site conditions, ranging from recent alluvium to granitic and metasedimentary formations, with geotechnical consistency from soft/loose to hard/dense soil conditions. The dispersion curves of all 50 corresponding sites can be found in the appendices. During the tests, each geophone was placed 0.5 m from the test point, so as to maintain a 1 m distance between the geophones. A rubber hammer, a geological hammer, a sledgehammer, and a 10 kg weight were used to hit the soil surface in order to produce a transient vertical impact. Sources were placed at 1, 2, 4, 8, and 10 m from the first geophone, as shown in Figure 9. Thereafter, the data obtained from the field were analyzed using WinSASW 3.2.12. The depth was taken up to 6 m in dataset preparation for ML algorithm training and testing. The range of wavelengths of the dispersion curve needs to include wavelengths short enough to sample shallow material and long enough to penetrate deep layers. A single source-receiver setup is not sufficient to determine phase velocities over the wide range of wavelengths required to evaluate a soil site with a depth of, for example, 30 ft (9 m) to 60 ft (18.3 m). Therefore, several measurement setups comprising different receiver spacings are used for this purpose. Figure 10 summarizes the SASW data analysis and the implementation of ML within the SASW method. The first task in analyzing the field measurements is the construction of an experimental dispersion curve. To accomplish this task, the phase spectrum must be unwrapped, which in turn requires an interpretation of the phase spectrum. The technique used to interpret the phase spectrum is called masking.
The unwrapped phase spectrum and the receiver spacing determine the experimental dispersion curve by means of Equation (10):

$$V_{ph} = \frac{2\pi f d}{\phi} \quad (10)$$

where $V_{ph}$ is the phase velocity, $\phi$ is the phase difference (in radians) between the two receivers for the wave traveling with the frequency of $f$, and $d$ is the distance between the two receivers. The material properties and thicknesses of the soil profiles were assumed in order to obtain the theoretical dispersion curve. The soil was assumed to comprise 12 layers of 0.5 m thickness, with a bulk density of 1800 kg m⁻³, a Poisson's ratio of 0.3333, and a damping ratio of 0.02. The theoretical dispersion curve was generated based on these assumed parameters. Thereafter, the theoretical dispersion curve and the experimental dispersion curve obtained from the field were compared.
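The conversion from unwrapped phase to phase velocity can be sketched with the commonly used relation $V_{ph} = 2\pi f d / \phi$, with the phase difference $\phi$ expressed in radians (for a phase in degrees, $2\pi$ is replaced by 360). This is an illustration of that relation only, not the procedure implemented in WinSASW; the function names and the example values are hypothetical.

```python
import math

def phase_velocity(frequency_hz, phase_rad, spacing_m):
    """Phase velocity (m/s) from unwrapped phase difference (rad) and receiver spacing (m)."""
    if phase_rad <= 0.0:
        raise ValueError("unwrapped phase difference must be positive")
    return 2.0 * math.pi * frequency_hz * spacing_m / phase_rad

def wavelength(frequency_hz, phase_rad, spacing_m):
    """Wavelength (m) as phase velocity divided by frequency."""
    return phase_velocity(frequency_hz, phase_rad, spacing_m) / frequency_hz

# Hypothetical reading: one full cycle of phase difference at 100 Hz, 1 m spacing.
v_example = phase_velocity(100.0, 2.0 * math.pi, 1.0)
lambda_example = wavelength(100.0, 2.0 * math.pi, 1.0)
```

A phase difference of one full cycle means that exactly one wavelength fits between the receivers, so the wavelength equals the 1 m spacing and the phase velocity is 100 m/s.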
The triangular part of Figure 10 is explained in detail in Figure 11. The parameters considered in the development of the ML models were the shear wave velocity ($V_S$), thickness, wavelength, and phase velocity of each layer. The theoretical dispersion curves from 50 tests were taken as input. A total of 4782 data points were obtained from the 50 dispersion curves. Subsequently, all of the measurements obtained from the dispersion curves were divided into 12 layers covering 6 m of depth at 0.5 m intervals, as shown in Figure 11. The wavelengths were converted to an equivalent depth of $\frac{1}{3}\lambda_R$. For every 0.5 m of depth, $D$, the wavelength, $\lambda_R$, was calculated using Equation (11):

$$\lambda_R = 3D \quad (11)$$

Figure 11. Dataset preparation for ML algorithm.
As shown in Figure 11, wavelengths from 1.5-3 m belong to the 0.5-1 m depth layer. Similarly, wavelengths from 3-4.5 m belong to the 1-1.5 m layer, and wavelengths from 16.5-18 m belong to the 5.5-6 m layer. Afterwards, all of the dispersion curves from the 50 SASW tests were combined and divided into these 12 layers at 0.5 m intervals. For training, the wavelength and the phase velocity corresponding to each layer were taken as input. For instance, the wavelengths from 0-1.5 m were taken as input, together with the shear wave velocities at 0-0.5 m depth. The theoretical dispersion curves, along with the soil profiles from the database, were used to train the ML algorithms. The proportions of the training and testing datasets were 80% and 20%, respectively. The intention was to generate the most extensive, theoretically possible input range.
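The layer assignment described above can be sketched as a simple binning step: each dispersion point is placed in the 0.5 m layer that contains one third of its wavelength. The sketch assumes that a wavelength on a layer boundary belongs to the deeper layer and that points deeper than 6 m are discarded; the function names and the sample points are hypothetical.

```python
def layer_index(wavelength_m, layer_thickness_m=0.5, n_layers=12,
                wavelength_to_depth_ratio=3.0):
    """Return the 0-based layer index for a wavelength, or None if below the profile."""
    depth = wavelength_m / wavelength_to_depth_ratio  # equivalent depth = wavelength / 3
    index = int(depth // layer_thickness_m)
    return index if 0 <= index < n_layers else None

def bin_dispersion_points(points, **kwargs):
    """Group (wavelength, phase_velocity) pairs by layer index."""
    layers = {}
    for wavelength_m, velocity in points:
        idx = layer_index(wavelength_m, **kwargs)
        if idx is not None:
            layers.setdefault(idx, []).append((wavelength_m, velocity))
    return layers

# Hypothetical dispersion points: (wavelength in m, phase velocity in m/s).
sample_points = [(1.0, 120.0), (2.0, 135.0), (4.0, 150.0), (17.0, 310.0), (20.0, 330.0)]
binned = bin_dispersion_points(sample_points)
```

With these defaults a 2 m wavelength falls in layer 1 (0.5-1 m depth) and a 17 m wavelength falls in layer 11 (5.5-6 m depth), which matches the ranges quoted in the text; the 20 m wavelength is dropped because its equivalent depth exceeds 6 m.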
The number of observations obtained from the dispersion curve at different layers is shown in Table 2. The deepest resolvable layer is related to the penetration depth of the surface wave. The penetration depth is controlled by the frequencies generated by the source on the surface and by the stiffness structure of the subsurface. The surface wave needs to contain a suitable range of low frequencies in order to propagate deep enough to capture the deep layers. Generally, hammer impact sources are limited in low-frequency generation in comparison to higher-frequency generation. Therefore, the number of data points obtained from the field decreases with depth. To evaluate the $V_S$ profile from the SASW measurements, dispersion curves must be determined. To measure the performance of the model, the coefficient of determination ($R^2$) and RMSE were used. Equation (12) is employed to determine the RMSE in the ML algorithms:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (P_i - A_i)^2} \quad (12)$$

where $P_i$ is the predicted shear wave velocity, $A_i$ is the shear wave velocity from conventional SASW inversion, and $N$ is the sample size. The parameters listed in Table 3 were used for the ML algorithms.
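Both performance measures are short to compute. The sketch below takes RMSE as the root of the mean squared difference between predicted and conventionally inverted shear wave velocities, and assumes the common definition $R^2 = 1 - SS_{res}/SS_{tot}$; whether this study used that definition or the squared correlation coefficient is not stated, so the second function is an assumption. Function names and values are hypothetical.

```python
import math

def rmse(predicted, actual):
    """Root mean square error between predicted and reference values."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

def r_squared(predicted, actual):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical shear wave velocities in m/s for three layers.
vs_inversion = [100.0, 200.0, 300.0]   # conventional SASW inversion
vs_predicted = [110.0, 190.0, 310.0]   # ML prediction
example_rmse = rmse(vs_predicted, vs_inversion)
example_r2 = r_squared(vs_predicted, vs_inversion)
```

RMSE is expressed in the units of the velocity itself, whereas $R^2$ is dimensionless, which is why the two measures are reported together.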

Distribution of Datasets
An important first step before applying any ML approach is to determine the underlying distribution of the data. In an ML context, the input data for processing comprise only numerical values. A histogram is a visual representation of the dataset distribution that shows any outliers or gaps in the data. ML performance depends on the distribution of the datasets. If the data distribution is not normal, then it can be skewed to the left or right or be completely random. Whether or not the shear wave velocity obtained from the field is normally distributed is discussed below.
After analyzing Figure 12a-l, it can be concluded that all of the datasets after 3 m are positively skewed. Moreover, the layers after 3 m of depth are contaminated with outliers. Another observation from Table 2 is that the number of observations decreases with increasing depth. However, a wider range of data values with increasing depth can be observed in the histograms. Figure 12a [...]

The histogram of the $V_S$ at 3-3.5 m depth is shown in Figure 12g. These datasets are right, or positively, skewed. They also contain a significant number of outliers. The majority of the data can be found in the 206.65-251.24 m s⁻¹ range, and the outliers are found from 429.6 to 697.14 m s⁻¹. The histogram of $V_S$ at 3.5-4 m depth shows a similar pattern to the histogram at 3-3.5 m. The distribution of the datasets is also right skewed, as shown in Figure 12h. These datasets are contaminated with outliers detected from 322.59-697.14 m s⁻¹. Figure 12i [...] Table 2). Real-world data are rarely normally distributed. In skewed data, the tail region may act as an outlier for a statistical model. Moreover, the outliers have an adverse effect on the model performance, especially on the regression-based models.

[...] Figure 13a. Figure 13b shows high similarity between SASW and SVR. However, a significant difference can be noticed over the depth interval of 1-2 m (up to 100 m s⁻¹).

SVR makes quite a good prediction for test 9, as analyzed in Figure 17a. Furthermore, SVR predicts the $V_S$ profile quite well, except for the second layer (1-1.5 m) and seventh layer (3.5-4 m) for test 10, as illustrated in Figure 17b. It is apparent from these figures that SVR has outstanding performance in comparison to the other algorithms in predicting shear wave velocities. A few drifts have occurred because there were insufficient data points to train in those layers.
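Alongside a histogram, skewness can be quantified with the moment coefficient of skewness, which is zero for a symmetric sample and positive when the right tail is longer. The sketch below uses the population (biased) form of that coefficient as an assumption; the function name and the sample values are hypothetical and are not the field data of this study.

```python
def sample_skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    if m2 == 0.0:
        raise ValueError("skewness is undefined for a constant sample")
    return m3 / m2 ** 1.5

# Hypothetical shear wave velocities in m/s: one symmetric sample and one
# sample with a single high value in the right tail.
symmetric_sample = [180.0, 200.0, 220.0]
right_tailed_sample = [200.0, 210.0, 220.0, 230.0, 600.0]
skew_symmetric = sample_skewness(symmetric_sample)
skew_right = sample_skewness(right_tailed_sample)
```

A clearly positive coefficient indicates the same right skew that appears as a long upper tail in a histogram.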
A comparison of the experimental data and the theoretical dispersion curve corresponding to the SASW profile is provided in the appendices in order to show what impacts this difference has on the effective theoretical dispersion curves that correspond to the shear wave velocity profiles from SASW and ML, and, subsequently, on how well they compare to the experimental data.

Comparative Analysis of Shear Wave Velocity for ML Algorithms
The confidence limit, percentage error, RMSE, and $R^2$ were measured for every layer (0-6 m depth) so as to produce a better evaluation of the ML algorithms. Table 4 shows the confidence limit and the percentage error associated with the RMSE of composite values for the 10 test cases. After 3 m depth, the confidence limit is lower than 80% for MLP, which is not reliable for design purposes [45]. The limit is found to be greater than 85% for SVR from 0-6 m depth. A detailed guideline on target confidence levels was provided by Lorig and Stacey [46]. Among all of the algorithms, SVR shows a better prediction, as the confidence limit is greater than 80%. Similarly, RF performs better, except for the last layer of depth. The highest percentage error is less than 15% for SVR, whereas it is greater than 20% for all of the other algorithms.

[...] Figure 12) after 3 m of depth may cause a sudden increase of RMSE in the ML model. $R^2$ assesses how strong the linear relationship is between the conventional SASW and the ML algorithms. Figure 19 represents the $R^2$ of composite values for the 10 test cases. The $R^2$ of RF, MLP, SVR, and LR is 0.98, 0.95, 0.97, and 0.95, respectively, for the 0-0.5 m layer, which is the highest value of $R^2$. The lowest value is found to be 0.72 at 2-2.5 m depth for RF, 0.55 at 2-2.5 m depth for SVR, 0.52 at 2-2.5 m depth for MLP, and 0.55 at 2-2.5 m for LR. The $R^2$ value of LR also decreases with depth, but the value increases marginally at 4-4.5 m and then begins to decline again. The values of mean square error (MSE), RMSE, and $R^2$ are given in Table 5. Figure 20a-l shows the $R^2$ of composite values for the 10 test cases between SVR and SASW. In the first layer (0-0.5 m), the $R^2$ between SVR and SASW is high. The second layer (0.5-1 m) also reveals a strong association. The association between SVR and SASW is fairly high for the third layer (1-1.5 m) and the fourth layer (1.5-2 m).
The $R^2$ value is 0.85, which is a significant value, at a depth of 3-3.5 m, including a few outliers, as seen in Figure 20g. The coefficient of determination increases significantly to 0.92 and 0.93 after 4 m, which represents a good association between the shear wave velocity obtained from SVR and SASW, as seen in Figure 20i,j. At these layers, there are very few data points, and the distribution of the datasets can create a greater RMSE than in the other layers. Although an $R^2$ close to 1 indicates a reasonable match, this value alone does not reveal whether the data points or predictions are biased. The performance also depends on the outliers and on how close the regression line is to the points.
It can be concluded that the performance of SVR is better than that of the other algorithms when it comes to predicting the shear wave velocity. The other algorithms have high RMSE after 3 m of depth. Indeed, some outliers can be observed after 3 m of depth. The range of the datasets is broad, but the number of data points obtained from SASW is low at this depth, which may affect the ML models.

Conclusions
Automation of SASW inversion was proposed in this study by adopting four ML algorithms, namely MLP, RF, SVR, and LR. This research can potentially contribute by automating the inversion procedure, which will allow us to evaluate the soil stiffness properties conveniently and rapidly. The SVR algorithm shows the lowest range of RMSE, which indicates a better model performance, while LR shows the highest RMSE among all of the algorithms. After a depth of 3 m, the distribution of the data causes RMSE to increase abruptly. The $R^2$ for SVR is found to be higher than for the other algorithms. The confidence limit for MLP and LR after 3 m is lower than 80% and, thus, is considered not reliable for design purposes. The SVR and RF algorithms are better predictors than the other algorithms, as the confidence limit is greater than 80% up to 6 m of depth. RF provides instantaneous output analysis, as no hyperparameter tuning is required. MLP provides better performance after 3 m for a few tests only. Among all algorithms, SVR shows the potential to be used as an alternative in estimating the shear wave velocity profile of soil for a given experimental dispersion curve.
For future study, it is recommended to train ML algorithms with more training data. Moreover, a log transformation of the datasets can be performed beforehand so as to reduce the skewness of the data. In this study, the empirical relationship considered in dataset preparation was one-third of the wavelength as the equivalent depth. Another relationship, such as half of the wavelength, can be considered in the future. Furthermore, the ability of ML algorithms to simplify the inversion of SASW could be compared with inversion by global search algorithms.