Using a Data-Driven Approach to Predict Waves Generated by Gravity-Driven Mass Flows

When colossal gravity-driven mass flows enter a body of water, they may generate waves which can have destructive consequences on coastal areas. A number of empirical equations in the form of power functions of several dimensionless groups have been developed to predict wave characteristics. However, in some complex cases (for instance, when the mass striking the water is made up of varied slide materials), fitting an empirical equation with a fixed form to the experimental data may be problematic. In contrast to previous empirical equations that specified the mathematical operators in advance, we developed a purely data-driven approach which relies on datasets and does not need any assumptions about functional form or physical constraints. Experiments were carried out using Carbopol Ultrez 10 (a viscoplastic polymeric gel) and polymer–water balls. We selected an artificial neural network model as an example of a data-driven approach to predicting wave characteristics. We first validated the model by comparing it with best-fit empirical equations. Then, we applied the proposed model to two scenarios which run into difficulty when modeled using those empirical equations: (i) predicting wave features from subaerial landslide parameters at their initial stage (with the mass beginning to move down the slope) rather than from the parameters at impact; and (ii) predicting waves generated by different slide materials, specifically, viscoplastic slides, granular slides, and viscoplastic–granular mixtures. The method proposed here can easily be updated when new parameters or constraints are introduced into the model.


Introduction
When colossal gravity-driven mass flows enter a body of water, such as a sea, a lake, or a reservoir, they sometimes generate large waves. These events are particularly relevant in coastal areas and mountainous countries. Such waves occurred, for example, in Lituya Bay in 1958 [1] and in Vajont, Italy, in 1963 [2]. Predicting the characteristics of waves induced by subaerial landslides is of great importance for risk management in coastal areas [3].
Researchers have conducted experiments using physical models that try to reproduce the physical processes of impulse waves generated by subaerial landslides. They have simplified water geometry by using 2D flumes or 3D basins and idealized the sliding masses as rigid blocks [4][5][6][7][8], granular solids [9][10][11][12][13][14], or viscoplastic fluids [15,16]. Based on reliable experimental data, a number of empirical or semi-empirical equations have been established, either by combining regression techniques with dimensional analysis [11,17,18,19] or by a scaling analysis of governing equations [20,21]. Most equations to date have expressed wave characteristics as power functions of several slide parameters on impact, and some have occasionally involved an additive term [22]. One significant issue has emerged from previous research: on many occasions, empirical equations have fit their own experimental data well, but then exhibited large deviations from the datasets obtained by other teams, especially when different slide materials were involved [10,15,16,23]. The performance of the different equations on a given dataset remains uncertain. This uncertainty reflects the limitations of empirical equations with a given functional form. Heller and Spinneken (2013) developed generic empirical equations for blocks of various shapes [24]. They also discussed the data discrepancies between blocks and granular slides. In fact, none of the existing empirical equations can account for the full range of materials used in experiments. Applying empirical equations may be difficult when, for instance, the slide material involves different components. A typical example is the work of Tang et al. (2018), who conducted experiments using blocks, granular slides, and mixtures of blocks and granular slides [25]. Taking the viscoplastic-granular mixture as an example, the representative parameters of these two materials are the yield stress and grain diameter, respectively. Due to the current lack of understanding about how these two materials affect the underlying physics of the slide-water interaction, integrating these two parameters into one equation might be problematic if we have presumed a functional form for that equation in advance.
Another key issue is that all the existing empirical equations express wave characteristics from the parameters relating to the sliding masses on impact; none use the parameters related to the initial stage (i.e., when the mass is still on the slope and starts moving). Focusing on the parameters at impact makes it easier to control the variables and to provide a quantitative analysis; however, for engineering applications, there is a need to predict wave characteristics before the sliding has occurred. For example, in May 2009, a slight slope failure occurred on the Guopu bank of the Laxiwa reservoir, in China. Based on monitoring data, a faulted rock mass with an approximate volume of 3 × 10$^7$ m$^3$ showed signs of general displacement [26]. Although the probability is very small, should the mass drop into the reservoir, it would generate large waves which may well destroy the nearby arch dam [27]. In this situation, estimating the characteristics of the potential waves from information on the potential landslide (which is still at rest on the slope) is more than warranted. To study the various physical processes from the initial impact to wave propagation, Heller et al. (2009) took a holistic approach based on a theoretical analysis and semi-empirical equations [17]. For more complex landslide materials, providing physical constraints on the mathematical operators of prediction equations becomes more challenging.
An approach that does not assume the functional form of the equation in advance and relies strictly on the data would be preferable for dealing with both of the above issues. To overcome the limitations of empirical equations, the present study presents a data-driven method, known as an artificial neural network (ANN) method, which has been successfully employed in other fields to cope with complicated parameters in experimental data processing and to develop highly accurate predictive models [28][29][30][31][32][33]. In contrast to empirical equations, in which the mathematical dependence is fixed in advance, the ANN method provides an approach in which both the explanatory and explained variables in the data ultimately define their internal relationship without any prior assumptions about the equation's functional form or physical constraints. Moreover, the model can be easily calibrated when new data or parameters become available, which makes it powerful in solving complex problems [34]. Panizzo et al. (2005) compared the ANN method and empirical equations on a simple case (that is, predicting wave characteristics from solid block parameters on impact). The ANN method's predictive capacities were slightly better than those of empirical equations [35]. To the best of our knowledge, no data-driven method has been used to deal with field data. The key advantage of data-driven methods, namely, their high adaptivity to solving complex problems and dealing with complex parameters, has not been further investigated.
Using the ANN method, we (i) estimate the wave characteristics from the parameters of a subaerial mass at the initial stage, when it is at rest and starts moving down the slope, and (ii) predict the wave characteristics generated by different slide mass materials (specifically, viscoplastic slides, granular slides, and mixtures of them), all within one model. For each application, we refined the inputs, outputs and network structures of the model.

Figure 1 illustrates a physical model of a mass flow moving down a slope and intruding into a body of water. The whole process can be divided into three stages: in stage I, the slide is at rest, in the container box, and then starts moving; in stage II it moves down the slope and reaches the shoreline; in stage III, it enters the body of water and generates waves. We consider a slope with an inclination of $\theta$ entering a horizontal flume filled with water. The still-water depth is denoted by $h_0$, and the water density is denoted by $\rho_w$. We defined two coordinate systems. The first coordinate system $(x, y)$ is defined with its origin located at the shoreline, with the $x$-axis proceeding out across the water, stream-wise, and the $y$-axis pointing directly upward. The second coordinate system $(s, l)$ is defined with the $l$-axis being along the slope and the $s$-axis being perpendicular to the slope. A slide mass, with a volume of $V_I$ and density of $\rho_s$, is released at a distance $l_s$ from the shoreline. The slide's initial shape is idealized as a rectangle with a height of $s_0$ and length of $l_0$. When the sliding mass moves down the slope, its thickness $s(l, t)$ and depth-averaged velocity $v_s(l, t)$ vary as functions of $l$ and $t$. The volume of the immersed slide is denoted by $V_s$. The free water surface $\eta(x, t)$ depends on the horizontal coordinate $x$ and time $t$. The wave created by the incursion of the sliding mass is evaluated quantitatively by its height $h$ and amplitude $a$. The gravitational acceleration is denoted by $g$.

Experimental Method
Experiments were conducted in a two-dimensional flume at the Swiss Federal Institute of Technology Lausanne (see Figure 2). The experimental facility was devised to mimic snow avalanches penetrating mountain lakes (for further information see [21]). The scale factor between the real world and this facility was approximately 100. The flume consisted of two parts. The first part was a 1.5 m long and 0.12 m wide chute, and it could be tilted at an angle $\theta$ ranging from 30° to 50°. Its bottom was lined with sandpaper to provide consistent basal friction and its side walls were made of PVC. The second part was a water-filled, transparent glass flume, 2.5 m long, 0.4 m deep, and 0.12 m wide. The slide mass material was initially contained in a box located at the chute entrance, closed off by a 0.4 m high and 0.12 m wide locked gate. The gate was pneumatically activated and could be opened in less than 0.1 s to release the material from the box. The distance from the gate to the shoreline could be varied from 0.5 m to 1.0 m. Once the slide mass material was released, it accelerated energetically, under gravity, and reached velocities as high as 2.5 m/s. Each experiment's initial settings, including slide mass volume $V_I$, initial slide length $l_0$, initial slide height $s_0$, slope length $l_s$, still-water depth $h_0$, and slope angle $\theta$, were recorded before the slide mass material was released. Because of its reduced dimensions, the set-up was also subject to scale effects due to surface tension and viscosity, which could have affected wave propagation when the still-water depth $h_0 < 0.2$ m and wave period $T < 0.35$ s [36]. As $h_0 = 0.2$ m and 0.38 s $< T <$ 2.24 s in our experiments, we think such scale effects were not significant.
We selected Carbopol Ultrez 10 viscoplastic material to mimic cohesive landslides, whose rheological behavior can be described using the Herschel-Bulkley model:

$$\tau = \tau_c + K \dot{\gamma}^n \qquad (1)$$

where $\tau$ is the shear stress, $\tau_c$ is the yield stress, $\dot{\gamma}$ is the shear rate, $K$ is the slide mass consistency, and $n$ is a power-law index that reflects shear thinning when $n < 1$ (or shear thickening when $n > 1$). The rheological measurements of Carbopol were conducted using a Bohlin Gemini rheometer equipped with striated parallel plates (40 mm diameter; 1 mm gap size). The values of $\tau_c$, $K$ and $n$ in the Herschel-Bulkley equation were fitted to the rheological measurements. Table 1 shows how the rheological parameters of Carbopol depend on its concentration $C$ and the proportion of NaOH to Ultrez 10 in the composite. See [37] for the Carbopol Ultrez 10 preparation procedure.
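As a small illustration of the Herschel-Bulkley relation $\tau = \tau_c + K \dot{\gamma}^n$ for a yielded material, the sketch below evaluates the shear stress at a given shear rate. The function name is hypothetical, and the values of `K` and `n` are placeholders chosen only for illustration; they are not the fitted values of Table 1.

```python
def herschel_bulkley_stress(shear_rate, tau_c, K, n):
    """Shear stress (Pa) of a yielded Herschel-Bulkley fluid: tau_c + K * shear_rate**n."""
    if shear_rate < 0:
        raise ValueError("shear_rate must be non-negative")
    return tau_c + K * shear_rate ** n


# tau_c = 62 Pa is one of the yield stresses used in the experiments;
# K = 30 Pa.s^n and n = 0.4 are placeholder values (assumptions).
tau = herschel_bulkley_stress(shear_rate=10.0, tau_c=62.0, K=30.0, n=0.4)
```

At zero shear rate the function returns the yield stress itself, which is the threshold below which the material does not flow.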
We used polymer-water balls to represent granular avalanches. These were produced by soaking dry, water-absorbent beads in water for 4-5 h. Both Carbopol and the polymer-water balls have a density very close to that of water (1000 kg·m$^{-3}$), which is also similar to that of the ice (910 kg·m$^{-3}$) mobilized in snow or ice avalanches. Taking advantage of the similar densities of Carbopol and polymer-water balls, we were able to investigate how mixtures of cohesive and granular materials generated waves without having to consider the effects of the densities of the varying proportions of each material in the mixtures. Due to the difficulties in finding materials with matching higher densities, the question of how density and mixture proportions interact during wave formation could not be investigated in the current study.

A high-speed camera was placed in front of the shoreline, with its optical axis perpendicular to the sidewall. The camera collected images at a frequency of 200 frames per second, acquiring 600 × 800-pixel images, corresponding to an observation window of 48 × 64 cm$^2$. We used a 0.2 × 0.4 m$^2$ mesh grid to calibrate the raw images and determine the size conversion factor. From the images, we measured (a) the free-water surface when the leading wave reached its maximum height, from which we deduced the maximum wave amplitude $a_m$ and maximum wave height $h_m$, (b) the velocity $v_s$ and thickness $s$ of the sliding mass upon impact, and (c) the volume of the underwater part of the sliding mass $V_s$.

The Artificial Neural Network Method
The ANN method is inspired by how the human brain processes information, and it is constructed from interconnected processing elements called neurons [38] (see Figure 3). ANNs are receiving ever greater attention because of their ability to express complex functions in a flexible form. A typical ANN model consists of three main parts: learning rules, network architecture, and an activation function. The network structure is formed of several layers: one input layer, one output layer, and one or several hidden layers, with each layer containing several neurons. Each of the neurons in a layer is connected to neurons of the adjacent layers via coefficients called weightings.
From a mathematical perspective, the principle of neural networks involves the composition of non-linear functions. Starting with a linear model, considering a dataset $z$ and a vector of inputs $x$, a linear model for the output $\hat{z}(x)$ can be constructed as $\hat{z}(x) = Wx + \beta$, where the weighting matrix $W$ and the bias vector $\beta$ are obtained by solving an optimization problem that minimizes the overall difference between $z$ and $\hat{z}$. This process is called model training. Such a simple model may lack the flexibility to represent complex functional mapping and, therefore, intermediate variables (layers) $y$ are introduced: $y = \sigma(W^{(1)} x + \beta^{(1)})$ and $\hat{z} = W^{(2)} y + \beta^{(2)}$, where $\sigma$ is a user-specified activation function, like the hyperbolic tangent. The composition of several intermediate layers results in a neural network capable of efficiently representing arbitrarily complex function forms.
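The composition of an affine map, an activation function, and a second affine map can be sketched in a few lines of plain Python. The layer sizes and weight values below are hypothetical and serve only to show the structure; in practice the weights would come from model training.

```python
import math


def affine(W, x, b):
    """Return W x + b for a matrix W (list of rows), an input vector x, and a bias vector b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def forward(x, W1, b1, W2, b2):
    """One hidden layer: y = tanh(W1 x + b1), then z_hat = W2 y + b2."""
    y = [math.tanh(v) for v in affine(W1, x, b1)]
    return affine(W2, y, b2)


# Hypothetical weights: 2 inputs -> 3 hidden neurons -> 1 output.
W1 = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]
b1 = [0.0, 0.1, -0.1]
W2 = [[1.0, -1.0, 0.5]]
b2 = [0.2]
z_hat = forward([0.3, 0.7], W1, b1, W2, b2)
```

Removing the `tanh` call would collapse the two layers into a single linear model, which is why the non-linear activation is what gives the network its flexibility.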
In this study, we selected a one-hidden-layer network, as an example, and adopted a back-propagation algorithm to train the network. The algorithm was implemented in Matlab. Establishing an ANN model consists of three steps: (i) preparing the required data for training the network; (ii) evaluating neural networks with different structures and choosing the optimal one; and (iii) testing the neural network's performance using data which have not been used previously for training the network. The back-propagation artificial neural network algorithm (BP-ANN) consists of two paths: the feed-forward and the feed-backward paths. The feed-forward path is expressed by Equations (2) and (3).
where $x_i$, $y_j$, and $Z_k$ represent the input, hidden, and output layers, respectively, $W_{oj}$ and $W_{ok}$ are the bias weights for setting the threshold values, $X_j$ and $Y_k$ temporarily represent computing results before using the activation function, and $F$ is the activation function applied in the hidden and output layers. For the activation function, we chose the sigmoid function, which ranges between 0 and 1 (see Equation (4)). The activation function is defined on each layer's neurons and is applied to the sum of the weighted inputs and to each neuron's bias to generate the neuron output.
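The displayed forms of Equations (2)-(4) are not reproduced in this text. For orientation only, a standard feed-forward pass that is consistent with the symbol definitions given here would read as follows; the weight symbols $w_{ij}$ (input to hidden) and $w_{jk}$ (hidden to output) match those used in the weight-update discussion, and the exact layout of the original equations is an assumption.

```latex
\begin{align}
X_j &= \sum_i w_{ij}\, x_i + W_{oj}, & y_j &= F(X_j), \\
Y_k &= \sum_j w_{jk}\, y_j + W_{ok}, & Z_k &= F(Y_k), \\
F(u) &= \frac{1}{1 + e^{-u}} .
\end{align}
```

The sigmoid $F$ maps any real argument into the open interval $(0, 1)$, which is why the data are normalized to that range before training.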
Equation (5) displays the residual function for residual back-propagation training.
where $t_k$ is the predefined target value and $e_k$ is the residual of each output node. $E$ is the residual between the expected and actual output values. We used a gradient-descent strategy to adjust the weightings, aiming to obtain a minimum $E$. Equations (6)-(9) express the weightings between the hidden and output layers.
Therefore, the weighting adjustment in the link between the hidden and output layers, $\Delta w_{jk}$, can be expressed by Equation (8).
where $\eta$ is the learning rate ranging between 0 and 1. With a lower learning rate, the network model will take a longer time to converge. Conversely, a higher learning rate may lead to a widely oscillating network. In addition, maintaining a consistent learning rate across the model is preferable. The new weighting $w_{jk}$ is updated by Equation (9), where $r$ is the number of iterations.
$$w_{jk}(r + 1) = w_{jk}(r) + \Delta w_{jk}(r) \qquad (9)$$

Similarly, the error gradient in the links between the input and hidden layers can be derived from the partial derivative with respect to $w_{ij}$. The new weighting for the link between the input layer and the hidden layer is obtained from the adjustment $\delta w_{ij}$ as:

$$w_{ij}(r + 1) = w_{ij}(r) + \delta w_{ij}(r)$$

All the input data were normalized to the range between 0 and 1, with $X$ denoting the raw data and $Y$ the normalized data. The initial parameter settings are shown in Table 2.
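The training loop described here can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' Matlab implementation: it assumes the usual squared-error residual $E = \frac{1}{2}\sum_k (t_k - Z_k)^2$, sigmoid activations in both layers, the standard delta-rule gradient, and min-max scaling $Y = (X - X_{min})/(X_{max} - X_{min})$ for the normalization. The learning rate, layer sizes, and sample values are placeholders.

```python
import math
import random


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def min_max(values):
    """Scale raw values X into [0, 1] (assumed min-max normalization)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def forward(x, w_ih, b_h, w_ho, b_o):
    """Feed-forward pass: returns hidden outputs y and network outputs z."""
    y = [sigmoid(sum(w_ih[i][j] * x[i] for i in range(len(x))) + b_h[j])
         for j in range(len(b_h))]
    z = [sigmoid(sum(w_ho[j][k] * y[j] for j in range(len(y))) + b_o[k])
         for k in range(len(b_o))]
    return y, z


def train_step(x, t, w_ih, b_h, w_ho, b_o, eta):
    """One gradient-descent update; returns E evaluated before the update."""
    y, z = forward(x, w_ih, b_h, w_ho, b_o)
    # Output error signal: (t_k - z_k) * F'(Y_k), with F' = z (1 - z) for the sigmoid.
    d_o = [(t[k] - z[k]) * z[k] * (1.0 - z[k]) for k in range(len(z))]
    # Hidden error signal, back-propagated through the current hidden-output weights.
    d_h = [y[j] * (1.0 - y[j]) * sum(w_ho[j][k] * d_o[k] for k in range(len(z)))
           for j in range(len(y))]
    for j in range(len(y)):
        for k in range(len(z)):
            w_ho[j][k] += eta * d_o[k] * y[j]   # w_jk(r+1) = w_jk(r) + delta
    for k in range(len(z)):
        b_o[k] += eta * d_o[k]
    for i in range(len(x)):
        for j in range(len(y)):
            w_ih[i][j] += eta * d_h[j] * x[i]   # w_ij(r+1) = w_ij(r) + delta
    for j in range(len(y)):
        b_h[j] += eta * d_h[j]
    return 0.5 * sum((t[k] - z[k]) ** 2 for k in range(len(z)))


def demo(steps=200, eta=0.5, seed=0):
    """Train a 3-6-2 network on one made-up sample; return first and last E."""
    rng = random.Random(seed)
    n_in, n_hid, n_out = 3, 6, 2
    w_ih = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_in)]
    w_ho = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_hid)]
    b_h, b_o = [0.0] * n_hid, [0.0] * n_out
    x, t = [0.2, 0.5, 0.9], [0.3, 0.7]  # placeholder normalized sample
    errors = [train_step(x, t, w_ih, b_h, w_ho, b_o, eta) for _ in range(steps)]
    return errors[0], errors[-1]
```

Calling `demo()` returns the residual before the first and the last update, which gives a quick check that the updates move the weights in the descent direction.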

Results
In Section 4.1, we validate the ANN method by comparing its prediction accuracy against empirical equations, using the experimental data generated by the viscoplastic flow. In Section 4.2, we predict the wave characteristics from the slide mass features at rest and as it started moving (stage I in Figure 1). In Section 4.3, we develop an ANN model which aims to cope with the parameters of a landslide with complex properties, specifically, a mixture of cohesive and granular slide mass materials.
Each model's performance was evaluated by its coefficient of determination ($R^2$), mean square error (MSE), and its sum of squares due to error (SSE), where $y_{p,i}$ and $y_{o,i}$ are the predicted and observed data, respectively, $\bar{y}_o$ is the average of the observed data, and the index $i$ runs over the series of experimental data.
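The three indicators can be computed as in the sketch below, which uses their standard textbook definitions. Because the displayed equations are not reproduced in this text, the exact normalization used in the study (for instance, for the MSE) is an assumption; the function names are hypothetical.

```python
def sse(y_pred, y_obs):
    """Sum of squares due to error."""
    return sum((p - o) ** 2 for p, o in zip(y_pred, y_obs))


def mse(y_pred, y_obs):
    """Mean square error: SSE divided by the number of data series."""
    return sse(y_pred, y_obs) / len(y_obs)


def r_squared(y_pred, y_obs):
    """Coefficient of determination: 1 - SSE / total sum of squares."""
    mean_obs = sum(y_obs) / len(y_obs)
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)
    return 1.0 - sse(y_pred, y_obs) / ss_tot
```

A perfect prediction gives SSE = MSE = 0 and $R^2 = 1$; predicting the observed mean everywhere gives $R^2 = 0$.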

Model Validation
Most commonly used empirical equations for predicting waves generated by landslides are built from a set of dimensional parameters. Based on a dimensional analysis or a scale analysis, the scaled wave characteristics can be expressed as a function of several dimensionless groups:

$$X = f(\Pi_1, \Pi_2, \ldots, \Pi_N)$$

where $X$ represents the scaled wave characteristics (e.g., the scaled maximum wave amplitude, wave height, wave length, wave period), $\Pi_i$ indicates the explanatory variables selected, and $N$ is the number of explanatory variables.
The predictive equations developed by Zitti et al. [21] gave the best fit to our experimental data (see Equation (20)).
Using the same database and explanatory variables as Equation (21), we modeled the experimental data using our ANN method. Thus, the three neurons in the input layer and the two neurons in the output layer were:

• Three inputs: $\Pi_1$, $\Pi_2$, and $\Pi_3$
• Two outputs: $A_m$ and $H_m$

Of the 291 samples of Carbopol mass slides in the experimental database, 80% (233 samples) were selected as training data for model construction and 20% (58 samples) were saved as test data for model validation, providing an independent measure of ANN performance after training. Samples for each group were selected randomly.
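A random 80/20 split of this kind can be sketched as follows. The function name and the fixed seed are assumptions made for reproducibility of the example; the sample list is a stand-in for the 291 experimental records.

```python
import random


def split_train_test(samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and test subsets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * len(samples))
    train = [samples[i] for i in indices[:n_train]]
    test = [samples[i] for i in indices[n_train:]]
    return train, test


# Stand-in for the 291 Carbopol samples (here simply their indices).
train, test = split_train_test(list(range(291)))
```

With 291 samples, rounding 0.8 × 291 gives the 233 training and 58 test samples quoted in the text.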
We used a basic three-layer network structure, namely, one input layer, one hidden layer, and one output layer. To select the optimal number of neurons in the hidden layer, we varied the number of neurons, ran the program for each setting, and assessed performance by $R^2$. Each run was repeated five times and $R^2$ was calculated by eliminating the maximum and minimum coefficients of determination and averaging the results of the remaining three tests. As shown in Figure 4, the $R^2$ of both $H_m$ and $A_m$ reached their maximum values when the hidden layer contained six neurons. Thus, the optimum network for the present study was a three-six-two structure (input-hidden-output). Model training was constrained by the following indicators: the maximum epoch number was initially set to 100; the objective MSE was set to 1 × 10$^{-4}$; the minimum gradient was set to 1 × 10$^{-5}$; and the maximum number of validation fails, which represents the number of successive iterations in which the validation error fails to decrease, was initially set to six. Training would stop once one of the indicators mentioned above reached its preset limit; for instance, in the present study, training stopped when the number of validation fails reached six. Figure 5 illustrates the evolution of these indicators (i.e., gradient, validation fails, and MSE) at each epoch until the training is stopped.
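The selection rule for the hidden-layer size (five repeated runs, discard the best and worst $R^2$, average the remaining three) can be sketched as below. The $R^2$ values in the example are made up purely for illustration and are not results from the study.

```python
def trimmed_mean_r2(r2_runs):
    """Drop the largest and smallest R^2 of five runs and average the remaining three."""
    if len(r2_runs) != 5:
        raise ValueError("expected five repeated runs")
    kept = sorted(r2_runs)[1:-1]
    return sum(kept) / len(kept)


def best_hidden_size(r2_by_size):
    """Return the hidden-layer size whose trimmed-mean R^2 is largest."""
    return max(r2_by_size, key=lambda size: trimmed_mean_r2(r2_by_size[size]))


# Made-up R^2 values (assumption): keys are candidate numbers of hidden neurons.
example = {
    4: [0.90, 0.91, 0.89, 0.92, 0.88],
    6: [0.95, 0.96, 0.94, 0.97, 0.60],
    8: [0.93, 0.92, 0.94, 0.91, 0.95],
}
chosen = best_hidden_size(example)
```

Trimming makes the comparison robust to a single unlucky weight initialization, such as the 0.60 outlier in the six-neuron row of the example.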
In Figure 5c, the MSEs of the training data and the test data were computed separately. The curves of the evolution of the MSE for these three data series were very close, indicating the model's high level of adaptability. The best validation performance was an MSE of 0.00025337 at epoch 43, and the training terminated at epoch 48 as the number of validation fails reached six. The gradient was 0.0011736 at epoch 48. Figure 6 displays a histogram of the residuals between the predicted $A_m$ and the observed $A_m$. The probability density of the residuals approximately follows a Gaussian distribution.

In addition, the $R^2$ of $A_m$ was always slightly higher than that of $H_m$, in both models, which may result from measurement errors in the experiments, as described in our previous publications [15,16].

Prediction of Wave Characteristics from Initial Slide Parameters
Previously, empirical or semi-empirical equations determined wave characteristics from the slide mass features on impact (illustrated as stage II in Figure 1), and most equations were established in the form of the power-law equations of several dimensionless groups (see Equation (20)). When we predict the wave characteristics from the slide features at stage I, it is difficult to provide physical constraints on the mathematical structure of predictive equations because of the complex physical mechanisms involved in the whole process. In this case, assuming a functional form for the prediction equation in advance might be problematic. Therefore, a data-driven approach that relies strictly on the data rather than on a fixed-form equation is preferable, and the ANN method fits this requirement. The process involves the following parameters:

$$\eta(x, t) = \eta(\tau_c, K, n, l_0, s_0, l_s, h_0, \theta, \rho_w, \rho_s, t, g)$$

The slide mass's rheological parameters include $\tau_c$, $K$, and $n$. Although they have little effect on the slide mass-water interaction and wave formation [16], they have great effects on the slide mass flowing down the slope. The Pearson correlation coefficients between each pair of these three parameters were all above 0.9 (see Table 3), indicating that all three parameters correlated highly. We therefore selected the yield stress $\tau_c$, namely the stress at which the material starts yielding, to represent the rheological parameters.

Table 3. The Pearson correlation coefficients between $\tau_c$, $K$, and $n$.

Figure 8 provides a first insight into how the wave characteristics depend on the rheological properties of the slide mass and on its parameters at the initial stage. It shows experimental data with the yield stress set at $\tau_c$ = 41 Pa, 62 Pa, and 80 Pa. Overall, the maximum wave amplitude $a_m$ increased with rising yield stress $\tau_c$ and initial slide mass $m_I$, and decreased with slope length $l_s$.
Here, the ratio $l_*/h_*$ and $\varsigma = s_*/h_*$ are aspect ratios for the $l$-axis to the $y$-axis, and for the $s$-axis to the $y$-axis, respectively. The natural choice for defining the typical scales introduced by these ratios was to take the dimensions of the reservoir and of the initial slide: $l_* = l_0$, $h_* = h_0$, and $s_* = s_0$. The Bingham number can be expressed as $Bi = \tau_c / (\rho_s g s_0 \sin\theta)$, which is a dimensionless yield stress (relative to the viscous forces). We assumed that the viscoplastic flow reached a near-equilibrium regime, where viscous forces balanced gravitational forces, and the velocity scale was then $v_* = (\rho_s g \sin\theta / K)^{1/n} s_*^{1+1/n}$.
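The velocity scale of the near-equilibrium regime can be evaluated with a short helper. This is a sketch under one stated assumption: the thickness raised to the power $1 + 1/n$ is read as the thickness scale $s_*$ (taken equal to $s_0$), since the subscript is not legible in the source. The function name is hypothetical.

```python
import math


def velocity_scale(rho_s, theta_deg, K, n, s_star, g=9.81):
    """v* = (rho_s g sin(theta) / K)**(1/n) * s_star**(1 + 1/n), in m/s for SI inputs."""
    theta = math.radians(theta_deg)
    return (rho_s * g * math.sin(theta) / K) ** (1.0 / n) * s_star ** (1.0 + 1.0 / n)
```

A quick dimensional check supports the form: $\rho_s g \sin\theta / K$ has units of m$^{-1}$ s$^{-n}$, so its $1/n$ power times a length to the power $1 + 1/n$ yields m/s.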

Waves Generated by Viscoplastic-Granular Mixtures
Most studies have mimicked landslides in the real world by using a single slide mass material, including granular slides, viscoplastic materials, or solid blocks. However, many landslides in the natural world are mixtures of granular and viscoplastic materials. In the present study, we conducted experiments using mixtures of polymer-water balls and Carbopol, with the percentage of Carbopol in volume varying symmetrically (0%, 20%, 50%, 80% and 100%). Figure 9 shows raw images, captured by a high-speed camera, of Carbopol, polymer-water balls, and mixtures of them, entering the body of water. These represented landslides with different degrees of cohesion. As shown in Figure 10, larger waves are generated with higher proportions of Carbopol in the mixture, which implies that the slide mass material's composition influenced wave generation. Here, to provide identical criteria for all slide mass materials, we quantified the slide mass properties using a universal dimensionless group named the impulse product parameter $P$, which was proposed by [12] and in which $\Pi_1$, $\Pi_2$, and $\Pi_3$ denote the same parameters as in Equation (20).

One issue which should be noted is that the properties of granular slides are usually represented by their grain diameters, whereas the rheological behavior of viscoplastic materials is commonly described using yield stress. It is difficult to integrate these two parameters into one equation in the form of a power-law equation. To overcome this limitation and provide a compatible model for these parameters, we applied the ANN method so as to avoid assuming the functional form of a prediction equation. Here, we predicted the wave characteristics from the mixture's parameters on impact. As highlighted above, the dimensionless parameters in modeling experiments with a single material commonly involve the slide Froude number $\Pi_1$, the relative slide mass $\Pi_2$, and the relative slide thickness $\Pi_3$.
To quantify the properties of mixed viscoplastic and granular slides, we introduced the following dimensionless groups: the Bingham number $Bi = \tau_c / (\rho_s g s_0 \sin\theta)$, which represents the rheological properties of a cohesive material; the scaled diameter of the granular slide mass $D_s = d_g / h_0$, where $d_g$ is the diameter of a granular particle; the volume ratio of the viscoplastic material in the mixture, $R_V$, where $V_s$ is the volume of the viscoplastic slide mass and $V_g$ is the volume of the granular slides; and the density ratio between the two materials $R_\rho = \rho_s / \rho_g$, which is a constant in the present study. Hence, the input layer contained six neurons {$\Pi_1$, $\Pi_2$, $\Pi_3$, $Bi$, $D_s$, and $R_V$}, and the output layer again contained {$A_m$ and $H_m$}. Using the same method presented in Section 4.1, the number of hidden neurons was determined, and the network's optimum structure was six-eight-two. The $R^2$, MSE, and SSE of $A_m$ were 0.9325, 0.0072, and 0.2172, respectively. The $R^2$, MSE, and SSE of $H_m$ were 0.9173, 0.00178, and 0.6154, respectively. As the $R^2$ values of both $A_m$ and $H_m$ were greater than 0.8, the model can be considered valid. The predicted $A_m$ and $H_m$ are illustrated against the experimental data in Figure 11.
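Assembling the six-component input vector from measured quantities can be sketched as follows. The function names are hypothetical. Because the defining formula of $R_V$ is not legible in this text, the sketch takes $R_V$ as a precomputed argument rather than guessing its definition; $\Pi_1$, $\Pi_2$, and $\Pi_3$ are likewise passed in directly.

```python
import math


def bingham_number(tau_c, rho_s, s0, theta_deg, g=9.81):
    """Bi = tau_c / (rho_s g s0 sin(theta))."""
    return tau_c / (rho_s * g * s0 * math.sin(math.radians(theta_deg)))


def scaled_diameter(d_g, h0):
    """D_s = d_g / h0."""
    return d_g / h0


def density_ratio(rho_s, rho_g):
    """R_rho = rho_s / rho_g (constant in the study, so not a network input)."""
    return rho_s / rho_g


def mixture_inputs(pi1, pi2, pi3, tau_c, rho_s, s0, theta_deg, d_g, h0, r_v):
    """Return the input vector [Pi1, Pi2, Pi3, Bi, D_s, R_V] for the six-eight-two network."""
    return [
        pi1,
        pi2,
        pi3,
        bingham_number(tau_c, rho_s, s0, theta_deg),
        scaled_diameter(d_g, h0),
        r_v,
    ]
```

Since the density ratio is the same for every sample, feeding it to the network would add no information, which is consistent with its absence from the six inputs.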

Model Adaptability
In Sections 4.2 and 4.3, we presented two applications which were difficult to model using empirical equations with a fixed functional form:

• One application was predicting wave characteristics from slide mass features at the initial stage I. When doing this, it is difficult to provide physical constraints on the mathematical structure of predictive equations because of the complex physical mechanisms involved in the whole process. In this case, assuming a functional form for the predictive equation in advance might be problematic.
• Another application was predicting waves generated by viscoplastic-granular mixtures. The properties of granular slides are usually represented by their grain diameters, whereas the rheological behaviors of viscoplastic materials are commonly described using yield stress. It is difficult to integrate these two parameters into one equation in the form of a power-law equation.
Both scenarios could be handled by the ANN method with high prediction accuracy (see Table 4). This clearly demonstrates the advantage of using a purely data-driven method in terms of model adaptability (and this is not limited to an ANN method). In contrast to equations with fixed formulae, the ANN method has no external constraints, making it a scalable open system. In addition, it can be updated and is highly adaptable when new parameters become available or fresh constraints appear (these are not limited to the two scenarios presented in this study). With more informative, richer datasets, stronger correlations can be built from the input layer to the output layer.

Table 4 displays the coefficient of determination $R^2$, mean square error (MSE), and sum of squares due to error (SSE) values for each of the models presented in Section 4. The following features are worth noting:

• Compared with the empirical equations based on regression techniques, the ANN model gives more precise predictions. Using the same explanatory variables, the coefficient of determination $R^2$ improved from 0.9214 to 0.9682 for $A_m$, and from 0.9062 to 0.9479 for $H_m$. Of course, the improvement in prediction accuracy is not large.

• The prediction precision for $A_m$ was greater than for $H_m$ in predictions made with empirical equations and with the ANN models. This may be because the experimental measurement errors of wave heights $h_m$ were larger than those for wave amplitudes $a_m$. Prediction precision not only depends on the prediction performance of the model selected, but it also relies on experimental accuracy.

• The predictions of wave features from the parameters at impact were better than the predictions from the parameters at the initial stage. Also, prediction precision decreased when the dataset involved combinations of different slide mass materials. Thus, prediction precision decreased as experimental complexity increased and more parameters were involved.

Multicollinearity
Multicollinearity is a phenomenon where one explanatory variable in a multiple regression model can be linearly predicted from the others with a substantial degree of accuracy. This may lead to the problem that the multiple regression's coefficient estimates change erratically in response to small changes in the model. The natural logarithmic form of the empirical equation (Equation (20)) can be written as:

$$\ln X = \ln \delta + \alpha \ln \Pi_1 + \beta \ln \Pi_2 + \gamma \ln \Pi_3$$

The coefficients $\ln \delta$, $\alpha$, $\beta$, and $\gamma$ were estimated using the least squares (linear regression) method based on experimental data. As length [L] was scaled by the still-water depth $h_0$, $h_0$ appears in the three aggregated parameters $\Pi_1$, $\Pi_2$, and $\Pi_3$, and specifically, they are correlated with $h_0^{-1/2}$, $h_0^{-1}$, and $h_0^{-2}$, respectively. The high correlations among explanatory variables may result in multicollinearity during the linear regression. However, to date, none of the studies using empirical equations has discussed multicollinearity.
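To make the regression step concrete, the sketch below fits a power law $X = \delta\, \Pi_1^{\alpha} \Pi_2^{\beta} \Pi_3^{\gamma}$ by ordinary least squares on its logarithmic form, solving the normal equations with a small Gaussian elimination. The data are synthetic and noise-free, and the coefficients 0.8, 0.5, 0.3, and 0.2 are arbitrary illustration values, not those of any published equation.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_power_law(pis, X):
    """Least-squares fit of ln X = ln(delta) + alpha ln(Pi1) + beta ln(Pi2) + gamma ln(Pi3)."""
    rows = [[1.0] + [math.log(p) for p in sample] for sample in pis]
    y = [math.log(v) for v in X]
    m = len(rows[0])
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    Aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(m)]
    ln_delta, alpha, beta, gamma = solve(AtA, Aty)
    return math.exp(ln_delta), alpha, beta, gamma


def synthetic_check(seed=0):
    """Generate noise-free synthetic data from known coefficients and refit them."""
    rng = random.Random(seed)
    pis = [[rng.uniform(0.5, 3.0) for _ in range(3)] for _ in range(50)]
    X = [0.8 * p1 ** 0.5 * p2 ** 0.3 * p3 ** 0.2 for p1, p2, p3 in pis]
    return fit_power_law(pis, X)
```

In this synthetic case the three predictors are drawn independently, so the normal equations are well conditioned. When the predictors are strongly correlated, the matrix `AtA` approaches singularity and the fitted exponents become sensitive to small changes in the data, which is the multicollinearity problem described in the text.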
To estimate the correlations between each pair of explanatory variables, we calculated their Pearson correlation coefficients $r$. As illustrated in Figure 12, the Pearson correlation coefficient $r$ between $\Pi_1$ and $\Pi_2$ is relatively high ($r$ = 0.69); however, it is still under the upper limit of 0.8. Furthermore, to determine how influential the water depth $h_0$ was in wave generation, we determined the sensitivity of the maximum wave amplitude $a_m$ to a ±20% change in each of the following parameters (taken in isolation from the others): slide volume on impact $V_s$, slide velocity on impact $v_s$, slide thickness $s$ and still-water depth $h_0$. We obtained similar results to those obtained by [17]: the $a_m$ variations due to changes in these parameters were smaller than 20%, and $a_m$ was more sensitive to $v_s$ and $V_s$ than to $h_0$. We may therefore consider that the multicollinearity lies within an acceptable range.
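A pairwise screening of explanatory variables against the 0.8 limit can be sketched as follows. The function names and the sample columns are hypothetical; the numbers are made up to exercise the code and are not the experimental data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def flag_collinear_pairs(columns, limit=0.8):
    """Return (name_a, name_b, r) for every pair whose |r| reaches the limit."""
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(columns[a], columns[b])
            if abs(r) >= limit:
                flagged.append((a, b, r))
    return flagged


# Made-up columns (assumption), used only to demonstrate the screening.
columns = {
    "Pi1": [1.0, 2.0, 3.0, 4.0],
    "Pi2": [2.0, 4.0, 6.0, 8.5],
    "Pi3": [1.0, -1.0, 1.0, -1.0],
}
flagged = flag_collinear_pairs(columns)
```

A pair with $r$ = 0.69, as reported for $\Pi_1$ and $\Pi_2$, would not be flagged by this rule because it stays under the 0.8 limit.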

Limitations
The present study explored the possibility of extracting models purely from data; however, data-driven models may suffer from a lack of interpretability, e.g., the difficulty in explaining causal relationships between the data, the discrepancy, and the corresponding prediction. The use of deep learning strategies and vast amounts of data in the inference process exacerbates this issue. In addition, when an ANN produces a solution, it does not give any clue as to why and how. This reduces trust in the network's relevance because of the lack of visible links between outputs, inputs and neurons.

Conclusions
This study applied an artificial neural network (ANN) method, one of the most commonly used machine learning methods, to predict the characteristics of waves generated by gravity-driven slide masses. Laboratory experiments were conducted using a viscoplastic material (Carbopol), a granular material (polymer-water balls), and mixtures of them. After validating the ANN model by comparing its prediction accuracy with that of empirical equations, we applied the model to two scenarios: (i) predicting wave characteristics from the parameters of landslides initially at rest on the slope and (ii) integrating the parameters of different categories of slide mass material into one model, i.e., a Bingham number for the viscoplastic material and the grain diameter for the granular material. For each scenario, the inputs, outputs and network structures of the ANN model were refined. In the first scenario, the $R^2$ for the scaled maximum wave amplitude $A_m$ and scaled maximum wave height $H_m$ were 0.8983 and 0.8497, respectively, and in the second scenario, the $R^2$ for $A_m$ and $H_m$ were 0.9325 and 0.9173, respectively. As a purely data-driven method, this ANN method was easy to adapt when new parameters were included or fresh constraints occurred.