Modeling the Voltage Produced by Ultrasound in Seawater by Stochastic and Artificial Intelligence Methods

Experiments have proved that an electrical signal appears in the ultrasonic cavitation field; its properties are influenced by the ultrasound frequency, the liquid type, and liquid characteristics such as density, viscosity, and surface tension. Still, the features of the signals are not entirely known. Therefore, we present the results on modeling the voltage collected in seawater, in ultrasound cavitation produced by a 20 kHz frequency generator, working at 80 W. Comparisons of the Box–Jenkins approaches, with artificial intelligence methods (GRNN) and hybrid (Wavelet-ARIMA and Wavelet-ANN) are provided, using different goodness of fit indicators. It is shown that the last approach gave the best model.


Introduction
Cavitation is the process of apparition, augmentation, and collapse of the bubbles created around tiny particles in special pressure conditions in a liquid [1]. The discontinuity of the liquids' state characterizes this phenomenon when the pressure experiences a sudden local drop [2]. The cavitation that appears at the ultrasound propagation in fluids is named ultrasound cavitation [3][4][5].
Different researchers have analyzed acoustic cavitation and the phenomena associated with sound propagation in liquids for the last twenty years. Still, the physical processes related to the formation and the bubbles' collapse are not entirely explained. They result from the interaction of the bubbles whose increasing-collapse cycles are not simultaneous, giving birth to the multiple interactions [13,14].
Industrial processes (among which ultrasonic cleaning and sonochemical processing) depend on the acoustic cavitation, which at its turn, is influenced by the conditions of the ultrasonic process [15]. Last period, increasing interest in understanding the ultrasound cavitation mechanism and explaining its effects was noticed [11,12,[16][17][18].
Scientists presented applications of acoustic cavitation. They are based on the analysis of the signal spectrum that appeared in the cavitation field [19]. It has been proved that the signal generated by exciting the transducer by a high frequency (in this case, 20 kHz)

•
The tank where the studied liquid (seawater, here) is introduced (1).

•
The generator of high frequency (HAMEG HM8130, Germany), that worked at 20 kHz in this case (8). • A ceramic transducer (7), excited by the generator. • A pair of electrodes made of copper (13), used to collect the electrical signal produced when the generator works. They can be placed at various distances (from 6 to 61 cm) from the transducer. In the experiment presented here, the distance transducerelectrodes was 30 cm. • An acquisition board (14), connected to a computer (15) for recording the signal. • A cooling fan (11), utilized for preserving a constant temperature of the liquid (in this case 20 • C) during the experiment (given that the ultrasound cavitation is an exogenous phenomenon).

•
The command block (12), used for selecting different powers for the generator regime (80 W, 120 W, or 180 W). Here, 80 W was selected.
When the experiment is performed in stationary liquid, the other components of the experimental setup are not involved. Details on the setup functioning are given in [20,22].
The studied signal is represented in Figure 2.

Methodology
The signal, studied in the time domain, was subject to statistical tests performed at the significance level of 5%. The null hypotheses, the alternatives, and the tests performed are as follows:
The statistical analysis and mathematical modeling were performed in Minitab17, DTREG, and R software (version 4.0.5).
An ARMA(p,q) is a model whose equation is

Methodology
The signal, studied in the time domain, was subject to statistical tests performed at the significance level of 5%. The null hypotheses, the alternatives, and the tests performed are as follows:
The statistical analysis and mathematical modeling were performed in Minitab17, DTREG, and R software (version 4.0.5).
An ARMA(p,q) is a model whose equation is

Methodology
The signal, studied in the time domain, was subject to statistical tests performed at the significance level of 5%. The null hypotheses, the alternatives, and the tests performed are as follows: • Normality, against non-normality-the Shapiro-Wilk test [36]. • Homoskedasticity, against heteroskedasticity-the Levene test [37]. • Randomness, against non-randomness-the runs test [38]. • Stationarity against non-stationarity-KPSS test [39].
The statistical analysis and mathematical modeling were performed in Minitab17, DTREG, and R software (version 4.0.5).

ARMA Models
A discrete-time process is a sequence of real random variables, (X t ; t∈Z). A time process (X t ; t) is called stationary if it satisfies the following conditions: ∀t ∈ Z, M(X t ) < +∞ and is invariant in time (M denotes the expectation), • ∀t, h ∈ Z, Cov(X t , X t+h ) = γ(h) (i.e., the covariance of X t and X t+h depends only on the lag h).
An ARMA(p,q) is a model whose equation is where (ε t ; t ∈ Z) {\displaystyle \varepsilon _{t}} are Gaussian, independent random variables with the same variance and the mean equal to zero. The first p terms are autoregressive, and the last q-moving-average. Given a data series, the model selection depends on the series stationarity or nonstationarity. If a series is not stationary, its stationarity can be reached by taking a difference of the d-th order of the series terms (d ∈ N * ).
ARMA(p,q) is a particular type of autoregressive integrated moving average processes, ARIMA(p,d,q) [40]. The degree of differentiation, d, in ARIMA(p,d,q) processes is chosen for reaching the series stationarity. If the series is stationary, then d = 0, and the model is of ARIMA(p,0,d) = ARMA(p,q) type [40].
The chart of the series autocorrelation function provides information on selecting an ARMA(p,q). If p > q, the values of the autocorrelation function belong to a curve formed by a mixture of exponential decreasing and damped sinusoid functions.
The model validation is done by applying the t-test to the model's coefficients and testing the hypotheses on (ε t ; t ∈ Z) {\displaystyle \varepsilon _{t}} (that should form a white noise). The selection of the best model among the ARMA competitors that fulfill the statistical tests on the coefficients and residual (ε t ; t ∈ Z) {\displaystyle \varepsilon _{t}} is made based on the Akaike (AIC) or Schwarz (SCH) criteria {\displaystyle \varepsilon _{t}} [40]. The lowest the AIC (SCH) is, the better the model is {\displaystyle \varepsilon _{t}}.

Generalized Regression Neural Networks
Generalized Regression Neural Network belongs to the group probabilistic neural networks. The GRNN architecture is presented in Figure 3. It contains four layers-Input, Hidden, Summation, and Output [29]. Given its ability to capture nonlinearities, learning without backpropagation, and the use of nonparametric regression, GRNN was widely utilized for solving classification, regression, and forecast problems that involve continuous variables [29,30].
where ( ;  ) are Gaussian, independent random variables with the same variance the mean equal to zero. The first p terms are autoregressive, and the last q-moving-average. Given a data series, the model selection depends on the series stationarity or n stationarity. If a series is not stationary, its stationarity can be reached by taking a di ence of the d-th order of the series terms ( ∈ * ).
ARMA(p,q) is a particular type of autoregressive integrated moving average cesses, ARIMA(p,d,q) [40]. The degree of differentiation, d, in ARIMA(p,d,q) process chosen for reaching the series stationarity. If the series is stationary, then d = 0, and model is of ARIMA(p,0,d) = ARMA(p,q) type [40].
The chart of the series autocorrelation function provides information on selectin ARMA(p,q). If p > q, the values of the autocorrelation function belong to a curve for by a mixture of exponential decreasing and damped sinusoid functions.
The model validation is done by applying the t-test to the model's coefficients testing the hypotheses on ( ; ∈ ) (that should form a white noise). The selection of best model among the ARMA competitors that fulfill the statistical tests on the coeffici and residual ( ; ∈ ) is made based on the Akaike (AIC) or Schwarz (SCH) criteria The lowest the AIC (SCH) is, the better the model is

Generalized Regression Neural Networks
Generalized Regression Neural Network belongs to the group probabilistic ne networks. The GRNN architecture is presented in Figure 3. It contains four layers-In Hidden, Summation, and Output [29]. Given its ability to capture nonlinearities, learn without backpropagation, and the use of nonparametric regression, GRNN was wi utilized for solving classification, regression, and forecast problems that involve cont ous variables [29,30]. The Input layer is formed of the vector of the recorded values X = ( , … , ) (Fig  3), in this study, the signal's values.
The neurons in the hidden layer apply a kernel function to the distances between training data and the prediction point. The σ values are used for estimating the influ radius. The best σ should be determined when training the network to control the di butions of the kernel function [29]. The most used approach for finding the optimum minimizing the mean square error (MSE). This study employed the conjugate grad algorithm to estimate the best σ for the entire model. The neurons in the hidden layer apply a kernel function to the distances between the training data and the prediction point. The σ values are used for estimating the influence radius. The best σ should be determined when training the network to control the distributions of the kernel function [29]. The most used approach for finding the optimum σ is minimizing the mean square error (MSE). This study employed the conjugate gradient algorithm to estimate the best σ for the entire model.
The RBF and reciprocal kernels have been utilized in this research. As no significant improvement was noticed in the second case, we report the results obtained using the RBF kernel.
After training, the number of neurons in the hidden layer is the same as the number of training samples involved in the modeling [29]. The hidden layer also stores all the variables' values.
The summation layer consists of the D-and S-summation neurons that collect the information from all the neurons from the previous layer. Both neurons sum the values from the hidden layer. The sum is weighted (with the sum of weights equal to 1) in the D-summation neuron [29].
The output layer contains the results of dividing the values stored in each neuron from the previous layer, providing the most probable value for the dependent variable, Y.
An optimization process may also be performed to remove unnecessary neurons. In this case, the optimization criterion was error minimization.
To perform the modeling, the series was divided into a ratio training: test = 80:20 and 70:30. As the best results have been obtained using the first partition, we present this result. The number of iterations was fixed to 5000 (maximum) and 1000 (without improvement). The values of σ were searched in the interval 0.0001-10.

Wavelet-ARIMA Model
A discrete wavelet transform (DWT) is a used to decompose a signal into several sets; each of them is a time series of coefficients that describes the evolution of the signal in time in the corresponding frequency band [41].
The discrete wavelet transform, usually utilized for modeling purposes, is not invariant to translation. To surpass this drawback, the non-decimated wavelet transform (NDWT) may be employed [42].
Considering the signal X = (X 1, . . . , X N ), its decomposition by the à trous wavelet transform results in where the first term represents a smooth version of the initial signal, and the term under the sum is the signal's 'detail' at the scale 2 −j [43,44].
In this article, we used the Wavelet-ARIMA algorithm proposed in [43] and developed by Aminghafari and Poggi [45].
The non-decimated Haar algorithm employs the filter h = (1/2, 1/2), leading to the following equation for the reconstruction of the value X N+1 is where c J,N+1 and w j,N+1 are the approximation and detail coefficients in the NDWT. Thus, for predicting X N+1 , it is necessary to estimate the non-decimated (NDW) c J,N+1 and w j,N+1 , using the equations [45] Therefore, the prediction equation will be written aŝ where • T signifies the transposition, • D N is formed by the dyadic lagged coefficients contained in the vectors w and c T , • α is the vector of the unknown parameters, determined aŝ and M is a fixed integer.
Two other vectors are involved in finding the solution of (5): • c, which contains the coefficients c J,N+1 , • w, which contains the weights in the linear combination of the actual versus the previous variables' values. • Therefore, the procedure has the following stages [45]. • Perform the NDTW using Haar wavelets. • For a sequence from a time series (X 1 , ..., X N ) at a level J, the output will be formed of (J + 1) vectors of size N. • Build the Equation (5). • For this aim, at all the decomposition levels, the maximum number of lagged predictor variables is set. Denote them by (r is defined to be "the order of the AR process fitted on the dyadically downsampled version of W j (or C J ) starting from the last coefficient". • Estimate Equation (5) utilizing the stepwise regression that links X t and D t−1 . • Compute the prediction using Equation (5).

Wavelet-ANN Model
A wavelet-artificial neural network (WANN) is obtained by combining a feed-forward neural network with the wavelet model. Given that WANNs permit an efficient selection of the network input, can capture the nonlinearities, and converge to the global minimum objective function, they have been successfully utilized in various applications from atmospheric sciences, hydrology, and economics for short and long term forecast [46][47][48][49][50][51][52].
In this study, a WANN with the following layers has been utilized ( Figure 4) [31].
and M is a fixed integer. Two other vectors are involved in finding the solution of (5):  c, which contains the coefficients c , ,  w, which contains the weights in the linear combination of the actual versus the previous variables' values.  Therefore, the procedure has the following stages [45].  Perform the NDTW using Haar wavelets.  For a sequence from a time series (X1, ..., XN) at a level J, the output will be formed of (J + 1) vectors of size N.  Build the Equation (5).  For this aim, at all the decomposition levels, the maximum number of lagged predictor variables is set. Denote them by ( ( ) , …, ( ) ).
( ) (j =1, …, J) is defined to be "the order of the AR process fitted on the dyadically downsampled version of Wj (or CJ) starting from the last coefficient".  Estimate Equation (5) utilizing the stepwise regression that links Xt and Dt−1.  Compute the prediction using Equation (5).

Wavelet-ANN model
A wavelet-artificial neural network (WANN) is obtained by combining a feed-forward neural network with the wavelet model. Given that WANNs permit an efficient selection of the network input, can capture the nonlinearities, and converge to the global minimum objective function, they have been successfully utilized in various applications from atmospheric sciences, hydrology, and economics for short and long term forecast [46][47][48][49][50][51][52].
In this study, a WANN with the following layers has been utilized ( Figure 4) [31].  In this study, the maximal overlap discrete wavelet transform (MODWT) and the Morlet function have been utilized. For details, the reader may see in [31]. • Output, which estimates the recorded (target) values.
The number of parameters in the WANN from Figure 4 is the pq + 2q + 1, where p = the lag, and q = the number of nodes in the hidden layer.
For running the algorithm, the maximum decomposition level was determined to be 6. The number of lags was varied from 1 to 12. The best result, reported here, was obtained with lag = 5.

Results
The Shapiro-Wilk and the runs test rejected the normality and randomness hypotheses, respectively. The KPSS and Levene tests did not reject the stationarity and homoskedasticity hypotheses. The autocorrelation function (ACF) chart from Figure 5 confirms the signal's autocorrelation. • Output, which estimates the recorded (target) values.
The number of parameters in the WANN from Figure 4 is the pq + 2q + 1, where p = the lag, and q = the number of nodes in the hidden layer.
For running the algorithm, the maximum decomposition level was determined to be 6. The number of lags was varied from 1 to 12. The best result, reported here, was obtained with lag = 5.

Results
The Shapiro-Wilk and the runs test rejected the normality and randomness hypotheses, respectively. The KPSS and Levene tests did not reject the stationarity and homoskedasticity hypotheses. The autocorrelation function (ACF) chart from Figure 5 confirms the signal's autocorrelation. As the stationarity hypothesis cannot be rejected, and ACF has a decreasing damped sinusoid form, one may search for an ARMA(p,q) model with p > q. The best model found by applying Box-Jenkins methodology was of ARMA(3,1) type, with the coefficients (standard errors, respectively): Therefore, the models' equation is After applying the Student t-test, it was found that the coefficients are significant at 0.001. The Box-Ljung test led to the rejection of the residuals' autocorrelation. The Shapiro-Wilk and Levene test failed to reject the null hypotheses of normality and homoskedasticity, respectively, for the residual series. Thus, model (7) is correct from a statistical viewpoint.
The model was used to predict 48 values. These values are plotted in Figure 6 (blue), after the recorded ones (black), accompanied by the confidence intervals at 0.95 and 0.99 (in grey in Figure 6). The patterns of the signal and forecast are not similar, so the model should be improved. Therefore, the GRNN was built as described in Section 2.2.2.  As the stationarity hypothesis cannot be rejected, and ACF has a decreasing damped sinusoid form, one may search for an ARMA(p,q) model with p > q. The best model found by applying Box-Jenkins methodology was of ARMA(3,1) type, with the coefficients (standard errors, respectively): Therefore, the models' equation is After applying the Student t-test, it was found that the coefficients are significant at 0.001. The Box-Ljung test led to the rejection of the residuals' autocorrelation. The Shapiro-Wilk and Levene test failed to reject the null hypotheses of normality and homoskedasticity, respectively, for the residual series. Thus, model (7) is correct from a statistical viewpoint.
The model was used to predict 48 values. These values are plotted in Figure 6 (blue), after the recorded ones (black), accompanied by the confidence intervals at 0.95 and 0.99 (in grey in Figure 6). The patterns of the signal and forecast are not similar, so the model should be improved. Therefore, the GRNN was built as described in Section 2.2.2.  Using the trained network, the forecast of 48 values (unknown) of the data series i done. The pattern of subseries built by these new values is similar to that of the initia signal ( Figure 8, the red curve), confirming the model performances (discussed in the nex section).
The charts of the recorded and computed values of the signal by using the hybrid algorithms are presented in Figure 9. Comparing Figure 9a,b, one may remark that the Wavelet-ARIMA model generally underestimates the maximum recorded values. The Wavelet-ANN model does not fit the first ten series values well but better fits the rest o the data series. Figure 10 emphasizes the best forecast quality of the Wavelet-ANN model. It contain the chart of the computed values plotted against the recorded values. The points that have the coordinates (actual, predicted) are situated along the first bisectrice of the coordinate axes in both Figure 10a,b. Still, they are more dissipated in Figure 10a, corresponding to the Wavelet-ARIMA model, than in Figure 10b, corresponding to the Wavelet-ANN model.  Using the trained network, the forecast of 48 values (unknown) of the data series is done. The pattern of subseries built by these new values is similar to that of the initial signal ( Figure 8, the red curve), confirming the model performances (discussed in the next section).
The charts of the recorded and computed values of the signal by using the hybrid algorithms are presented in Figure 9. Comparing Figure 9a,b, one may remark that the Wavelet-ARIMA model generally underestimates the maximum recorded values. The Wavelet-ANN model does not fit the first ten series values well but better fits the rest of the data series. Figure 10 emphasizes the best forecast quality of the Wavelet-ANN model. It contains the chart of the computed values plotted against the recorded values. The points that have the coordinates (actual, predicted) are situated along the first bisectrice of the coordinates axes in both Figure 10a,b. Still, they are more dissipated in Figure 10a, corresponding to the Wavelet-ARIMA model, than in Figure 10b, corresponding to the Wavelet-ANN model. Using the trained network, the forecast of 48 values (unknown) of the data series is done. The pattern of subseries built by these new values is similar to that of the initial signal ( Figure 8, the red curve), confirming the model performances (discussed in the next section).
The charts of the recorded and computed values of the signal by using the hybrid algorithms are presented in Figure 9. Comparing Figure 9a,b, one may remark that the Wavelet-ARIMA model generally underestimates the maximum recorded values. The Wavelet-ANN model does not fit the first ten series values well but better fits the rest of the data series. Figure 10 emphasizes the best forecast quality of the Wavelet-ANN model. It contains the chart of the computed values plotted against the recorded values. The points that have the coordinates (actual, predicted) are situated along the first bisectrice of the coordinates axes in both Figure 10a,b. Still, they are more dissipated in Figure 10a, corresponding to the Wavelet-ARIMA model, than in Figure 10b, corresponding to the Wavelet-ANN model.          The charts of the signal and forecast of the next 48 values obtained using the Wavelet-ARIMA and Wavelet-ANN are presented in Figure 11. The charts of the signal and forecast of the next 48 values obtained using the Wavelet-ARIMA and Wavelet-ANN are presented in Figure 11. The Wavelet-ARIMA forecast series (in red, Figure 10a) is sinusoid decreasing. The Wavelet-ANN forecast series (in red, Figure 10b) and the signal have the same shape. Comparison of the forecast obtained by GRNN and Wavelet-ARIMA models show the same shapes and the highest amplitude for the last signal.

Discussion
To compare the models' performances, we analyze the quality indicators. For the ARIMA(3,1) model, the mean standard error (MSE) = 2.4278 and the mean absolute error (MAE) = 0.7747. Mean absolute predicted error (MAPE) could not be computed because there were some values equal to zero, so the formula did not permit the division by zero.
The indicators of the GRNN model's quality are listed in Table 1, together with their values.  The Wavelet-ARIMA forecast series (in red, Figure 10a) is sinusoid decreasing. The Wavelet-ANN forecast series (in red, Figure 10b) and the signal have the same shape. Comparison of the forecast obtained by GRNN and Wavelet-ARIMA models show the same shapes and the highest amplitude for the last signal.

Discussion
To compare the models' performances, we analyze the quality indicators. For the ARIMA(3,1) model, the mean standard error (MSE) = 2.4278 and the mean absolute error (MAE) = 0.7747. Mean absolute predicted error (MAPE) could not be computed because there were some values equal to zero, so the formula did not permit the division by zero.
The indicators of the GRNN model's quality are listed in Table 1, together with their values. On the training set, the variation explained by the (R 2 ) and correlation between actual and predicted errors (r ap ) are very high, the mean standard error (MSE) and mean absolute error (MAE) are small, while the mean absolute predicted error (MAPE) is 18.43%, indicating a good behavior of the algorithm on the training set. On the test set, the values of R 2 and r ap are smaller than on the training set, and MAE and MSE remain low. A significant increase in MAPE is noticed (to 71.577%).
After using the hybrid approaches, the values computed for the goodness of fit indicators on the test sets are Comparing the performances of the artificial intelligence-based models on the test sets (relevant for how well the model learns the data and can use it), the best model is WANN, no matter what indicator is considered. This is confirmed by comparing Figures 8 and 11 that show the forecast for 48 µs.
ARIMA models have been widely used in real-life problems modeling given their ability to capture the linear dynamics of the phenomena. Still, as many processes present nonlinear patterns, the use of ARIMA is not always the best choice (given that it assumes the existence of linear correlation in the data series) [31]. The main advantages of ANNs are their flexibility in modeling nonlinear features of data series, and the absence of the constrain on the a priori specifying the model form [52]. The methods relying on the on wavelet transform can well describe the signal's multiscale features [51]. Combining these algorithms will benefit all their properties.
The results of this study are consistent with the findings of other researchers that emphasized that the use of hybrid approaches may improve the models' performances [31,32,[51][52][53].

Conclusions
The statistical analysis of the electrical signal induced by ultrasound cavitation in seawater showed that the signal is not random, is stationary and homoskedastic. An ARMA model has been proposed and validated from a statistical viewpoint based on these characteristics. As the ARMA model did not provide a good forecast of the following 48 values, alternative models have been proposed and compared to find the best alternative. The first one-the GRNN model-learned the data well, gave good results on the test set, and was successfully employed for forecasting the next (unknown) 48 values of the series. The last two models were of the hybrid type.
The Wavelet-ARIMA model improved the forecast by comparison with the ARMA one; still, R 2 = 82.4829%, and MAPE = 98.81% (high enough). Considering that the lowest the MAPE is, the better the model is, the best model was the Wavelet-ANN one (with R 2 = 96.74%, and MAPE (%) = 42.76).
The study provided for the first time such modeling and comparison for the signal collected in seawater, in the cavitation field, using the experimental setup built by our team and built four alternative models, the hybrid ones being able to provide good forecasts. Similar analyses will be performed to verify that the last approach is the best one for modeling the signals collected in the same or different liquids (diesel, tape water, and transformer oil) to other powers of the ultrasound generator. These liquids were selected because our research team has published some preliminary results and until now good models for some of them were not obtained yet. The most important feature that recommends this hybrid model is its flexibility to capture the series' nonlinearities. This procedure has no limitation related to the liquid type.  Data Availability Statement: Data will be available on request from the authors.

Conflicts of Interest:
The authors declare no conflict of interest.