Using Autoregressive with Exogenous Input Models to Study Pulsatile Flows

: The content of this paper shows the ﬁrst outcomes of a supplementary method to simulate the behavior of a simple design formed by two rectangular leaﬂets under a pulsatile ﬂow condition. These problems are commonly handled by using Fluid-Structure Interaction (FSI) simulations; however, one of its main limitations are the high computational cost required to conduct short time simulations and the vast number of parameter adjustments to simulate different scenarios. In order to overcome these disadvantages, we propose a system identiﬁcation method with hereditary computation—AutoRegressive with eXogenous (ARX) input method—to train a model with FSI simulation outcomes and then use this model to simulate the outputs that are commonly measured from this kind of simulation, such as the pressure difference and the opening area of the leaﬂets. Numerical results of the presented methodology show that our model is able to follow the trend with signiﬁcant agreement with the FSI results, with an average correlation coefﬁcient R of R tr = 90.14% and R tr = 92.27% in training; whereas for validation, the average R is R val = 93.31% and R val = 83.08% for opening area and pressure difference, respectively. The system identiﬁcation model is efﬁciently capable of estimating the outputs of the FSI approach; however, it is not intended to substitute FSI simulations, but to complement them when the requirement is to conduct many repetitions of the phenomena with similar conditions.


Introduction
One of the main characteristics of dynamical systems is their dependency on past information to forecast future information [1]. System Identification (SI) methods characterize dynamic systems by employing mathematical-based models using the measured data from these systems. One of the most common techniques in this area is the AutoRegressive with eXogenous (ARX) input model. Its reliability to replicate a broad spectrum of scenarios and its parametrization make this model an ideal approach to work with single-input single-output systems. The ARX model works as a black-box containing a set of parameters (in this case, coefficients) needed to construct a transfer function, which describes the behavior of the dynamic system. Regarding its applications, we can find examples for different fields such as gas conditioning for cement plants [2], thermal load prediction for buildings [3], and stock market forecasting [4]. Some other mechanical and electronic applications include the estimation of driving behavior [5,6], gearbox failure detection [7], vibration analysis for manipulators during grinding operations [8], controlling the position of a steel ball in a magnetic levitation system [9], an inverted pendulum [10], and the estimation of the state of charge for The content of this paper is as follows: In Section 2, the basic parameters to configure the FSI simulations and their respective results are given. Section 3 shows the SI method that uses the input and output signals of the FSI simulation to train the models that estimate each output of the system. Next, Section 4 presents the results of our model in comparison with the FSI approach. Then, with the parameters retrieved from the trained model, we use a new validation dataset to estimate both outputs and then compare them with its respective FSI simulations. Finally, Sections 5 and 6 present the discussion and conclusions of the presented methodology.

FSI Simulation of a Simplified 2D Leaflet Structure
The academic problem studied in this paper consists of using the data generated by a set of FSI simulations of a moving fluid across a 2D channel with a pair of leaflets fixed at the center and then creating an ARX model with these data. In this section, we explain how the experimental setup found in Ledesma-Alonso [42] was adapted for our simulations to be later used in the SI model.
The design presented in Figure 1a shows the leaflets and channel layout. The top and bottom boundary walls of the channel are separated by a distance of 2 h, where h = 15 mm; both walls present a non-slip boundary condition, and the length of the channel is 20 h. The experimental setup considered a 3D environment where the width of the channel was w = 50 mm; however, this simplified layout neglected this parameter to reduce computational cost, hence a 2D version of the problem has been studied. The location of the fixed tip of the leaflets was at the center of the channel (at x = 0 mm). This layout also considered the two experimental points located at ±5 h horizontal distance from the center, in order to measure the pressures during all the simulation time. (a) Layout of the system: h represents the semi-height of the channel. P 0 and P 1 are the points used to measure pressure. The original setup can be found in [42]. (b) The normalized velocity curve U * f (t * ) used for the Fluid-Structure Interaction (FSI) simulations. (b) represents the fast Fourier transform of the signal with its representative working frequencies.
Regarding the parameters used for the simulations, they consist of two different groups: the fluid and the solid domain. For the fluid domain (Ω f ), the parameters of the input signal or input velocity are the magnitude (U max ), stroke volume (V), frequency ( f ), and the fluid parameters are the density (ρ f ), and dynamic viscosity (µ). For the solid domain (Ω s ), the parameters are the leaflet thickness (d) and length (l), material density (ρ s ), elastic modulus (E), and Poisson ratio (ν). One of the materials chosen for the leaflets in the experimental case was silicone rubber; hence, the material properties applied to the solid bodies within the simulation are related to this material, which will be taken as the reference case. Under the circumstances of the proposed system, its dimensions are much larger than the size of red blood cells and, as a consequence, the working fluid can be considered as Newtonian [42].
Simulations were conducted using an HP workstation with an Intel Xeon Processor running at 3.40 GHz and 16 GB of RAM. The software selected was LS-DYNA (Livermore Software Technology Corporation, Livermore, CA), which for previous tests has proven its capabilities to handle CFD and FSI simulations by using an ALE approach [43][44][45]. The output variables considered for these tests are the leaflets opening area (S v ), defined as the vertical distance between the tips of the leaflets, and the pressure difference (∆P), defined as P 1 − P 0 the pressure difference between points 1 and 0 (see Figure 1a). Then, our first approach of an FSI simulation was conducted using the parameters listed in Table 1. The time evolution of the phenomenon, for which t is the independent variable, takes place along a period T, which is equivalent to T = 1/ f . For this first scenario, the input velocity curve was a pulsatile flow curve divided into forward time (t f ) and backward time (t b ) stages (see Figure 1b), with T = t f + t b . Using T as the characteristic time scale to normalize time, thus making t * = t/T, the forward stage occurs during t * = [0.05, 0.49], whereas the backward stage happens during t * = (0. 49,1] and also occurs during a short time before the forward stage begins, i.e., t * = [0, 0.05). Hence, t f is constituted as 0.44T, and t b stage is the remaining time of the period T. This signal is specifically designed in such a way to behave similarly to the configuration of the experimental study [42]. Therefore, the stroke volume V used in this test follows the next formula: where the flow rate (Ġ) from the normalized velocity consists of a multiplication of the integral over a time of U * f (t * ) times S, which is the cross-sectional area of the channel given by S = 2hw. The parameter U * f (t * ) is the normalized velocity curve (see Figure 1b) defined as a piecewise function of t * : The solution of the velocity integral over time, given in Equation (2), from 0 to 1, as defined in Equation (1), is 0.28219, which is a constant for all further numerical simulations. Hence, we end up having the following formula: Finally, the subplot of Figure 1b shows the fast Fourier transform of the normalized velocity with the most significant frequencies acting on this signal. This representation lets us know that most of the power of the signal in Equation (2) is transmitted, not along a unique frequency, but into more values lower than 10 Hz; the rest of the frequencies above this value are nearly zero.

FSI Simulation Results and Parametric Study
Results obtained in the FSI simulations are shown in Figure 2 by representing nine stages describing the following situations:  The design of the input curve avoids a back-flow behavior in the leaflets; consequently, this curve was designed to only provide cases with correct functionality. Despite the slight differences produced by the 2D simplification, previous results on the FSI simulation of leaflets [15] showed a qualitative agreement with the aforementioned experimental case [42]. Finally, in order to observe the behavior of the leaflets under different conditions, a parametric study was carried out. The variations consisted of changing the elastic modulus (E) in the solid domain and the period T of the input signal U * f (t * ) in the fluid domain. The proposal was to have data of 45 different combinations, retrieved from five variations in E and nine variations in T. All combinations shared the same values of ρ f , µ, ν, ρ s , d, and l and possessed almost the same volume V. The variations for E were chosen considering three different elastic conditions [46]. The first one is for situations where the leaflet material is particularly hard, with S v /S < 0.5. The second considers the opposite, when the leaflet material is extremely soft, resulting in S v /S > 0.7. The last condition considers an intermediate elastic behavior, leading to 0.5 ≤ S v /S ≤ 0.7. From a previos work [42], we considered an initial E = 2.15 MPa, two smaller values E = 0.1 MPa and E = 1 MPa, and two larger values E = 3.16 MPa and E = 10 MPa, in order to represent the three aforementioned conditions. More information regarding the mesh-size selection for both the fluid and solid domains, as well as a structural analysis for the leaflet model to obtain the initial conditions for the FSI simulations can be found in the Supplementary Material.
Regarding the variations in T, we initially tested one of the experimental periods proposed where T = 1.502 s. Additionally, two more values were chosen from here: T = 1 s and T = 3.0 s. Finally, in order to have nine variations, intermediate values were selected. With these forty-five combinations, we conducted their respective FSI simulations, and the results showed the same behavior as the curves presented in Figure 2. The frequencies, and the corresponding periods, selected for this study are presented in Table 2.

System Identification Method
The main contribution of this paper is to present a complementary method that can help researchers obtain results as accurate as possible as if they were doing an FSI simulation. It has to be considered that the previous simulations are required to obtain the coefficients needed to estimate a specific output parameter. Once the model has been identified, the advantages of this method lie in skipping the many steps required to configure the simulation, namely geometry, boundary conditions, mesh quality, and topology, and mainly to overcome the high computational cost required to accomplish short simulation times and still obtain accurate results.
In general, these systems allow generating a model from the system input/output information. The model selected for these cases is an ARX, which constructs polynomial functions based on a set of coefficients according to the system order n.

ARX Model
The goal of this model is to obtain the θ vector, which contains the parameters (coefficients) that represent the relation between the input and output of the system.
Hence, the process to calculate the optimal parameter vector can apply the hereditary identification principle by using the experimental expectation [47] instead of the mathematical one [48]. This principle keeps the criterion minimization task quadratic; as a result, it avoids the use of non-linear optimization techniques, such as the gradient-based or Gauss-Newton techniques to derivate and reduce the order of the coefficients. For this case, the distance (the difference) among the estimated trajectories (behavior of the signal after every iteration) is minimized. Hence, we could present: as the general form to obtain the coefficients θ τ , y τ as our main input/output signal using the experimental expectation over [1, . . . , t], and θ • = {θ τ , τ = 1, . . . , t}. Now, in the case of the ARX model, we considered y τ as the output, u τ as the exogenous input, andŷ t τ as the optimal past estimator of the behavior at instant t for data over [1, t], and it is calculated as follows: Then, the regression vector that contains the coefficients to describe the behavior of the system is defined by: Then, the parameter estimation is done by hereditary computation; therefore, it is required to know the experimental autocorrelation matrix of ϕ t−1 τ and the inter-correlation vector of ϕ t−1 τ and y τ . Hence, to compute the system parameters θ t at instant t, it is necessary to keep in memory all the past estimationsŷ t−1 τ|τ−1 , · · · ,ŷ t−n τ|τ−1 . Therefore, if we were not using hereditary computation, the complexity and computational cost would be higher. The general algorithm used to estimate the θ t vector is: 1. Initialization. As data before t = 1 are unknown, the projection over it is zero (division by a covariance a priori infinite). It is then possible to think of it as zero: 2. At instant t − 1, we already have data from the previous optimized estimations.
new data y t , u t are available in order to update the experimental auto-correlation matrix and the inter-correlation vector: 4. The coefficient parameter vector at instant t is defined as θ t = [a n , b n ] T , and it is obtained by inverting the autocorrelation matrix: 5. Hereditary part: With the obtained parameters, by using Equation (6), it is then possible to compute the new estimationŷ t−1 τ|τ−1 based on the previous estimationŷ t−i τ , which was computed at the preceding stage by employing the model ∀τ = 1, · · · , t.
The algorithm describes the process to train only one signal and simulate one output. Hence, in order to obtain each set of coefficients for all forty-five combinations, Figure 3 describes the general process. In this case, the input signal U E,T in Figure 3a is equivalent to the input used for the elastic modulus E and time T of the aforementioned ranges of the previous section. With this, we obtained two different outputs (S v or ∆P) per input from the simulations. The ARX model proposed is a single-input single-output system; hence, each of the forty-five combinations has two ARX models, one per output.  The validation process simulates the output from new input signals for E * and T * representing the elastic modulus and periods that were not initially considered in the parametric study.
Secondly, Figure 3b shows the identification part (training), which uses the data obtained in Step (a) to calculate the θ E,T vector for all combinations. During the estimation, the parameter vector is updated after every iteration and uses the information from previous estimations to reduce the error between the original output and the estimated one. Then, for the combination E, T, the input U E,T is equivalent to the exogenous input u τ and ∆P E,T or Sv E,T is equivalent to y τ of Equation (6), respectively. Finally, in Figure 3c, the validation part simulates both outputs for a new set of conditions-different from the 45 proposed-by only employing these new input signals and a new set of coefficients, which is calculated by doing a bilinear interpolation with the previous θ E,T vectors.

ARX Model Results
In order to evaluate the quality of the algorithm addressed in this work, we used a multiple correlation coefficient R. This percentage indicates how wellŷ t τ explains y τ [48]: After the training part, we end up having forty-five θ E,T vectors per output, containing the coefficients' information. In order to construct the transfer functions, these coefficients follow the next expression (presented in Fourier transform structure) defined in [48].
where B(s) is the set of coefficients for the numerator and A(s) is the set of coefficients for the denominator. Each output has its own model: for S v , we employed A(s) and B(s); whereas for ∆P, we used C(s) and D(s) to represent their respective coefficients. The models have a system order n a and n c for the a s and c s of the denominator and n b and n d for the b s and d s of the numerator of each equation. These equations were tested for n = [1, 2, 3, 4], and we found the best goodness of fit in ∆P when n c = 2 and n d = 1; whereas for S v , we obtained the best results when n a = 3 and n b = 2. Hence, according to Equation (13), we have the next structure for the transfer functions of each output using the hereditary ARX computation: The coefficients found during the identification part for all the elastic moduli proposed are presented as field maps in Figure 4a,b for ∆P and S v , respectively. In both outputs, we observe that the zone around E = 1 MPa and T = 2.5 s presents the highest or the lowest values, depending on the particular behavior of each coefficient. In the case of ∆P, we see a change close to a local extrema (E = 10 MPa and T = 2.50 s) for the c s coefficients, whereas for the d s, we appreciate this change in the area close to E = 1 MPa and T = 1 s. On the other hand, the coefficients for S v seem to have a constant behavior in all a s and b s coefficients surrounding the zone of E = 1 MPa and T = 2.5 s. As mentioned before, the quality of the algorithm was measured using the multiple correlation function R; for that, R values of all combinations are shown in Table 3 (for more details of the entire set of reconstructed signals and the coefficients obtained for the SI model, see the Supplementary Material). In Figure 5a, the curves correspond to the reconstruction of ∆P following the structure of Equation (14), and by comparing them with the experimental curves, their behavior follows the same trend. Conversely, Figure 5b shows that S v is less accurate, since only the opening and closing moments were recovered; the reconstruction follows Equation (15).
Regarding the computational time, the training process for each period took 13 s approximately; hence, obtaining the results of the nine periods for both outputs, for one elastic modulus, took less than 4 min to finish the process. Considering this time, our ARX model calculated all 45 θ E,T vectors for the training part of both outputs in about 20 min. Thereby, the execution time using the ARX model represents only 0.0465% of the total time (716.5 h) consumed by doing the forty-five FSI simulations of the parametric study. This means that once our model is trained, we can obtain estimations to complement further cases with it instead of doing more FSI simulations, which would require more computational time.  The dotted lines indicate the nine periods T and the five elastic moduli E. In (a), we have the maps for the ∆P coefficients and, in (b), the S v coefficients. According to Equations (14) and (15), the values for c 2 in ∆P and a 3 in S v are always one.

ARX Model Validation
One of the main goals of using an ARX model for this purpose is to have the possibility to use the parameter vectors found in the training part for estimating/simulating external conditions to the ones employed in the parametric study and verify the quality of our model without spending more time on FSI simulations. The simulation parameters considered for these changes are the same as the ones in the parametric study except for E and T. These parameters now consider the following options: E = The validation process forms the transfer functions of the aforementioned desired values by using a bilinear interpolation with the coefficients found in Figure 4. Finally, Table 4 presents the interpolated values for these new cases. In (a), the pressure difference (∆P) and, in (b), the normalized opening area (S v ). The average accuracy obtained for reconstructing the nine periods was R = 96.14% and R = 89.35% for ∆P and S v respectively. The average goodness of fit of the five E s is listed in Table 5. In order to compare the signals obtained by our ARX model, we previously conducted the FSI simulations of these new conditions used for validation. The results seen in Figure 6 show that some simulations were not able to finish the calculation process; this was because they were unable to find equilibrium or convergence. For these reasons, estimation of the outputs was merely conducted until the moment before the simulation stopped. The accuracy of the validation results is shown Table 5, where the average value of accuracy was R val = 93.31% and R val = 83.08% for S v and ∆P, respectively.
One limitation of this method is that it only works for stationary signals; hence, when dealing with transitory behaviors, it was not able to follow the trend expected efficiently. For example, in the opening area of Figure 6, we can see at the beginning of the forward stage (t f ) how the leaflets began to open up to the highest position due to an inertial momentum before it went back to the value where it got stabilized by the flow. On the contrary, the pressure difference output was better estimated by our model because it did not contain a transitory behavior.  This validation process showed that our model could be a promising support for the FSI simulations when dealing with different situations in a faster and still accurate way. Besides, it could eliminate the necessity of conducting several simulations before we can get a vast amount of information by making small changes in the input signal and wait for very long times to see how these new conditions behave. We remark that our model follows trends, and it may require complementary techniques such as non-linear models to fully estimate the outputs.

Test for Gaussianity and Linearity
According to the results, our model proved to have better performance when dealing with the pressure difference than with the opening area. Hence, as a final step, we analyzed the prediction error vectorỹ t = y t −ŷ t obtained in S v to verify its distribution (Gaussian or non-Gaussian) and its possible non-linear behavior in some sections. The test proposed by Hinich [49] assumes two different hypotheses: the first one is to verify if the process is Gaussian distributed or not; then, if the first hypothesis is accepted, the second one consists of verifying whether the process is linear or not. Therefore, for Gaussianity, we have a Probability of False Alarm (PFA) parameter that indicates the probability that our data have a non-zero bispectrum. Hence: • H 1 : If PFA ≤ 0.05; the process is non-Gaussian, i.e., non-zero bispectrum. • H 0 : If PFA > 0.05; the process is Gaussian, i.e., zero bispectrum.
Then, if H 1 is true, we can testỹ t to verify if it contains linear or non-linear behavior. Since this test assumes a non-zero bispectrum, it should only be considered if and only ifỹ t obtained a PFA ≤ 0.05. Thus, R est is the estimated and R th is the theoretical interquartile range of a chi-squared distribution: • H 1 : If R est >> R th or R est << R th , the process is non-linear. • H 0 : If R est ≈ R th , the process is linear.
Under these assumptions, we use the High Order Spectral Analysis (HOSA) toolbox for MATLAB R to conduct the test. Results in Table 6 show that all ourỹ t vectors are non-Gaussian distributions and contain non-linear processes within them. Conclusively, for providing a more accurate analysis, future experiments should include a model that considers the estimation of these potential non-linear processes as well. For more details, see [49,50].

Discussion
The main contribution of this paper is to provide a complementary method to avoid continuous repetitions of FSI simulations to know the behavior of an idealized and simplified academic problem, which studies two leaflets inside a channel and its interaction with a surrounding flow. We propose the use of a SI model to reduce the excessive computational time required to perform many repetitions of the same design under different flow conditions. The 2D leaflets' structure simplification is taken from the experimental case found in [42], and it considers as outputs the opening area and pressure difference of the leaflets under a pulsatile flow. For our case, the input signal is only transmitted along two directions (x and y planes) and not in three, as the experimental setup. Besides, the experimental data are averaged across all cycles to produce a single representative cycle, reducing all possible measurement noise. On the contrary, FSI simulations are merely carried out for a single cycle; hence, noise reduction by averaging the signal was not possible.
The SI method proposed is an ARX model that uses the information retrieved by the FSI simulations of a parametric study to obtain coefficients for the transfer functions that represent the specific outputs. With the parameter vectors θ E,T obtained, we create coefficient maps to observe the behavior of each one for all the cases studied. Using these coefficient fields, we employ interpolations to obtain the values for the desired conditions in the validation process. These validation parameters are selected to have two cases within the dataset and two outside the dataset. The coefficients obtained from this step are used to create the transfer functions that estimate the trend of the output without the need for conducting continuous FSI simulations. Finally, the estimation performed for the validation process proves that the closer the parameters are to the dataset, the better the agreement of the simulated signal to be compared with the FSI results. On the other side, the cases that are outside or slightly far from the conditions of the dataset obtain a less accurate estimation, which however, still follows the trend of the signal.

Conclusions
A first approach to simulate the output of a simple leaflet model by employing a system identification method is presented. This model uses the information from previous FSI simulations to be trained and obtain a coefficient vector, which is later used in addition to an input signal to simulate the opening area and pressure difference outputs. Despite the fact that FSI simulations are a more complete tool to study this phenomena, our model could provide complementary help to obtain results in a faster way once it is trained. This method provided accurate tracking of the trends of those outputs with no transitory behaviors, but it does not sufficiently simulate the behavior of outputs that contain transitory ones. In this way, if we consider data retrieved from FSI simulations of more realistic 3D geometries with real patient data, we could also train our ARX model to get the necessary coefficients in order to simulate the output of such scenarios. This could lead to using this supplementary method in situations where faster analysis is required in hospitals without having to wait for longer periods, as it would be by only using FSI simulations. Furthermore, speaking of clinical or practical applications, with the implementation of a moving average (ARMAX) term, which minimizes the error, we could propose a prognostic health management process. By using data from real patients, we could make a classifier that, according to the coefficients obtained in the ARMAX model, we could infer the current health status of a person. The main difference with the ARX model is that ARMAX also considers the original output for the prediction error reduction and the adjustment of the coefficients.
The parametric study presented in this paper considers FSI simulations with five variations in the elastic modulus E and nine variations in the period T, and it is used to train the model under a small set of conditions. Additionally, the ARX model trained takes approximately 20 minutes to finish all the coefficients calculation, which represents only 0.0465% of the time used by FSI simulations. Furthermore, the data generated by the FSI simulations need a post-processing (another five minutes approximately) before we can use them in the ARX model; giving a total time of less than 25 minutes to finish the training and validation processes of all data. The validation process is conducted by using interpolation to obtain the transfer functions that represent the conditions that are not initially considered for the parametric study. The proposed ARX method is able to follow the trend with 90.14% and 92.27% of accuracy respectively for S v and ∆P. The identification stage considers 45 combinations per output; whereas, for the validation part, we obtain an average accuracy of 93.31% and 83.08% for S v and ∆P for four combinations per output. The validation results show that our model is able to follow the behavior of both outputs in a better way when the conditions are inside or close to the boundaries of our training dataset conditions. Finally, a linearity test proves that the prediction error y t of these signals contains non-linear behavior; hence, future work includes further study of these processes through a non-linear system identification model.  Table S1. Results obtained for the solid mesh size test, Figure S1: Validation test for the solid mesh selection, Figure S2. Meshes distribution -sections-within the layout., Figure S3. Validation test for the fluid mesh selection, Figure S4. Displacement observed in the top leaflet after T= 3s, Figure S5. The behavior of the leaflet indicates oscillation moments increased as soon as the elastic modulus did, whereas the magnitude of these reduces with a higher E value, Table S3. Displacement of each E under different forces applied, Figure S6. Reconstruction of the signals using the ARX model for E = 0.1 MPa. In a) we have the opening area and in b) the pressure difference, Table S4. Coefficients obtained for a) Sv and b) DP corresponding to E = 0.1 MPa., Figure S7. Reconstruction of the signals using the ARX model for E = 1 MPa. In a) we have the opening area and in b) the pressure difference, Table S5. Coefficients obtained for a) Sv and b) DP corresponding to E = 1 MPa., Figure S8. Reconstruction of the signals using the ARX model for E = 3.16 MPa. In a) we have the opening area and in b) the pressure difference, Table S6. Coefficients obtained for a) Sv and b) DP corresponding to E = 3.16 MPa., Figure S9. Reconstruction of the signals using the ARX model for E = 10 MPa. In a) we have the opening area and in b) the pressure difference, Table S7. Coefficients obtained for a) Sv and b) DP corresponding to E = 10 MPa.
Author Contributions: C.D.-H. was responsible of the investigation, configuration, and testing for all the material presented and also the writing of this article. G.E. and R.L.-A. supervised all processes and experimentation in this work and also contributed with corrections, guidance, writing, reviewing, and editing of this document.