Application of Anti-Diagonal Averaging in Response Reconstruction

Response reconstruction is used to accurately replicate, in a laboratory environment, the vehicle structural responses recorded in field measurements, a crucial step in the process of Accelerated Destructive Testing (ADT). Response reconstruction is cast as an inverse problem whereby an input signal is inferred to generate the desired outputs of a system. By casting the problem as an inverse problem we veer away from the familiarity of symmetry in physical systems since multiple inputs may generate the same output. We differ in our approach from standard force reconstruction problems in that the optimisation goal is the recreated output of the system. This alleviates the need for highly accurate inputs. We focus on offline non-causal linear regression methods to obtain input signals. A new windowing method called AntiDiagonal Averaging (ADA) is proposed to improve the regression techniques' performance. ADA introduces overlaps within the predicted time signal windows and averages them. The newly proposed method is tested on a numerical quarter car model and is shown to accurately reproduce the system's outputs, outperforming related Finite Impulse Response (FIR) methods. In the nonlinear configuration of the numerical quarter car, ADA achieved a recreated output Mean Fit Function Error (MFFE) score of 0.40% compared to the next best performing FIR method, which generated a score of 4.89%. Similar performance was shown for the linear case.


Introduction
In the drive for continual improvement in vehicle engineering design, optimised structures and components with lower safety margins and greater reliability are sought [1]. Advances in computational design such as finite element analysis and dynamic modelling combined with fatigue prediction have furthered this goal tremendously during the design phase. Nevertheless, there is still a need to dynamically test physical prototypes or existing designs in a controlled laboratory environment. For the analysis to be worthwhile the excitation of the structure in the laboratory environment must induce responses in the structure as though it were being tested under real-world operating conditions. The end goal is to enable Accelerated Destructive Testing (ADT) of the structure. In ADT, a vehicle's chassis is mounted with its suspension system on a set of hydraulic actuators. The hydraulic actuators then excite the system vertically. Laterally acting forces are simulated with additional actuators. An example of an ADT set-up is shown in Figure 1.
The structure's excitation is then carried out for extended periods, allowing for the degradation of the structure to be measured in a controlled environment [1]. The structure is not typically excited until catastrophic failure, but rather until the degradation measured as vibration or noise has met a specified threshold [2]. This indicates possible failure points of the system and provides a means of predicting the component's healthy lifespan. Other insights can be gained from dynamic testing, such as a better understanding of the system dynamics, vibration isolation [1] and vibration severities for passenger ride comfort [3]. The hydraulic actuators simulate the loads that the vehicle would typically experience in the real world. A range of sensors such as accelerometers and strain gauges are used to capture the suspension system's dynamic response.
The biggest hurdle with ADT is that the inputs to the system, such as the displacements or the forces acting on the vehicle's tyres, are difficult or impossible to measure directly in the field. This means the problem must be cast as an inverse modelling or response reconstruction problem [4]. In inverse problems, the outputs of the system $Z$ are used in conjunction with model parameters $\beta$ to determine the inputs $U$, i.e.,

$$U = f(Z, \beta).$$

There are two possible choices for creating a model of the system. In the first, a mapping of the system is constructed so that the system's inputs are used to predict the outputs of the system. This is referred to as the forward problem. The forward problem is then inverted. If the model used to map the problem is nonlinear, an iterative optimisation scheme is employed to invert the system. However, the optimisation scheme may be prone to local minima. The second approach is to create a direct inverse of the model whereby the system's outputs are used to predict the inputs of the system. The inverse method has an inherent stability check since the solution will only be obtained if the direct inverse model is stable [1]. However, we quickly find that most inverse problems are ill-posed. For a problem to be well-posed it needs to meet the following criteria: the solution is unique, the global solution exists for all data and the solution to the problem is continuously dependent on the given data [5]. The first criterion is normally the offending culprit since it is easy to construct a forward problem where two different inputs result in the same output. Therefore, the inverse solution is typically not unique. This introduces an asymmetry into the problem whereby the assumption of a one-to-one mapping is broken. If the problem is ill-posed we may use regularisation techniques to cast the problem as a more well-behaved problem. Regularisation techniques include: cross validation, SVD, iterative methods, data filtering and Tikhonov regularisation [4].
Most common reconstruction techniques are implemented in the frequency domain [6], whereby the discrete Fourier response is multiplied by an inverse or pseudoinverse frequency response function [7]. Raath's Ph.D. thesis [1] highlighted the then known issues of using frequency response techniques in accelerated fatigue testing. It was shown that the frequency response approach was inaccurate for several reasons, including:

• Assuming that the input and output signals are periodic when often they are not. Examples include sharp impulses from random impacts.
• Being unable to model nonlinear systems, since frequency response analyses assume a linear model.
• Requiring long time signals of the order of hours as opposed to the minutes or seconds needed in the time domain. This ties in with the issue that low frequency information is easily lost due to spectral leakage, where the energy in the lower frequencies is spread over to higher frequencies.
• Failing to capture the sequence or causal effects which play an important role in crack propagation.
Various time-domain techniques have been developed to overcome these issues. However, they have been shown to be slow or inaccurate [8]. The vehicle structures of interest typically contain many nonlinear components such as springs and pneumatic dampers. Typical control systems will overcome this issue by linearising the system around the operating point. It is then assumed that the system will experience small perturbations around this point. However, it is expected that the system will experience impact loadings and large displacements, which will force the system out of its linear region [1]. Another issue associated with response reconstruction is that of model mismatch, whereby the identified system does not truly represent the physical test rig. In response reconstruction the misrepresentation occurs when the physical test rig is taken from the real world and recreated and simulated in the laboratory environment. Typically the degrees of freedom are not fully represented in the laboratory or the test rig parameters, such as mass, may vary. System identification takes place in this laboratory environment; therefore, we will have mapped a domain that differs from the real-world domain. When the real-world outputs need to be recreated we may find that the mapped inverse model is forced to extrapolate beyond the mapped domain to find a solution. In other words, the inverse model has over-fitted to the laboratory domain and generalises poorly with regard to the real-world domain. Regularisation can be employed to minimise this error [9]. A related field to response reconstruction is force identification whereby the inputs of the system are of interest. However, the inputs of the system for a given output are not unique [10]. Force identification tackles this problem by enforcing some prior knowledge of the system dynamics to constrain the inputs to reasonable solutions.
Bayesian methods have become prevalent in force identification literature since they allow the experimenter to systematically incorporate prior knowledge [11]. Another benefit of incorporating Bayesian methods is that they provide confidence intervals on the input predictions and model parameters [12]. A noticeable distinction in force identification literature is that a known finite element model of the structure is typically assumed, i.e., a known forward model. In this paper, the issue of non-uniqueness of the input is mitigated by:

• not focusing on the reconstructed input accuracies;
• using cross validation of the system's reconstructed outputs to determine whether a given inverse model of the system is satisfactory, i.e., using a forward pass through the physical system in each cross validation step to determine the model accuracy.
A potential drawback to this approach is that, if implemented naively, the cross validation can induce undue stress on the system before any ADT occurs. An overview of the response reconstruction methodology used in this paper is given in Figure 2. In the initial phase of response reconstruction, a set of input signals $U_\mathrm{train}$ is designed in such a manner that they excite the desired dynamics of the system. The choice of the excitation signal is given in Section 2.2. The laboratory test rig is then excited by the inputs to obtain the corresponding outputs $Z_\mathrm{train}$. With these known inputs and outputs, an inverse model of the system can be mapped. Direct inverse system identification is used to obtain the model parameters $\beta_\mathrm{proposed}$. We cannot directly use the model parameters $\beta_\mathrm{proposed}$ without regularisation. Cross validation is employed to determine the amount of regularisation required. The input $U$ is not unique for a given output $Z$; therefore, we cannot easily compare the reconstructed input $\hat{U}_\mathrm{val}$ against the known input $U_\mathrm{val}$. Instead, we pass the reconstructed input through the physical model to obtain the reconstructed output $\hat{Z}_\mathrm{val}$, which allows for direct comparison against the known output $Z_\mathrm{val}$. The discrepancy between $Z_\mathrm{val}$ and $\hat{Z}_\mathrm{val}$ is what we are trying to minimise in the cross validation step. This means that the cross validation step requires a physical forward pass through the physical laboratory model. The field collected data of the system $Z_\mathrm{test}$ (for which we do not know the true inputs $U_\mathrm{test}$) can then be inverted to approximate the real-world input $\hat{U}_\mathrm{test}$ given the final set of model parameters $\beta_\mathrm{final}$. The approximated input can now be used to recreate an approximation to the real-world response $\hat{Z}_\mathrm{test}$ by using the inputs to excite the laboratory test rig. This final input can then be repeated indefinitely for ADT.
An important distinction to make here is that we are not particularly interested in the inputs themselves even though we are employing inverse methods. We are instead interested in the quality of the reconstructed responses. This paper focuses on linear regression methods for mapping the relationship between the outputs X and inputs Y for response reconstruction, i.e., Xβ = Y. The core contribution of this paper is the proposed method of extending the capabilities of said linear regression methods by introducing overlaps and merging them using averaging in a process called AntiDiagonal Averaging (ADA), encapsulated in Equation (12). We show that ADA is closely related to FIR methods. We benchmark ADA in terms of its response reconstruction ability as well as its performance against the related FIR methods. We focus on Tikhonov regularisation with cross validation through the use of Ridge Regression (RR) to regularise the inversion of the system. Any suitable linear regression method can be employed with ADA; however, RR is needed for the FIR methods we cover.
We first give a brief overview of RR and how it enforces regularisation. The theory behind ADA is then introduced and compared against related FIR methods. The design of the investigation is then given with an overview of the numerical quarter car model, with which the reconstruction methods are benchmarked. The results of the benchmarks are then discussed. Finally, an illustrative comparison of the different regression methods is conducted showing the performance of the regression method on a challenging response reconstruction problem.

Ridge Regression
As opposed to discretely truncating the singular values, RR instead smoothly decays the singular values through the use of a regularisation matrix $\Gamma$, which results in the solution

$$\hat{\beta} = \left(X^{T} X + \Gamma^{T} \Gamma\right)^{-1} X^{T} Y, \tag{2}$$

where $\Gamma$ is typically chosen as a scaling of the identity matrix through the use of the regularisation constant $\alpha$, i.e., $\Gamma = \alpha I$. RR has the solution in terms of the SVD of $X$ [13]

$$\hat{\beta} = V_x D U_x^{T} Y, \tag{3}$$

where the entries of the diagonal matrix $D$ are given by

$$D_{ii} = \frac{s_i}{s_i^{2} + \alpha^{2}}. \tag{4}$$

$U_x$ and $V_x$ are the left and right singular vectors of $X$, respectively, with the corresponding singular values $s_i$.
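The SVD form of the RR solution can be checked numerically. The following is a minimal sketch, assuming the convention $\Gamma = \alpha I$ so that the penalty enters the normal equations as $\alpha^2 I$; the function name `ridge_svd` and the random test data are illustrative assumptions, not part of the original study.

```python
import numpy as np

def ridge_svd(X, Y, alpha):
    """Ridge solution beta = V D U^T Y with D_ii = s_i / (s_i^2 + alpha^2)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    d = s / (s**2 + alpha**2)            # smoothly damped inverse singular values
    return Vt.T @ (d[:, None] * (U.T @ Y))

# Illustrative random data (not from the paper)
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 4))
Y = rng.standard_normal((50, 2))
alpha = 0.5

beta_svd = ridge_svd(X, Y, alpha)
# Same solution from the normal equations with Gamma = alpha * I
beta_ne = np.linalg.solve(X.T @ X + alpha**2 * np.eye(4), X.T @ Y)
```

Computing the SVD once and reusing it for many values of $\alpha$ is what makes a grid search over the regularisation constant cheap.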

AntiDiagonal Averaging (ADA)
Windowing methods are needed to represent the responses $Z$ as the predictor matrix $X \in \mathbb{R}^{n \times p}$ and the inputs $U$ as the target matrix $Y \in \mathbb{R}^{n \times r}$ in any suitable linear regression method. This is achieved by windowing said signals and treating each window as an observation. The original input and response measurements are given by $U \in \mathbb{R}^{m \times q}$ and $Z \in \mathbb{R}^{m \times o}$, where $m$ is the original sequence length in samples and $q$ and $o$ are the number of actuator and sensor channels, respectively.
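In ADA we introduce overlap between these observations. The overlap sample length $s_\gamma$ is defined by a proportion $\gamma$ of the proposed window sample length $s_w$, i.e.,

$$s_\gamma = \gamma s_w,$$

where $s_w$ is the window sample length given by the desired window length in seconds $T_w$ multiplied by the sampling frequency $f_s$,

$$s_w = T_w f_s.$$

The stride of the window, $s_\tau$, is then given by

$$s_\tau = s_w - s_\gamma.$$

This occurs for each time sequence for either an actuator or sensor signal, being appended column-wise, resulting in

$$p = o\, s_w, \qquad r = q\, s_w,$$

and the number of rows or observations, $n$, equal to

$$n = \left\lfloor \frac{m - s_w}{s_\tau} \right\rfloor + 1.$$

We set the amount of overlap to the extreme such that the stride is one sample, i.e., $s_\tau = 1$. This results in the following windowed target matrix $Y$ (shown for a single input channel)

$$Y = \begin{bmatrix} u(1) & u(2) & \cdots & u(s_w) \\ u(2) & u(3) & \cdots & u(s_w + 1) \\ \vdots & \vdots & & \vdots \\ u(m - s_w + 1) & u(m - s_w + 2) & \cdots & u(m) \end{bmatrix}.$$

The windowed predictor matrix $X$ takes on a similar form (not shown). We can simply average over the anti-diagonals of the predicted windowed data matrix $\hat{Y}$ to reconstruct the approximated input $\hat{U}$. To compute the average response $\hat{u}(k)$ we average all the anti-diagonal terms $\hat{Y}_{i,j}$, such that

$$\hat{u}(k) = \frac{1}{n_\mathrm{diag}} \sum_{i + j = k + 1} \hat{Y}_{i,j},$$

where the sum runs over all entries for which $i + j = k + 1$ and $n_\mathrm{diag}$ is the number of elements in the anti-diagonal. This process is known as Hankelization, which is the same process followed in Singular Spectral Analysis (SSA) [14]. The corresponding windowed matrix is referred to as the trajectory matrix, and constructing it is known as the embedding step in SSA. In SSA, the windowed matrix is then decomposed using SVD. In this case, we are merely borrowing the ADA concept from SSA for the regression problem, whereas SSA typically uses this process for autoregressive models. An example signal with $m = 7$ samples and window length $s_w = 3$, windowed with ADA, results in the following equation

$$\begin{bmatrix} z(1) & z(2) & z(3) \\ z(2) & z(3) & z(4) \\ z(3) & z(4) & z(5) \\ z(4) & z(5) & z(6) \\ z(5) & z(6) & z(7) \end{bmatrix} \begin{bmatrix} \beta_{1,1} & \beta_{1,2} & \beta_{1,3} \\ \beta_{2,1} & \beta_{2,2} & \beta_{2,3} \\ \beta_{3,1} & \beta_{3,2} & \beta_{3,3} \end{bmatrix} = \begin{bmatrix} u(1) & u(2) & u(3) \\ u(2) & u(3) & u(4) \\ u(3) & u(4) & u(5) \\ u(4) & u(5) & u(6) \\ u(5) & u(6) & u(7) \end{bmatrix}.$$

Here $z$ is the response signal used to predict the inputs $u$. The linear coefficients $\beta$ are computed using any suitable linear regression method.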
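The windowing and averaging steps described above can be sketched for a single channel as follows. This is an illustrative sketch only: the function names are hypothetical, zero-based array indices replace the one-based sample indices of the text, and the stride is fixed at one sample.

```python
import numpy as np

def window_stride_one(signal, s_w):
    """Stack every length-s_w window of a 1-D signal as a row (stride of one sample)."""
    n = len(signal) - s_w + 1                      # number of observations
    idx = np.arange(n)[:, None] + np.arange(s_w)[None, :]
    return signal[idx]

def antidiagonal_average(Y_hat):
    """Average all entries Y_hat[i, j] that share the same i + j to recover a 1-D signal."""
    n, s_w = Y_hat.shape
    m = n + s_w - 1                                # length of the merged signal
    idx = (np.arange(n)[:, None] + np.arange(s_w)[None, :]).ravel()
    total = np.bincount(idx, weights=Y_hat.ravel(), minlength=m)
    count = np.bincount(idx, minlength=m)          # n_diag for each sample
    return total / count

u = np.arange(1.0, 8.0)           # m = 7 samples, as in the example above
Y = window_stride_one(u, 3)       # 5 x 3 windowed (Hankel) matrix
u_back = antidiagonal_average(Y)  # merging the exact windows returns the signal
```

When the windows are exact copies of the signal the averaging is lossless; the averaging only has an effect once the rows of the windowed matrix are regression predictions that disagree with one another.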
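To gain insight into the workings of ADA we can write out the set of equations that infer $\hat{u}(3)$, i.e.,

$$\begin{aligned} \hat{Y}_{1,3} &= z(1)\beta_{1,3} + z(2)\beta_{2,3} + z(3)\beta_{3,3}, \\ \hat{Y}_{2,2} &= z(2)\beta_{1,2} + z(3)\beta_{2,2} + z(4)\beta_{3,2}, \\ \hat{Y}_{3,1} &= z(3)\beta_{1,1} + z(4)\beta_{2,1} + z(5)\beta_{3,1}. \end{aligned}$$

We can then average over all the $\hat{u}(3)$ predictions to obtain the final prediction of $\hat{u}(3)$

$$\hat{u}(3) = \frac{1}{3}\left[ z(1)\beta_{1,3} + z(2)\left(\beta_{2,3} + \beta_{1,2}\right) + z(3)\left(\beta_{3,3} + \beta_{2,2} + \beta_{1,1}\right) + z(4)\left(\beta_{3,2} + \beta_{2,1}\right) + z(5)\beta_{3,1} \right].$$

If we rewrite the average of the $\beta$ multiplying with a particular $z$ term as a new constant, e.g., $\beta_2 = \frac{\beta_{2,3} + \beta_{1,2}}{2}$, we obtain

$$\hat{u}(3) = \tfrac{1}{3} z(1)\beta_1 + \tfrac{2}{3} z(2)\beta_2 + \tfrac{3}{3} z(3)\beta_3 + \tfrac{2}{3} z(4)\beta_4 + \tfrac{1}{3} z(5)\beta_5.$$

Here we note that ADA emphasises the middle term, with decreasing emphasis placed on the preceding and succeeding terms. It in effect creates a triangular windowing function. If we add a corresponding weight term $w$, e.g., $w_2 = \frac{2}{3}$, we can rewrite the equation generally as

$$\hat{u}(k) = \sum_{i=1}^{2 s_w - 1} w_i\, \beta_i\, z(k - s_w + i).$$

This result demonstrates that ADA is an indirect method of creating a weighted moving average filter. In system identification this is known as a Finite Impulse Response (FIR) model. More specifically, this is an example of a non-causal weighted FIR model. The weights can be arbitrary and are a prior design choice. If we forgo the ADA method and use the weighted FIR model, we can be more creative with the weighting.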
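The equivalence between the anti-diagonal average and the triangularly weighted FIR form can be verified numerically for the $m = 7$, $s_w = 3$ example. The sketch below uses arbitrary random coefficients (an assumption for illustration; no regression is performed) and zero-based indices, so `beta[i - 1, j - 1]` stands for $\beta_{i,j}$.

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.standard_normal(7)              # response signal z(1..7)
beta = rng.standard_normal((3, 3))      # arbitrary coefficients beta_{i,j}

X = np.array([z[k:k + 3] for k in range(5)])   # windowed predictor matrix
Y_hat = X @ beta                               # windowed predictions

# ADA estimate of u(3): average of the anti-diagonal Y_hat[1,3], Y_hat[2,2], Y_hat[3,1]
u3_ada = (Y_hat[0, 2] + Y_hat[1, 1] + Y_hat[2, 0]) / 3

# Equivalent weighted FIR form with averaged coefficients beta_1..beta_5
b = np.array([
    beta[0, 2],
    (beta[1, 2] + beta[0, 1]) / 2,
    (beta[2, 2] + beta[1, 1] + beta[0, 0]) / 3,
    (beta[2, 1] + beta[1, 0]) / 2,
    beta[2, 0],
])
w = np.array([1, 2, 3, 2, 1]) / 3       # triangular weights
u3_fir = np.sum(w * b * z[:5])
```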

Finite Impulse Response (FIR) Models
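In FIR models the current output of the system is a function of past inputs such that

$$z(k) = \sum_{i=0}^{n_b} b_i\, u(k - i).$$

This is in contrast to other models such as Autoregressive eXogenous (ARX), which includes output feedback as well, i.e.,

$$z(k) = \sum_{i=0}^{n_b} b_i\, u(k - i) - \sum_{j=1}^{n_a} a_j\, z(k - j).$$

This paper focuses on non-causal inverse implementations of FIR models, where the current input is a function of both past and future outputs, written as

$$u(k) = \sum_{i=-s_w/2}^{s_w/2} \beta_i\, z(k + i).$$

By using the FIR model, the predictor matrix $X$ takes on the form

$$X = \begin{bmatrix} z(1) & z(2) & \cdots & z(s_w + 1) \\ z(2) & z(3) & \cdots & z(s_w + 2) \\ \vdots & \vdots & & \vdots \\ z(m - s_w) & z(m - s_w + 1) & \cdots & z(m) \end{bmatrix},$$

with the corresponding target matrix $Y$ written as

$$Y = \begin{bmatrix} u(s_w/2 + 1) & u(s_w/2 + 2) & \cdots & u(m - s_w/2) \end{bmatrix}^{T}.$$

It is worth noting that we lose the first and last $s_w/2$ samples of the target matrix $Y$ since we shifted the inputs to make the system non-causal.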
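A minimal single-channel sketch of how the non-causal FIR predictor and target matrices can be assembled is given below. The function name and the `half` argument (the number of samples on either side of the centre, i.e., $s_w/2$) are assumptions made for illustration.

```python
import numpy as np

def noncausal_fir_matrices(z, u, half):
    """Build X (rows z(k-half) .. z(k+half)) and Y (u(k)) for a single-channel fit."""
    m = len(z)
    centres = np.arange(half, m - half)    # samples with a full window on both sides
    lags = np.arange(-half, half + 1)
    X = z[centres[:, None] + lags[None, :]]
    Y = u[centres]                         # first and last `half` samples are lost
    return X, Y

z = np.arange(10.0)       # toy response signal
u = 2.0 * z               # toy input signal
X, Y = noncausal_fir_matrices(z, u, half=2)
```

Each row of `X` predicts a single input sample, in contrast to ADA where each row predicts a whole window of input samples.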
The lack of feedback means that FIR methods are inherently stable. This is suitable and sometimes sought after if the system under consideration is stable. However, if the system is unstable, it will only approximate the instability for a short period before diverging [15]. FIR models come with the cost of needing significantly more terms than what output feedback models need to map the same system [15]. A similar approach to ADA can be achieved through the use of FIR models combined with Tikhonov regularisation. Using Tikhonov regularisation, the β coefficients can be penalised and thus shaped by choice of the Γ matrix in Equation (2). To this end three options for the Γ matrix are implemented in this paper, namely: Finite Impulse Response with Triangular Weighting (FIR-T), Finite Impulse Response with Difference Smoothing and Triangular Weighting (FIR-DT) and Finite Impulse Response with Ridge Regression (FIR-RR).
In FIR-T, the coefficients relating to the outputs further away from the required input (both forwards and backwards in time) are penalised. This is achieved by setting $\Gamma = \alpha W$, where $W$ is an inverted triangular set of penalty weights and $\alpha$ scales the amount of regularisation we wish to impose. This should ideally mimic the weighting function achieved by ADA in Equation (20). FIR-DT further modifies the triangular weighting matrix through the use of a first difference matrix $A$. The first difference matrix ensures that the difference between each successive $\beta$ coefficient is small [16]. The difference matrix is then combined with the weighting matrix $W$ to obtain the final form of the regularisation matrix. This weighting scheme was initially developed and implemented for a causal FIR system where the penalty weights increased linearly further back in time [16]. Finally, the last choice of penalty matrix $\Gamma$ is that of FIR-RR, i.e., $\Gamma = \alpha I$, where we only limit the magnitude of the coefficients. FIR-RR acts as a reference, which enables us to determine whether shaping the regularisation of $\beta$ contributes to the accuracy of the response reconstruction.
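The three penalty choices can be illustrated with a general Tikhonov solve. The sketch below is built on assumptions: the exact entries of $W$, the shape of $A$ and the order in which $A$ and $W$ are combined are not recoverable from the text, so a diagonal $W$ with linearly increasing weights and the product $\alpha A W$ are hypothetical stand-ins.

```python
import numpy as np

def tikhonov(X, Y, Gamma):
    """Solve min ||X b - Y||^2 + ||Gamma b||^2 via the normal equations."""
    return np.linalg.solve(X.T @ X + Gamma.T @ Gamma, X.T @ Y)

p = 5                                    # number of FIR coefficients (odd, centred)
lags = np.arange(p) - p // 2
W = np.diag(1.0 + np.abs(lags))          # hypothetical inverted-triangle penalty weights
A = np.diff(np.eye(p), axis=0)           # first-difference operator, shape (p - 1, p)
alpha = 0.1

Gamma_T = alpha * W                      # FIR-T style penalty
Gamma_DT = alpha * (A @ W)               # FIR-DT style penalty (ordering assumed)
Gamma_RR = alpha * np.eye(p)             # FIR-RR penalty

# Illustrative random data (not from the paper)
rng = np.random.default_rng(2)
X = rng.standard_normal((40, p))
Y = rng.standard_normal(40)
betas = {name: tikhonov(X, Y, G)
         for name, G in [("FIR-T", Gamma_T), ("FIR-DT", Gamma_DT), ("FIR-RR", Gamma_RR)]}
```

Under this assumed form the penalty is smallest at the centre lag and grows towards the window edges, which is the opposite shape to the emphasis ADA places on the centre term.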

Method
This section describes the general experimental design procedure for the numerical investigations.

Numerical Quarter Car Model
A simple two-degree-of-freedom nonlinear mass-spring-damper system, representing a quarter car model, is used to investigate the methods explored in this paper. The numerical model employed is shown schematically in Figure 3. The sprung mass $M_A$ and unsprung mass $M_R$ represent the mass of the vehicle's body and the suspension-tyre system, respectively. These bodies are connected by springs and dampers, which represent the dynamics of the suspension system. The unsprung mass is then connected to the road via a spring characterising the tyre stiffness. The system is excited by a road profile $u_\mathrm{road}$.

Figure 3. Two degree-of-freedom mass-spring-damper representation of the nonlinear quarter car model.
The system behaves according to the following equations of motion:

$$\begin{aligned} M_A \ddot{z}_A &= -f_A - b_A\left(\dot{z}_A - \dot{z}_R\right), \\ M_R \ddot{z}_R &= f_A + b_A\left(\dot{z}_A - \dot{z}_R\right) - k_R\left(z_R - u_\mathrm{road}\right), \end{aligned}$$

where the $k$ and $b$ terms are the stiffness and damping coefficients, respectively. The nonlinearity is introduced by having cubic stiffening of the sprung mass spring controlled by the $k_\mathrm{NL}$ term. The sprung mass spring force, $f_A$, is given by

$$f_A = k_A \Delta_z + k_\mathrm{NL} \Delta_z^{3},$$

where we define a new state of the system representing the deflection of the spring, $\Delta_z$, such that

$$\Delta_z = z_A - z_R.$$

The $k_\mathrm{NL}$ term can be varied to change the severity of the system's nonlinearity or set to zero for linear behaviour. A hardening spring is modelled by choosing $k_\mathrm{NL} > 0$. This results in a spring that becomes stiffer as it undergoes compression or tension. Likewise, a softening spring can be implemented by choosing $k_\mathrm{NL} < 0$. In this study the linear component will always be restorative such that $k_A > 0$. The default parameters chosen for the numerical quarter car are given in Table 1.
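A quarter car of the kind described above can be simulated with a few lines of fixed-step integration. This is a sketch under stated assumptions: the equations of motion follow the verbal description (a cubic suspension spring with a linear damper between the masses, and a linear tyre spring to the road), the symbols `b_A` and `k_R` are assumed names, and all parameter values are hypothetical placeholders rather than the values of Table 1.

```python
import numpy as np

# Hypothetical parameter values for illustration only (not the paper's Table 1)
P = dict(M_A=300.0, M_R=40.0, k_A=2.0e4, k_NL=1.0e6, b_A=1.5e3, k_R=2.0e5)

def deriv(x, u_road, p):
    """State x = [z_A, z_R, dz_A, dz_R]; returns dx/dt for road displacement u_road."""
    z_A, z_R, v_A, v_R = x
    delta = z_A - z_R                                   # spring deflection
    f_A = p["k_A"] * delta + p["k_NL"] * delta**3       # cubic stiffening spring
    f_damp = p["b_A"] * (v_A - v_R)
    a_A = (-f_A - f_damp) / p["M_A"]
    a_R = (f_A + f_damp - p["k_R"] * (z_R - u_road)) / p["M_R"]
    return np.array([v_A, v_R, a_A, a_R])

def simulate(u_road, dt, p=P):
    """Fixed-step RK4 from rest, with the road input held constant over each step."""
    x = np.zeros(4)
    out = np.zeros((len(u_road), 4))
    for k, u in enumerate(u_road):
        out[k] = x
        k1 = deriv(x, u, p)
        k2 = deriv(x + 0.5 * dt * k1, u, p)
        k3 = deriv(x + 0.5 * dt * k2, u, p)
        k4 = deriv(x + dt * k3, u, p)
        x = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return out

dt = 1e-3
step = np.full(5000, 0.01)        # 10 mm road step held for 5 s
states = simulate(step, dt)       # columns: z_A, z_R, dz_A, dz_R
```

Setting `k_NL` to zero in the parameter dictionary gives the linear configuration.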

Choice of Excitation Signals
Before we can begin building a direct inverse model of our plant we need informative data since the excitation signal's quality places an upper bound on the accuracy of any subsequent model that we wish to build [15]. For response reconstruction, we can design the signals on which we want to train. There are two possible methods of designing excitation signals: model-free and model-based methods. In model-based methods subsequent excitation signals are chosen to improve the accuracy of the model [17]. In model-free methods we design an excitation signal that offers the best distributed coverage of the operating condition. Initially we have little prior knowledge of the system and of the real world input signals; therefore, we need to employ model-free methods. We assume we have some prior knowledge of the range of the operating condition. A suitable choice is the Amplitude Modulated Pseudo Random Binary Signal (APRBS).

Amplitude Modulated Pseudo Random Binary Signal
Since we are working with nonlinear structural systems, we know that the system responses are functions of the input frequencies and the amplitude at which we excite the system. Therefore a signal that covers the necessary frequencies and the expected amplitude range of operations is required. The APRBS attempts to cover the amplitude operating conditions with a series of steps that are fairly well distributed over the input range. An example of an APRBS is shown in Figure 4. To specify the profile, a set of $N$ design points $d_n$ are chosen to define the steps' amplitudes. The design points are sampled from the desired range $[u_\mathrm{min}, u_\mathrm{max}]$ using Latin Hypercube Sampling (LHS). LHS splits the design space into $N$ intervals with one design point placed randomly in each interval. LHS then iteratively optimises the design points such that each design point is the maximum distance away from its neighbours. This provides a random but equally spread set of design points. Since no physical system can achieve the instantaneous change in displacement required for a true step input, the step is instead approximated by a ramp function. The slope of the ramp is determined by the maximum allowed velocity $v_\mathrm{max}$ that can safely or accurately be performed by the actuator. The ramp's slope affects the frequency content of the signals, with higher velocities resulting in higher frequencies being excited [18]. The length of the step is then specified by the hold time $T_h$. Since the testing time is limited, the maximum number of steps that best cover the input space in the shortest time is sought. The hold time $T_h$ must be small enough to fit in as many steps as possible but long enough that the steps actively excite the system at that point. The hold time $T_h$ is typically set to be at least the length of the largest time constant $T_{c,\mathrm{max}}$ of the system [15]. This can be determined with a simple step test of the system if no prior knowledge is available.
The parameters used for the investigations are given in Table 2.

Table 2. APRBS parameters used to generate the training and validation signals used in the non-overlapping windows numerical investigation.
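A ramp-limited APRBS of the kind described above can be sketched as follows. Several simplifications are assumed: the design points are drawn with plain stratified sampling (one random point per interval) without the iterative distance optimisation, the hold time is counted after each ramp completes, the signal starts from zero, and the function name and argument values are hypothetical.

```python
import numpy as np

def aprbs(n_steps, u_min, u_max, v_max, t_hold, fs, seed=0):
    """Ramp-limited APRBS: one stratified random level per step, each held for t_hold seconds."""
    rng = np.random.default_rng(seed)
    edges = np.linspace(u_min, u_max, n_steps + 1)
    levels = rng.uniform(edges[:-1], edges[1:])     # one design point per interval
    rng.shuffle(levels)                             # visit the levels in random order
    max_delta = v_max / fs                          # largest change allowed per sample
    n_hold = int(round(t_hold * fs))
    signal, current = [], 0.0
    for target in levels:
        # ramp towards the target without exceeding v_max
        while abs(target - current) > max_delta:
            current += np.sign(target - current) * max_delta
            signal.append(current)
        current = target
        signal.extend([current] * n_hold)           # hold the level
    return np.array(signal), levels

# Illustrative values only (not the paper's Table 2)
u, levels = aprbs(n_steps=10, u_min=-0.05, u_max=0.05, v_max=0.5, t_hold=0.5, fs=200)
```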

Road Profile
The ISO 8608 standard [19] for specifying road profiles is used to generate a separate test set to determine how well the direct inverse model performs on unseen data. The ISO 8608 standard defines inputs that are distinct from APRBS while still being representative of real-world operating conditions. The profiles are characterised by the standard in the frequency domain, where the spectral density $S_z$ is given by

$$S_z(\varphi) = A\, \varphi^{-n}$$

for the given spatial frequency $\varphi$ with units m$^{-1}$. The $A$ term represents the road's roughness coefficient, whereas $n$ represents the road index of the profile. The $A$ coefficient controls how large the amplitudes are at each frequency, whereas $n$ controls how quickly the amplitudes decay as a function of frequency. Varying types of profiles, from ploughed agricultural land to smooth gravel highways, can be produced by altering these two coefficients. The spatial frequencies $\varphi$ are limited between 0.5 and 10 m$^{-1}$. Frequencies below the lower limit represent the broad changes in the landscape, which have negligible effects on vehicle dynamics. In contrast, the upper limit on the frequency represents small variations which are filtered out by the tyre [20]. When generating the profiles only the amplitude information is given by the ISO 8608 standard; therefore, in order to generate time signals, a uniformly random signal is generated for the phase, with spatial frequencies sampled at discrete intervals. This generates a displacement signal as a function of distance. The vehicle's velocity must be chosen to generate a displacement signal as a function of time. The parameters of the road profile used are given in Table 3.

Table 3. Road profile parameters used to generate the test signal used in the non-overlapping windows numerical investigation.
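The generation of a road profile from a power-law spectral density with random phase can be sketched as a sum of sinusoids. The sketch assumes the form $S_z(\varphi) = A\varphi^{-n}$, amplitudes of $\sqrt{2 S_z \Delta\varphi}$ per discrete frequency, and hypothetical values for $A$, $n$, the vehicle speed and the sampling rate (they are not the values of Table 3).

```python
import numpy as np

def road_profile(A, n, phi_min, phi_max, n_freq, speed, duration, fs, seed=0):
    """Displacement-time signal from a power-law spectral density with uniform random phase."""
    rng = np.random.default_rng(seed)
    phi = np.linspace(phi_min, phi_max, n_freq)       # spatial frequencies [1/m]
    d_phi = phi[1] - phi[0]
    amp = np.sqrt(2.0 * A * phi**(-n) * d_phi)        # amplitude of each sinusoid
    phase = rng.uniform(0.0, 2.0 * np.pi, n_freq)     # uniformly random phase
    t = np.arange(int(duration * fs)) / fs
    x = speed * t                                     # distance travelled [m]
    z = (amp * np.cos(2.0 * np.pi * x[:, None] * phi + phase)).sum(axis=1)
    return t, z

# Illustrative values only (not the paper's Table 3)
t, z = road_profile(A=1e-6, n=2.0, phi_min=0.5, phi_max=10.0,
                    n_freq=200, speed=10.0, duration=10.0, fs=200.0)
```

Choosing a higher vehicle speed compresses the same spatial profile into a shorter time and therefore shifts its content to higher temporal frequencies.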

Preprocessing
The windowing techniques covered in this paper will truncate some of the testing and training set samples. To ensure a fair comparison between the different data sets, a dead time is appended and prepended. The dead times will be excluded when calculating the cost function during cross validation and reporting the final accuracy of the predictions.
The constant initial and final conditions also allow for different signals to be concatenated without introducing unwanted jumps.

Scaling
The windowed inputs $Y$ and windowed outputs $X$ of the system record different types of signals, which will have different variances. We may also find that constant sensor biases need to be accounted for. Therefore, the inputs and outputs are z-score normalised so that each column has a mean of zero and a variance of one [13].
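The scaling step can be sketched as a fit/apply/invert trio, so that the statistics estimated on one data set can be reused on another and undone after prediction. The function names are hypothetical, and the normalisation is assumed to act on each column of the windowed matrices.

```python
import numpy as np

def zscore_fit(M):
    """Column-wise mean and standard deviation, guarding against constant columns."""
    mu = M.mean(axis=0)
    sigma = M.std(axis=0)
    sigma = np.where(sigma == 0.0, 1.0, sigma)
    return mu, sigma

def zscore_apply(M, mu, sigma):
    return (M - mu) / sigma

def zscore_invert(M_scaled, mu, sigma):
    return M_scaled * sigma + mu

# Illustrative data with very different column scales and offsets
rng = np.random.default_rng(4)
X_train = rng.standard_normal((100, 3)) * [1.0, 10.0, 0.1] + [0.0, 5.0, -2.0]
mu, sigma = zscore_fit(X_train)
X_scaled = zscore_apply(X_train, mu, sigma)
X_restored = zscore_invert(X_scaled, mu, sigma)
```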

Cross Validation
To determine the optimal regularisation constant α for RR, cross validation is used. However, cross validation can be misleading if it is implemented without considering the correlation between observations. Suppose the validation set is removed once the data have already been windowed with overlaps. In that case, the validated set will be correlated to the training set due to the overlaps introduced. If the validation set is first removed from the middle portion of the dataset and then windowed, then care must be taken when splitting and merging the training set to ensure that no unintended overlap is introduced between the separated training segments. A simpler solution to this problem is implemented by removing a single validation set from either the beginning or end of the dataset before windowing. In this work, a validation set was created independently of the training set.

Choice of Cost Function
We have the choice of either using the errors of the approximated inputs or the approximated outputs as the cost function of the optimisation scheme. In response reconstruction, we are interested in producing an accurate output response since a unique input may not exist. The downside of this is that, to obtain the output error, the approximated input needs to be passed through the test rig. This needs to occur for every loop in the cross validation step. The numerical model is computationally efficient to evaluate. However, in the real world this would result in significant fatigue of the experimental rig and would take considerable time to run. Therefore, it is necessary to limit the number of forward evaluations in the cross validation step. In evaluating these methods for response reconstruction, the output error is used during cross validation. Since we need to measure and compare response and input reconstruction accuracies across different types of signals, we need a normalised measure of error. The Mean Fit Function Error (MFFE) [21] is used to report the final test accuracies of the reconstructed input and output signals. MFFE is defined as

$$\mathrm{MFFE} = \frac{\lVert e_0 \rVert_2}{\lVert z_0 \rVert_2} \times 100\%,$$

where $e_0$ is the error between the true output $z_0$ and the approximate output $\hat{z}_0$, i.e.,

$$e_0 = z_0 - \hat{z}_0.$$

The signals under consideration have been mean centred such that

$$z_0 = z - \bar{z}.$$
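A normalised error of this kind can be sketched as below. The exact definition in [21] is not reproduced here; the sketch assumes the error norm is divided by the norm of the mean-centred true signal, expressed as a percentage, and averaged over channels when several are present. The function name is hypothetical.

```python
import numpy as np

def mffe(z_true, z_hat):
    """Percentage error norm relative to the mean-centred true signal, averaged over channels."""
    z_true = np.asarray(z_true, dtype=float)
    z_hat = np.asarray(z_hat, dtype=float)
    z_true = z_true.reshape(len(z_true), -1)       # samples along rows, channels along columns
    z_hat = z_hat.reshape(len(z_hat), -1)
    e0 = z_true - z_hat
    z0 = z_true - z_true.mean(axis=0)              # mean centring
    per_channel = np.linalg.norm(e0, axis=0) / np.linalg.norm(z0, axis=0)
    return 100.0 * per_channel.mean()

t = np.linspace(0.0, 1.0, 500)
z = np.sin(2.0 * np.pi * 5.0 * t) + 0.3
score_perfect = mffe(z, z)                         # exact reconstruction
score_mean = mffe(z, np.full_like(z, z.mean()))    # predicting only the mean
```

Under this assumed definition an exact reconstruction scores 0% and a prediction that only returns the signal mean scores 100%.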

Training Procedure
The cross validation algorithm consists of two subroutines: an outer routine that performs the windowing parameter grid search for the optimal window length $T_w$ and an inner subroutine which optimises the regularisation constant $\alpha$.

Window Loop
A graphical overview of the training process is shown in Figure 5 with focus on the window parameter search. The window optimisation loops over the window length, $T_{w,i}$, where $i$ represents the $i$th iteration of the loop. The training set $U_\mathrm{train}$ and $Z_\mathrm{train}$ as well as the validation output $Z_\mathrm{val}$ are then windowed accordingly. The z-score parameters, $\sigma_i$ and $\mu_i$, are then calculated using only the training dataset and applied to both the training and validation set. The training set is then decomposed using SVD according to the regression method specified, in this case RR. The decomposed SVD is then passed to the regularisation optimisation loop.

Figure 6 depicts a graphical overview of the regularisation constant optimisation. The regression coefficients $\beta_k$ are then calculated and weighted with $\alpha_k$, where $k$ is the $k$th iteration of the loop. The approximated windowed validation inputs $\hat{Y}_\mathrm{val}$ are then predicted using the windowed validation outputs $X_\mathrm{val}$. The approximated windowed validation inputs are rescaled and then merged using the specified windowing methods to obtain the approximated input $\hat{U}_\mathrm{val}$. The merged inputs are then passed through the test rig to obtain the approximated output $\hat{Z}_\mathrm{val}$. The MFFE is then calculated between the true output $Z_\mathrm{val}$ and the approximated output $\hat{Z}_\mathrm{val}$. The optimised regularisation constant $\alpha_k$ and the corresponding minimum MFFE are then returned from this loop to the windowing loop as seen in Figure 5. This minimum MFFE result is then used in the window loop to find the corresponding optimal window length $T_{w,\mathrm{min}}$.
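The nested window and regularisation loops can be sketched compactly for a single channel. Everything below is illustrative: the `rig` function is a hypothetical stand-in for the physical test rig (a short smoothing filter), the window lengths are given directly in samples, the dead-time exclusion is omitted, and the error measure is the assumed normalised form used for MFFE.

```python
import numpy as np

def window(sig, s_w):
    idx = np.arange(len(sig) - s_w + 1)[:, None] + np.arange(s_w)[None, :]
    return sig[idx]

def ada_merge(Y_hat):
    n, s_w = Y_hat.shape
    idx = (np.arange(n)[:, None] + np.arange(s_w)[None, :]).ravel()
    return np.bincount(idx, weights=Y_hat.ravel()) / np.bincount(idx)

def mffe(z, z_hat):
    return 100.0 * np.linalg.norm(z - z_hat) / np.linalg.norm(z - z.mean())

def rig(u):
    """Hypothetical stand-in for the physical test rig: a short causal smoothing filter."""
    return np.convolve(u, [0.5, 0.3, 0.2])[:len(u)]

def grid_search(u_tr, z_tr, z_val, windows, n_alpha=10):
    best = (np.inf, None, None)
    for s_w in windows:                                    # outer window loop
        X, Y, Xv = window(z_tr, s_w), window(u_tr, s_w), window(z_val, s_w)
        mx, sx, my, sy = X.mean(0), X.std(0), Y.mean(0), Y.std(0)   # training statistics only
        U, s, Vt = np.linalg.svd((X - mx) / sx, full_matrices=False)
        UtY = U.T @ ((Y - my) / sy)
        alphas = np.logspace(np.log10(s.min() * 1e-5), np.log10(s.max()), n_alpha)
        for alpha in alphas:                               # inner regularisation loop
            beta = Vt.T @ ((s / (s**2 + alpha**2))[:, None] * UtY)
            Yv_hat = (((Xv - mx) / sx) @ beta) * sy + my   # predict, then undo the scaling
            u_val_hat = ada_merge(Yv_hat)                  # merge overlapping windows
            err = mffe(z_val, rig(u_val_hat))              # forward pass through the rig
            if err < best[0]:
                best = (err, s_w, alpha)
    return best

rng = np.random.default_rng(3)
u_tr, u_val = rng.standard_normal(400), rng.standard_normal(200)
best_err, best_sw, best_alpha = grid_search(u_tr, rig(u_tr), rig(u_val), windows=[4, 8, 16])
```

The SVD is computed once per window length and reused for every candidate $\alpha$, so the cost of the inner loop is dominated by the forward pass, which on a physical rig is the step that should be called as sparingly as possible.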

Final Training Step
In the final training step, the training set is concatenated with the validation set. This newly combined set is then windowed with the optimised window length $T_{w,\mathrm{min}}$. The new z-score parameters $[\sigma, \mu]$ are then calculated. The combined set is decomposed and used in the regression step with the optimised regularisation constant $\alpha_\mathrm{min}$ to determine the final regression coefficients $\beta_\mathrm{final}$.

Prediction
A graphical overview of the prediction step and the approximation of the output is shown in Figure 7. Once the training step is complete, it is relatively straightforward to use the optimised parameters to make further predictions. The test output signal $Z_\mathrm{test}$ needs to be preprocessed before predictions can be made. To obtain the predictor matrix, $X_\mathrm{test}$, the test signal is windowed and z-score normalised using the parameters determined during the training phase. The prediction step then uses the regression coefficients $\beta_\mathrm{final}$ obtained during training to obtain the approximate target matrix, $\hat{Y}_\mathrm{test}$. The z-score normalisation and windowing are then reversed before passing the approximated input $\hat{U}_\mathrm{test}$ into the test rig to obtain the approximated output, $\hat{Z}_\mathrm{test}$.

Comparison against Finite Impulse Response (FIR) Models
This section aims to benchmark ADA against FIR in terms of response reconstruction since ADA can be seen as a subset of FIR. The idea behind this benchmark is to ensure that ADA is not an indirect method of achieving an FIR implementation. If so, it needs to be determined whether ADA offers any substantial benefits over using FIR.

Finite Impulse Response (FIR) Comparison Procedure
The three different regularised FIR implementations will be compared against ADA combined with RR for two test cases: the linear system and the default nonlinear system. The experiment will be performed with a system configuration more representative of a typical test rig, in which the sprung mass acceleration and the spring displacement (i.e., the relative displacement between the sprung and unsprung masses) of the quarter car are used as the responses. The inputs and responses will be sampled at 250 Hz and 350 Hz for the linear and nonlinear systems, respectively. The window lengths will be determined via grid search cross validation, with the window lengths sampled from T_w ∈ [0.1, 12] s on a grid of 50 equally spaced intervals. The candidate α values used to regularise RR will be spaced equally on a log scale within the range α ∈ [s_min × 10^−5, s_max], where s denotes the singular values. Thirty equally spaced divisions will be used. An overview of the numerical experiment parameters is given in Table 4.
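The two search grids described above can be generated as in the following sketch. It assumes that "50 equally spaced intervals" and "thirty equally spaced divisions" mean 50 and 30 grid points respectively, that the singular values are those of the windowed predictor matrix, and that this matrix has full rank so that s_min > 0. The function names are hypothetical.

```python
import numpy as np

def window_length_grid(t_min=0.1, t_max=12.0, n=50):
    """Candidate window lengths in seconds, equally spaced."""
    return np.linspace(t_min, t_max, n)

def window_samples(t_w, fs):
    """Convert a window length in seconds to a sample count at rate fs (Hz)."""
    return int(round(t_w * fs))

def alpha_grid(X, n=30, floor=1e-5):
    """Candidate regularisation constants, equally spaced on a log scale
    between s_min * floor and s_max (singular values of X)."""
    s = np.linalg.svd(X, compute_uv=False)
    return np.geomspace(s.min() * floor, s.max(), n)

# Usage with the sampling rates quoted for the two systems.
lengths = window_length_grid()
n_linear = [window_samples(t, 250.0) for t in lengths]
n_nonlinear = [window_samples(t, 350.0) for t in lengths]
alphas = alpha_grid(np.random.default_rng(0).normal(size=(100, 8)))
```

Tying the α range to the singular values makes the grid scale with the data, so the same number of candidates spans the transition from negligible to dominant regularisation for every window length.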

FIR Comparison Numerical Results
The reconstructed inputs and outputs for the linear and nonlinear systems are shown in Figures 8 and 9, respectively. The response reconstruction results for the linear and nonlinear systems are shown in Table 5. We treat FIR-RR as the bare minimum regression method since it does not impose a prior choice on the shape or smoothness of the β parameters.
For the linear case, it appears that the imposed smoothness offered by FIR-DT does not contribute any significant improvement and actually hinders the reconstruction performance. If we refer to the optimised hyper-parameters for the numerical experiment in Table 6, we see that FIR-DT used a small amount of regularisation, which further indicates the poor suitability of the methodology to the problem. We note that the triangular weighting offered by FIR-T performs similarly to FIR-RR, which suggests that the shape of the β parameters is not as important for the linear case. However, ADA still performs an order of magnitude better in terms of the recreated output MFFE scores. This suggests, at least for the linear case, that the ADA performance is not necessarily due to the shape factor or to imposed smoothing of each successive β parameter.
For the nonlinear case, we note that FIR-T obtains the worst recreated output score, with the default regression method, FIR-RR, performing significantly better. This suggests that the triangular weighting is ill-suited to the nonlinear case. The introduction of difference smoothing in the form of FIR-DT is an improvement over FIR-RR, which suggests that smoothing the β parameters improves the FIR regression methods' performance. If we refer to the optimised hyper-parameters in Table 7, we note that ADA implemented a low amount of regularisation for the nonlinear case. This indicates that the averaging used in ADA adds an extra form of regularisation, since it performs an order of magnitude better than the other regression methods for the nonlinear case without relying on a large regularisation constant. This is corroborated by the fact that ADA outperforms the other regression methods both when smoothing is better suited (nonlinear case) and when imposing a shape is better suited (linear case), which suggests that the averaging inherent to ADA is the key factor in its performance on the problem at hand.
In general, we note that the MFFE results for the recreated outputs of the system (for both the linear and nonlinear case) are lower than their associated input MFFE results. This indicates the non-uniqueness of the inputs for the given response since a seemingly poor input can result in an accurate output. This justifies the need to incorporate the forward pass through the system to determine the suitability of the input by judging it by its associated recreated output.
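The non-uniqueness noted above can be illustrated with a toy example that is unrelated to the quarter car model. The sketch assumes a hypothetical two-tap moving-average system, which cannot pass a component that alternates in sign every sample, and uses a plain relative error in percent rather than the MFFE.

```python
import numpy as np

h = np.array([0.5, 0.5])  # hypothetical two-tap moving-average system

def respond(u):
    """Output of the toy system for input u (fully overlapped samples only)."""
    return np.convolve(u, h, mode="valid")

def rel_error(a, b):
    """Relative error between two signals, in percent."""
    return 100.0 * np.linalg.norm(a - b) / np.linalg.norm(a)

n = np.arange(64)
u_a = np.sin(2.0 * np.pi * n / 16.0)   # reference input
u_b = u_a + 0.5 * (-1.0) ** n          # adds a component the system nulls

input_err = rel_error(u_a, u_b)                     # large
output_err = rel_error(respond(u_a), respond(u_b))  # numerically zero
print(f"input error {input_err:.1f} %, output error {output_err:.2e} %")
```

Judged by its input error alone, u_b would be rejected, yet it reproduces the response of u_a to within rounding error, which is the situation that motivates scoring candidate inputs by their recreated outputs.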

Illustrative Use Case
In this section, we create a scenario in which all the challenges to response reconstruction are introduced, namely noise, model mismatch and nonlinearity. In this experiment we focus on a narrower scope of model mismatch, whereby the model parameters of the system are simply scaled from the real-world environment to the laboratory environment. A broader scope of model mismatch would be to add new dynamics when going from one environment to the other, for example by adding or removing a discontinuity such as a tyre separating from the road surface. The default parameters given in Table 1 are scaled by the model mismatch percentage, m_%.
The investigation is not exhaustive but is intended to give an illustrative sense of the regression methods' performance on a challenging response reconstruction problem. To this end, the numerical experiment will be performed with the FIR and ADA regression methods with noise, model mismatch and nonlinearity implemented. The level of noise in this investigation is defined in percentage terms, η_%, of the standard deviation of each channel o of the outputs z. The noise is assumed to be Gaussian with zero mean, so that its standard deviation on each channel is η_% of the standard deviation of that channel. The noise η_% will be set to 5%, the model mismatch m_% to 10% and the nonlinearity term k_NL to 1.28 × 10^7 N m^−3. In the case of model mismatch, the validation response set will come from a field recording instead of a laboratory recording. The idea behind this is to force the cross validation to retain only latent variables that allow the laboratory environment to recreate dynamics that are common to both the real-world and the laboratory environment. An overview of the numerical procedure is given in Table 8.

Illustrative Use Case Numerical Results
The response reconstruction results are shown in Table 9 with the corresponding reconstructed inputs and outputs shown in Figure 10.
Referring to the results in Table 9, we see that ADA and FIR-DT perform similarly well for the reconstructed test results. Their scores are close enough to each other that the difference probably falls within the uncertainty introduced by noise. We see that FIR-T performs poorly for the problem at hand. This follows the general trend of reconstruction performance found in the previous nonlinear benchmark. The optimised hyper-parameters for this numerical experiment are shown in Table 7. Here we note that the different regression methods use similar window lengths, save for FIR-T, which used a significantly shorter window length.
Here we also note that the regularisation constants α are larger than those found in Table 6, which is to be expected since more regularisation is needed due to the introduced model mismatch as well as the added output noise.
Table 9. MFFE (%) scores for the approximated input and output signals using different FIR methods for an illustrative use case. Best performing results shown in bold.
Figure 10. Comparison of recreated input and output results using FIR methods against ADA for an illustrative use case. Panel (c): reconstructed spring displacement.

Conclusions
By introducing the overlapping windows inherent to the ADA implementation and by focusing on the recreated outputs of the system, we overcome the asymmetry introduced by the inverse nature of response reconstruction. In summary, it is shown that ADA combined with an appropriate linear regression is a suitable black-box method of reconstructing responses in dynamic systems. It has wide application in response reconstruction in that it can be readily applied to practical sensor configurations as well as nonlinear systems. We compared the performance of ADA to related FIR regression methods. Although the experiments were not exhaustive, the results indicate that ADA outperforms the related FIR methods in response reconstruction accuracy. By repeating the experiment with challenges that require better regularisation, insights were gained into how ADA may be performing regularisation. The current ADA implementation can be seen as a post-processing smoothing step that occurs after a linear regression prediction. An exciting avenue to explore would be to replace the linear regression with a nonlinear regression method such as a neural network.