Noise Reduction in the Swept Sine Identiﬁcation Procedure of Nonlinear Systems

: The Hammerstein model identiﬁcation technique based on swept sine excitation signals proved in numerous applications to be particularly effective for the deﬁnition of a model for nonlinear systems. In this paper we address the problem of the robustness of this model parameter estimation procedure in the presence of noise in the measurement step. The relationship between the different functions that enter the identiﬁcation procedure is analyzed to assess how the presence of additive noise affects model parameters estimation. This analysis allows us to propose an original technique to mitigate the effects of additive noise in order to improve the accuracy of model parameters estimation. The different aspects addressed in the paper and the technique for mitigating the effects of noise on the accuracy of parameter estimation are veriﬁed on both synthetic and experimental data acquired with an ultrasonic system. The results of both simulations and experiments on laboratory data conﬁrm the correctness of the assumptions made and the effectiveness of the proposed mitigation methodology.


Introduction
The behavior of physical systems is very often modeled using linear techniques. In reality, only in very particular cases physical systems can behave as linear, so if we want to obtain high quality results when mathematical models of physical systems are involved, it becomes of remarkable importance to have models that can represent the behavior of the system also in a non-linear regime.
The modeling of nonlinear dynamical systems is one of the most challenging research areas in the field of system representation. Much of the research developed in recent years has increasingly focused on predominantly data-driven approaches, including neural nets or fuzzy logic [1], methodologies that assume the availability of large amounts of data. The amount of available data is not always adequate to represent the complexity of the system and the computing power is sufficient to handle them. For this reason, they continue to be widely diffused in the white-box approaches and the gray-box approaches, which place side by side to a number of data-driven only, black box methods: Such techniques integrate in different measures the information derived from the knowledge of the physics of the system and information extracted from the data produced by the nonlinear system.
The white-box approaches require an accurate knowledge of the system. These techniques rely, for example, on wave digital filters [2,3] or differential equations [4] to obtain an accurate model of the real system.
If one has detailed knowledge of the physical system, these models can be highly accurate. If it is not possible to adequately represent the overall physical system by equations, the gray and black-box approaches can be applied, as they require only partial knowledge of the system or only the knowledge of input and output signals. A partial knowledge of the physics of the system allows to fractionate it into connected subsystems, each representable in a simpler way by adopting a gray-box approach [5,6]. Among the black-box approaches, the methods derived from the Volterra series are in a prominent position, both in terms of accuracy and historically, having been defined long before the neural network-based approaches. Modeling techniques based on this infinite series are obtained by truncating them to a finite number of terms [7]. It is however important to note that, even considering a truncation to a low order of the Volterra series, the number of coefficients required to define the relevant model quickly becomes very large. This is an important limitation, because it makes it possible in practice to define adequate models by following this path only in the case where the system shows a degree of nonlinearity of small entity. This has prompted the development of simplified expansion models of the Volterra series [8]. Among these, the Hammerstein and Wiener models are among the most popular. Having available accurate models and effective techniques for their identification has allowed the successful application of these nonlinear modeling techniques of physical systems in many engineering fields, notably including acoustics and nondestructive testing [9][10][11][12].
Among the different types of black-box nonlinear models of physical structures, the parallel Hammerstein model is particularly interesting as it can be shown that the problem of identifying its parameters reduces to a linear problem, so classical least squares methods can be used for their estimate [13]. An efficient technique was proposed in recent years for the identification of the parallel Hammerstein model [14][15][16]: Its relative simplicity of implementation, and the excellent results obtained both in laboratory experiments and in practical applications, make it one among the most competitive techniques for the identification of the Hammerstein model: A swept sine signal is input to the nonlinear system to be modelled and the corresponding output is acquired. Processing this output signal allows to identify the parameters of the model, as will be detailed in the following sections.
An aspect of this technique, which is of great importance in practical applications, but which has been only partially addressed in the technical literature, is the analysis of the effects of noise that inevitably adds to the useful signal in the acquisition step of the identification procedure. The problem has been addressed in some particular cases such as in [17], where an iterative procedure for the identification of the Hammerstein model is defined in the particular case in which the noise, added to the useful signal at the output of the system to be identified, can be modeled as a correlated noise, assumed to be the output of a digital filter with a zero mean random sequence as input. In [18], a protocol is proposed to determine the location of the process noise in a Wiener-Hammerstein system with respect to the static nonlinearity by using a periodic, nonstationary input test signal. Hypothesizing the location of the process noise is addressed as a preliminary step in order to improve the successive model identification. In [19], the hypothesis that damage causes the structure to exhibit a nonlinear response is tested, and thus the use of Nonlinear Model Based Features is shown to increase classification performance: although the study does not directly analyze the effects of measurement noise in the model identification phase, it evaluates the performance degradation of the classification procedure in the presence of noise added to the measured signals at the output of the system under study. In [20], the identification of the nonlinear model is considered in the presence of uncertainties: Under the assumption of stationarity of the system to be modeled, the effect of uncertainties is reduced by using repetitions of the input swept sine signal, which is repeated several times in order to excite the system with a pseudo periodic signal; a period synchronized with the duration of the swept sine signal is adopted to simplify the implementation. Rebillat and Schoukens in [21] compare two methodologies of identification of the Hammerstein model by evaluating, for both, specific performance indexes on systems defined as bench test, and in the presence of noise.
In the present paper we deal with the problem of the accuracy in the identification of the Hammerstein model using the identification technique based on exponential swept sine test signals. We consider the presence of noise that is added to the useful signal when measuring the response of the nonlinear system. In the paper we propose an original analysis of the link that exists between the environmental noise superimposed on the measurement result and the uncertainty in the estimation of the kernels that identify the Hammerstein model in the case that the identification technique is based on exponential swept sine signals. This original analysis, besides clarifying the connection between the presence of measurement noise and the uncertainties in the estimation of the different parameters characterizing the identification procedure, represents the basis to define a novel technique, that we propose in the paper, to mitigate the effects of the superimposed noise on the quality of the estimation of the parameters that identify the model. The technique we propose is based on the use of lowpass filters, each designed to operate on one of the kernels, characterizing the different branches of the Hammerstein model.
The different aspects of the uncertainty reduction procedure are verified both on synthetic data and in an experimental case. The latter is an ultrasonic measurement setup with transducers operating in air, chosen because the possible effects due to measurement noise are particularly relevant: The limited impedance matching and the attenuation due to air propagation make this an experimental setup where the signal-to-noise ratio can degrade significantly.
The paper is organized as follows: In Section 2, the theoretical aspects of the procedure are detailed. In Section 3, the experimental tests carried out on both synthetic and real devices are described. Section 4 discusses the results obtained, draws conclusions, and indicates possible evolutions of the work.

Theoretical Aspects
In this section we will briefly describe some concepts at the basis of the procedure for reducing the effects of measurement noise in the identification of the Hammerstein model based on swept sine signals: The procedure for model parameters identification is first recalled, then the effect of noise on the assessment of model kernels h i (t) is evaluated. Finally, the frequency bands characterizing the signals involved in the different steps of the identification procedure are analyzed.

Model Identification by Swept Sine Excitation
If we assume that a valid representation of the nonlinear system is the Hammerstein model, a schematic description of which is shown in Figure 1, the input-output relation of the system will be: where we indicated with the symbol ⊗ the convolution operation. The output y H (t) is the sum of the powers up to order N H of the input signal x(t), convolved with appropriate impulsive functions h i (t); such impulsive functions h i (t) are the kernels that identify the Hammerstein model. Assume the input is a swept sine, i.e., an harmonic signal x(t) = R Cos(φ(t)) of amplitude R, with instantaneous angular frequency ω(t) = dφ(t)/dt, which varies over time. If we represent vectors and matrices by square brackets, we can write for the output of the Hammerstein model of order N H : where [h(t)] is the vector of the different kernels h k (t), k = 1, . . . , N H , and [R Cos(φ(t))] k T is the transpose of the vector of powers of the input signal. By means of the Chebyshev polynomials of the first kind, we can express the harmonics Cos(k φ(t)) of the swept sine function Cos(φ(t)), as functions of the powers [R Cos(φ(t))] k of the input signal [16]: The entries of matrix [A c ] are the coefficients of Chebyshev polynomials of the first type, and [R c ] is the diagonal matrix in which the index term {i, i} is the i − th power of the amplitude R. From expression (3), we can derive the expression of [R Cos(φ(t))] k as a function of the harmonics Cos(k φ(t)) which, substituted in (2), gives: where we considered that [R c ] is a diagonal matrix, and we defined the vector [g(t)] (2) and (4) tells us that the output y H (t) of the nonlinear system, obtained when the swept sine signal x(t) = R Cos(φ(t)) is input, can be expressed either as the sum of the convolutions between the powers of the harmonic signal itself and the functions in [h(t)], or, alternatively, as the sum of the convolutions between the harmonics Cos(k φ(t)) of the signal Cos(φ(t)) and the impulsive functions [g(t)]. [R c ] depends on the amplitude R of the input signal and on the matrix [A c ], whose entries are the coefficients of the Chebyshev polynomials of the first kind.

comparison between Equations
The identification procedure of the Hammerstein model, and therefore of the functions [h(t)], is based on this correspondence between the [h(t)] and the [g(t)]: assume that the harmonic function: x(t) = R Cos(φ(t)) at the input is such to modify its instantaneous frequency by following an exponential law, i.e., assume that, if T 0 is the duration of the swept sine signal ranging between the frequencies f MI N and f MAX , the angular frequency is defined by ω(t) = dφ(t)/dt = 2 π f MI N Exp(t/L), where the constant L is defined as L = T 0 /ln( f MAX / f MI N ) and describes how quickly the frequency changes over time. It is easy to verify that the k − th harmonic of the input signal corresponds to a simple shift of the signal x(t) by the quantity ∆t k = L ln(k). In fact, we can write for the instantaneous frequency f (t): If, moreover, the input signal complies with specific constraints on its instantaneous phase, as detailed in [15], we can process the output y H (t) with the matched filter, i.e., a filter whose impulse response ψ(t) is such that its convolution with the exponential swept sine signal x(t) = R Cos(φ(t)) producesδ(t), a band-limited approximation of Dirac's delta function δ(t). The output of the matched filter ψ(t) will then be given by: where the k − th element of the vector δ (t + ∆t k ) is the band-limited approximation of the Dirac delta function shifted in time by the quantity ∆t k = L ln(k) defined above. The output u(t) of the matched filter, when the signal y H (t) is input, consists therefore of a sequence of impulsive functions g i (t) starting at the times identified by the shifts ∆t k = L ln(k) related only once the input swept sine signal is defined, to the order k of the harmonic. If the parameter L is large enough to keep the different g i (t) apart, each of the g i (t) functions can be obtained by simply extracting an appropriate section in time of the output signal u(t), starting at a time instant identified by ∆t i . The duration of the time window beginning at ∆t k depends on the length of the impulse responses h k (t) and can only be established by trial and error.
Once the [g(t)] functions have been obtained by sectioning the output u(t) of the matched filter, from the [g(t)] it is possible to derive the [h(t)] functions that identify the Hammerstein model through a simple linear transformation: Figure 2 shows graphically the processing procedure to obtain the functions [g(t)].

Noise Power on the Output y H (t) and on Functions g(t) and h(t)
We have seen that the swept sine identification procedure of the Hammerstein model starts by considering the functions g i (t), obtained by time-windowing at appropriate positions the response u(t) of the matched filter when its input is y H (t). The response y H (t) is the one that we have at the output of the nonlinear system under consideration when it has an exponential swept sine signal as input. From the functions g i (t) obtained with this procedure, through an appropriate linear transformation we can derive the functions h i (t) that characterize the Hammerstein model; the coefficients of this linear transformation are the coefficients of Chebyshev polynomials of the first type. To obtain the functions h i (t), a combination of the functions g i (t) is performed according to (7). The combination is performed through coefficients given by matrix [R c ] −1 [A c ] T . For example, for a model of order 4, we will have: Assume that additive noise, superimposed on the useful signal, affects the acquisition of the output to the nonlinear system. The ideal output y H (t) of the nonlinear system will be superimposed by the noise sequence N(t), supposed to be white Gaussian with power spectral density P N ( f ) = N 0/2. The noise-affected output will be y H N (t) = y H (t) + N(t).
According to the identification procedure, the output signal y H N (t) is filtered through the matched filter ψ(t): the noise sequence N(t) will thus be filtered giving rise, at the output of the matched filter, to the noise sequence n u (t) superimposed on the signal u(t). If we say Ψ( f ) is the Fourier transform of the impulse response of the matched filter: Ψ( f ) = F {ψ(t)}, the power of noise signal n u (t) at the output of the matched filter can be evaluated by the: We denote with σ 2 n u the variance of noise process n u (t) superimposed on u(t): all functions g i (t), are extracted from u(t) at different positions, and will be thus affected by superimposed noise signals n g i (t) whose variance will be σ 2 n u for all of them: σ 2 n g i = σ 2 n u . This common variance will be denoted in the following as σ 2 g ; furthermore, the overall noise sequence n u (t) has an impulsive-like correlation function; consequently, the noise sequences n g i (t), extracted at different positions from n u (t), will be uncorrelated with each other, as they refer to different portions of the overall noise sequence n u (t).
By following this reasoning, it is straightforward to evaluate the variance of noise sequences n h i (t) superimposed on each one of the h i (t) kernels from the variance of the noise sequences n g i (t) added to the g i (t): the noise sequences n g i (t) (uncorrelated, all with zero mean and variance σ 2 g ) are combined with each other by using the same coefficients as the g i (t); in the linear combination, the noise amplitudes n g i (t) will be altered according to the coefficients [R c ] −1 [A c ] T seen above. The noise sequences n g i (t) are not correlated with each other, so the variances of noise sequences n h i (t), denoted as σ 2 h i , will then be obtained by combining the variances σ 2 g with coefficients equal to the square of those in , where by {[.]} 2 we denote the matrix whose terms are the squares of the individual elements of the matrix [.]. If σ 2 h and σ 2 g are the vectors of σ 2 h i and σ 2 g , we have: In the example of model order N H = 4, remembering that the variances σ 2 g are all equal to each other, the variances σ 2 h i will be obtained via the linear combination:

Frequency Band of Functions g(t) and h(t): Noise Reduction on h(t) by Lowpass Filtering
Equation (10) expresses the connection between the variance σ 2 g of noise superimposed to each function g i (t) and the variance σ 2 h k of noise overlapping the functions h k (t) in the case where the identification procedure is the one described in the previous sections.
We already mentioned that, under the assumption of stationarity of the system under measurement [20], proposes to adopt repetitions of the input swept sine signal to reduce the amount of uncertainties arising from the presence of superimposed noise. In the present paper we propose a different way to achieve a mitigation of the effects of additive noise.
The procedure we propose deserves some further consideration about the bandwidths of the different signals involved. Equation (10) connects the variances of noise processes superimposed on functions of different types, considered all full-band, i.e., with harmonic components that can reach the Nyquist frequency. In reality, the different signals whose variances are combined in (10) relate to different frequency bands: If the excitation signal x(t) has harmonic components between f MI N and f MAX , the kernel h k (t) present on the k − th branch of the Hammerstein model will be excited in the frequency band associated with the k − th power of that signal, that is, in the frequency range between f MI N and k * f MAX , having excluded the DC component, if any; it is, therefore, sufficient to characterize the kernel h k (t) up to frequency k * f MAX , since it will not be stressed for higher frequencies.
According to the swept sine identification procedure, the kernel h k (t) is obtained through Equation (7), i.e., through a weighted combination of g i (t) functions, where the i − th function g i (t) is associated with the i − th harmonic of the input signal, and thus with a frequency range from i * f MI N to i * f MAX . The combination matrix present in Equation (7) is an upper triangular matrix, and thus the functions g i (t) that contribute to build the k − th kernel h k (t) are those with indices ranging from k itself to the value N H i.e., to the maximum order of the Hammerstein model. Thus, according to Equation (7), to obtain the functions h k (t) (that we know are defined in the frequency band limited to k * f MAX ), we linearly combine the functions g i (t) whose frequency band is wider than that of the h k (t) we are seeking. The frequency range of the combination of the g i (t) needed to obtain h k (t) has an overall harmonic content ranging from k * f MI N to N H * f MAX , so it extends to frequencies much higher than those we know characterize the h k (t) function we want to identify. This implies that if we lowpass filter the linear combination of the g i (t) functions contributing to h k (t), and limit the contributions to the useful band of each h k (t) , i.e., to k * f MAX , we can sensitively reduce the noise effects without altering the quality of the identification result.
This filtering operation has to be carried out in a frequency band whose range depends on the order of the function h k (t) to be identified, so for each kernel h k (t) an ad hoc lowpass filter will have to be defined.

Experimental Results
The experiments reported in this section are aimed at verifying the different aspects highlighted in the previous section of the paper. A first aspect is the relationship between the noise superimposed on the g i (t) functions and the noise superimposed on the kernel h k (t) that identify the Hammerstein model: This aspect is tested in the following Section 3.1. The subsequent sections are instead aimed at verifying the effectiveness of the technique of lowpass filtering to mitigate uncertainty in the identification of kernels h k (t) due to the effects of noise superimposed in the measurement phase: Section 3.2 reports the results of noise effects mitigation obtained in a simulated experiment, while Section 3.3 reports the results obtained when measuring real data collected in a measurement bench with ultrasonic probes operating in air.

Noise on the g i (t) Versus Noise on the h k (t)
In this section, we verify expression (10), which we have shown is the functional link between the power of noise superimposed on functions g i (t) and the corresponding power of noise superimposed on the h k (t) estimated by combining the g i (t). We start with the simulation of the noise sequence superimposed on functions g i (t) and estimate their variances; we then operate the linear transformation (7) and obtain the corresponding noise sequences affecting the h k (t) kernels; in this case too we estimate the variance vector σ 2 h characterizing these noise sequences. We thus compare the latter estimates σ 2 h with those obtained by means of Equation (10). Figure 3 plots the noise sequences: The first line of Figure 3 reports an example of the noise sequences obtained by truncating at different ∆t k the function u(t), in the absence of input excitations: The value of the noise powers in the four segments, estimated by aver-aging over 50 trials, is σ 2 g = 0.0098 0.0090 0.0101 0.0099 T . The second row of Figure 3 shows the noise sequences obtained by the transformation (7). Noise powers, estimated from the sequences, are σ 2 h = 0.0157 0.0048 0.0006 0.0002 T .

Figure 3. Correspondence between noise sequences superimposed on g(t) and h(t). Plots (A-D): noise sequences on
We then compare these estimates with the corresponding variance values predicted by using equation (10), which gives the estimates of the same σ 2 h(t) starting from the σ 2 g : we obtain from (10): σ 2 h = 0.0154 0.0047 0.0006 0.0002 T .
The percentage estimate error made adopting (10), evaluated by means of σ 2 i_ERR% = 100 σ 2 h k −σ 2 h k /σ 2 h k , in the case of the above example gives: The low error values confirm that the variance σ 2 h of the noise superimposed on the kernels [h(t)] can be accurately obtained from an estimate of the power of the noise σ 2 g superimposed on the functions [g(t)] by adopting Equation (10).

Mitigation of Noise Effects: Simulation Tests
We begin testing the effectiveness of the procedure proposed for mitigating the effects of noise in a simulation experiment: The importance of performing simulation tests is to have a complete feedback of the improvement we achieve in the estimation of the kernels h i (t): Their reference trends are known, since they have been analytically defined to be included in the synthetic system. Consequently, a direct comparison with the result of the identification procedure in the presence of noise is possible, and we can quantify the benefits arising from the proposed mitigation procedure.
We have generated a synthetic nonlinear system of the fourth order, defined according to the Hammerstein model. In the different branches of the model we adopted as kernels h i (t), the four functions are represented in Figure 4. They all have typical behavior of the impulsive response of a filter of the second order and differ for the central frequency and for the attack delay time, which both increase as the order increases. The values adopted in our simulations to characterize the four kernels of the synthetic model are reported in Table 1.   For the purpose of verifying the accuracy in model identification, the system is elicited with an exponential swept sine signal of amplitude R = 1 whose frequency range f MI N = 40 [kHz] to f MAX = 1600 [kHz] contains the frequency bands that characterize all the h i (t) kernels.
A Gaussian white noise is added to the response y H (t); the value of its standard deviation is such that the signal-to-noise ratio (SNR) is brought to the desired level: The values SNR 1 = 10 [dB] and SNR 2 = 5 [dB] were considered. Let us say y H N (t) is the noisy response. It is convolved with the filter matched to the input signal to obtain the output signal u(t). The functions g i (t) are obtained by taking portions of u(t) at the time instants ∆t i . From g i (t), the estimates of the kernels h k (t) are obtained. To verify the effectiveness of the noise reduction technique, we report both the results of the estimates in the absence of lowpass filtering and the results obtained after the additional lowpass filtering step of each of the estimated h k (t) kernels. The impulse response of the FIR (Finite Impulse Response) filter is defined using a frequency sampling method, starting with the amplitude values that define the required low-pass amplitude response in the frequency domain. Figures 5 and 6 show the superposition between the estimated h k (t) kernels and the corresponding reference trends, respectively, for the case SNR 1 = 10 [dB], and for the case SNR 2 = 5 [dB]: The upper row of each figure refers to the estimate without filtering and the lower row shows the trend of the estimated h k (t) after the additional lowpass filtering step. The figures clearly show that the lowpass filters defined ad hoc for each order of the kernel greatly reduce the effects of superimposed noise without deteriorating the time course of the function we want to estimate.
, h 4 (t) estimates before lowpass filtering and reference. Plots (E-H): h 1 (t), h 2 (t), h 3 (t), h 4 (t) estimates after lowpass filtering and reference. In order to give a quantitative evaluation of the improvement effect allowed by the filtering technique proposed in this paper, we adopted the performance index proposed in [21]: where h k [j] is the k − th reference kernel andĥ k [j] is its estimate, N s is the number of samples of both the actual and estimated kernels, and N H is the order of the Hammerstein model.
The values of the indices calculated both before and after filtering for the two cases of signal to noise ratio equal to 10 [dB] and 5 [dB] are shown in Tables 2 and 3.  It is evident from the values shown in Tables 2 and 3 that both in the less noisy case and in the noisier one, the action of filtering is such as to significantly reduce the index value, and, therefore, improve the signal to noise ratio obtained.

Mitigation of Noise Effects: Ultrasonic System with in-Air Propagation
A verification of the proposed procedures was also carried out in the case of laboratoryacquired data. The measurements were performed on an ultrasonic acquisition system designed for in-air propagation, as shown in Figure 7. The measurement setup consists, in sequence, of a personal computer (PC) for management and supervision; an HS5 TiePie handyscope used as a generator of arbitrary waveforms; a Falco-System power amplifier; a pair of identical non-contact point focused ultrasonic transducers by Ultran, used in transmission and reception; again the TiePie, used this time as a data logger; and the PC for data storage. The ultrasonic probes operate at a center frequency of 200 KHz, and the emitted beam is focused at a single point (F = 10 cm). The probes allow a maximum input voltage of 150 V. The Falco-System power amplifier was used with an amplification factor of 50×. The probes, a pair of ULTRAN NCG 200-D25 P100 focused probes, were mounted in a through-transmission configuration (through air alone) on precision mounts and placed at a distance of approximately 21 cm, the distance at which the received signal was maximized. The identification step was performed using an exponential swept sine signal of amplitude R = 1. It was assumed that the system could be usefully characterized by a swept sine signal of duration 9.3 [ms] operating in the range between 50 kHz and 400 kHz, sampled at 5 Ms/s. The order adopted for the Hammerstein model was N H = 4. Several measurements were carried out by slightly modifying the distance of the probes in order to change the level of useful signal received, and therefore the SNR ratio through slight defocusing. We report in the following the results of two experiments, which differ only in the value of the SNR ratio that characterizes them. Figure 8 shows, in an expanded scale, the trend of the g i (t) functions for the four orders considered in the case of the lower SNR value (estimated Peak Signal to Noise ratio on y H N (t) = 20 [dB]). They were obtained by taking the appropriate sections of the signal at the output to the matched filter. A first observation concerns the fact that even in the real data case, the hypothesis seems to be verified-the noise power on the function g i (t) is independent of the order of the function. The variance of the noise superimposed on the g i (t) functions, estimated in segments of the output signal u(t) preceding the instants ∆t k and averaged over 50 trials, gives the following values: The corresponding functions h k (t) are shown in Figure 9: In analogy with Figures 6 and 7, the first line of Figure 9 shows the trend of the functions h k (t) before the lowpass filtering operation. The second line shows the trends of the same functions after they have been lowpass filtered: Even at visual inspection, the improvement of the signal on the first three functions clearly appears, while it is also evident from h 4 (t), that if the noise level exceeds the useful signal, the filtering procedure can remove part of the noise, but cannot fully highlight the trend of the desired function, where present. We observe also in this real data case, that the filtering operation has not altered the temporal trend of the h k (t) functions, as far as it is possible to appreciate in cases where the noise level is not excessively high. The variance of the noise superimposed on the h k (t) functions in the case of the lower SNR value, estimated in the initial portion of the sequences and averaged over 50 trials, gives the values reported in Table 4: The positive effect of filtering is evident in all four orders. Improvements of different magnitudes are apparent for different orders: this will be discussed in the following section.   Figure 10 shows, in an expanded scale, the trend of the g i (t) functions for the four orders considered in the case of the higher SNR value (estimated Peak Signal To Noise ratio on y H N (t) = 31 [dB]).
The corresponding functions h k (t) are shown in Figure 11, which is organized as in the case of Figure 9. The variance of the noise superimposed on the g i (t) functions in the case of the higher SNR value, estimated in segments of the output signal u(t) preceding the instants ∆t k and averaged over 50 trials, gives the following values: σ 2 g = 1.28 1.14 1.65 3.81 T The variance of the noise superimposed on the h k (t) functions in the case of the higher SNR value, estimated in the initial portion of the sequences and averaged over 50 trials, gives the values reported in Table 5: The positive effect of filtering is evident in all four orders. Improvements of different magnitude are apparent for different orders: This will be discussed in the following section.  It can be interesting to compare the results obtained using the method we propose in the present paper with those obtained using the method proposed in [20]. Table 6 shows the results obtained by applying, on the same experimental setup, the noise reduction technique proposed in [20] in the case of 20 repetitions. We first observe that the comparison assumes the applicability of the method proposed in [20], and thus, that the nonlinear system is stationary, at least for the duration of the repetitions.   Table 6 shows that in the cases of the estimation of h 1 (t) and h 3 (t), the results obtained are comparable with those of the method proposed in this paper, but considerably lower than what should be theoretically expected in the presence of Gaussian white noise.
In the cases of h 2 (t) and h 4 (t) estimation, the improvement obtained with the averaging technique is in line with theoretical expectations. This can be explained by analyzing the trends over time, for example in Figure 11: The residual variability is related to an oscillation and not to white noise; as shown in [22], this variability of the signal results directly from the limited bandwidth ofδ(t), the approximated delta function, and thus from limitations of the identification technique, and is not associated with random noise; therefore, it cannot be removed either with filtering techniques such as those proposed in this paper or by averaging over repetitions of the experiment. If we want to draw conclusions about the comparison between the two methods, we can say that, where the noise level is very low, the two methods are equivalent; the method proposed in this work, in addition to not assuming characteristics of stationarity of the system, allows to obtain better results in the presence of high noise variance compared to the signal level.

Discussion and Conclusions
The various experiments we made were aimed at verifying the different aspects that had been highlighted theoretically in the paper. The correspondence between the noise superimposed on the g i (t) functions and the noise superimposed on the h k (t) functions, estimated from the g i (t) functions, was verified through the tests performed in Section 3.1: The estimates made on the noise sequences superimposed on the g i (t) functions and the h k (t) functions showed to be in line with what could be theoretically predicted using relation (10), confirming the validity of the proposed technique.
The subsequent Sections 3.2 and 3.3 had instead the purpose of verifying the theoretical hypotheses made on a possible improvement in the estimation of the h k (t) parameters, in the presence of noise, through a low-pass filtering operation. As a preliminary remark, since we wanted to consider the time behavior of the h k (t) functions, we chose to perform low-pass filtering through linear phase FIR filters, each with a specifically defined cutoff frequency related to the kernel order k, in order to preserve, as far as possible, the time course of the h k (t) functions linked to the harmonic contents present in the pass band of each low-pass filter. Both in the case of synthetic data and in the case of real data it was possible to find that the filtering operations significantly improve the quality of the estimate and do not significantly alter the h k (t) functions.
The results in Tables 4 and 5 deserve specific comment. The variance of the noise superimposed on the functions h k (t), estimated in the initial portion of the sequences and averaged over 50 trials, gives the values reported in the table: The positive effect of filtering is evident in all four cases and for both SNR levels, but especially in the cases of order 2 and 4. This effect, so different among the different orders, can be easily explained: In the calculation of h 2 (t) and h 4 (t), the transformation (7) implies a strong emphasis (with coefficient |8|) of the function g 4 (t) (the one of highest order N H = 4) and therefore of the noise associated with it: The high variance value of the noise on h 2 (t) and h 4 (t) before filtering can be ascribed to this aspect. The same function g 4 (t) is the one associated to the fourth harmonic, so the corresponding noise components have an harmonic content that includes high frequencies: They can be easily removed by the low pass filtering operation. This also explains the strong reduction, as a consequence of filtering, of the estimated variance.
The very good results we presented encourage us to work on further extensions of the research activity; we plan to use the techniques proposed in this paper in new experimental setups for further verifications. In addition, we want to test how these techniques combine with the filtering techniques proposed in [22] for the elimination of oscillations due to the band-limited swept sine excitation signal. Extending those techniques to the case of nonlinear device identification in the presence of noise could bring further improvements in the Hammerstein model identification step of a nonlinear system. Author Contributions: P.B. and M.C. equally contributed to all aspects of the paper. All authors have read and agreed to the published version of the manuscript.

Funding:
The authors acknowledge the support of the Fondazione Cassa di Risparmio di Terni for the partial financing of the research through the project "HydroTOUR-Hydrogen Terni Orizzonte Università e Ricerca".

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The corresponding author can be contacted.