Volterra black-box models identification methods: direct collocation vs least squares

The Volterra integral-functional series is the classic approach for nonlinear black box dynamical systems modeling. It is widely employed in many domains including radiophysics, aerodynamics, electronic and electrical engineering and many other. Identifying the time-varying functional parameters, also known as Volterra kernels, poses a difficulty due to the curse of dimensionality. This refers to the exponential growth in the number of model parameters as the complexity of the input-output response increases. The least squares method (LSM) is widely acknowledged as the standard approach for tackling the issue of identifying parameters. Unfortunately, the LSM suffers with many drawbacks such as the sensitivity to outliers causing biased estimation, multicollinearity, overfitting and inefficiency with large datasets. This paper presents alternative approach based on direct estimation of the Volterra kernels using the collocation method. Two model examples are studied. It is found that the collocation method presents a promising alternative for optimization, surpassing the traditional least squares method when it comes to the Volterra kernels identification including the case when input and output signals suffer from considerable measurement errors.


Introduction
At the current stage of development of wireless technologies like 5G/6G communication system networks based on antenna arrays with digital beam forming (Massive Multiple Input Multiple Output system), it is impossible to do without such digital signal processing algorithms as digital correction of the nonlinear distortion DPD (Digital Predistortion).Nonlinear distortions of the signal occurring inside the transceiver path strongly distort the spectrum of this signal, as shown in Fig. 1, where its shown in blue, and the main signal is red color respectively.Figure 1: Power spectrum density [1] However, the international wireless standards like 3GPP, ETSI impose strict requirements on the spectral power of the radiated signal.The use of digital nonlinear distortion correction algorithms allows to meet the requirements of standards and at the same time positively affect the overall efficiency, that is, the energy consumption of the entire signal receiving and transmitting system.There are different approaches to the implementation of such algorithms, both purely digital and analog and mixed.One of them, a purely mathematical approach to the description of nonlinear distortions, we will describe below.However, Let's consider the general statement of the problem of digital correction (DPD) with the following structure of the some model of correction as shown in Fig. 2.
Then we can formulate the requirements for the definition of parameters ⃗ W as follows: , the above introduced expression can be rewritten as This equation will be task of DPD (Digital Predistortion).Here we can highlight several important sub-tasks, which in themselves are quite complex both theoretically and computationally: a) Since we have formulated, in fact, the problem of approximation of a function, we need to derive the analytical regression dependence F DP D (.) on the parameters ⃗ W . How this function is defined will depend on the quality of the correction of nonlinear distortions; b) A procedure of the searching the parameters ⃗ W is a classical optimization problem, which is a linear or nonlinear regression with respect to the parameters ⃗ W . Finding efficient methods of convex or non-convex optimization is one of the major problems in this problem; c) Compression of a function F DP D (.), i.e. reducing its computational complexity.
One of the methods to solve the problem a) for DPD task is the Volterra functional series.And it is also the conventional tool to characterize the complex nonlinear dynamics in various fields including the radiophysics, mechanical engineering, electronic and electrical engineering, energy sciences (here readers may refer e.g. to review [2]).Volterra series are widely employed to represent the input-output relationship of nonlinear dynamical systems with memory.Volterra power series are among the best-understood nonlinear system representations in signal processing.Such integral functional series (also called Fréchet-Volterra series) (1) were first proposed by Maurice Fréchet for a continuous nonlinear dynamical systems representation [3,4].Here readers may also refer to overview [5] and monograph [6] for more details on relevant Lyapunov-Liechtenstein operator and Lyapunov -Schmidt methods in the theory of non-linear equations.
The role of a reproducing kernel Hilbert space in development of a unifying view of the Volterra theory and polynomial kernel regression is presented in [7].
In (1) x(t) is input signal and y(t) is output of a single input single output (SISO) nonlinear system and K n (s 1 , s 2 , . . ., s n ) are the multidimensional Volterra kernels (or transfer functions) to be identified based on nonlinear system's response y(t) as a reaction on input x(t) (Fig. 3).It is to be noted, that for the basic case n = 1 we have a conventional Finite Impulse Response (FIR) linear model which optimal in the least-squares sense.
x(t) F (x(t)) Behavioral modeling of the black box system Fréchet theorem [3] generalises the famous Weierstrass approximation theorem which characterizes the set of continuous functions on a compact interval via uniform approximation by algebraic polynomials.
Power series (1) characterize the stationary dynamical systems.Stationarity here means that a transfer functions do not vary during the transient process as t ∈ [0, T ].More general power series (2) models nonstationary dynamics when transfer functions depend explicitly on time t The Volterra series is essential tool of mathematical modeling the nonlinear dynamical systems appearing in digital pre-distortion (DPD) iterative process [8].DPD as we described before is an important part of the digital signal processing algorithms used in transmitters and receivers.Several methods have been studied for DPD, with Volterra series-based methods being popular due to their ease of implementation and the straightforward interpretation of their nonlinear terms.The key issue with Volterra series is the curse of dimension: as the order of the series increases, the number of terms involved in the expansion grows exponentially, making it computationally demanding.From other hand, estimating the functional coefficients (Volterra kernels) of the Volterra integral functional series can be challenging.It often considered in its discrete form and requires a significant amount of data and complex optimization algorithms to find the best fit for the model coefficients.Alternative approach based on problem reduction to multi-dimensional integral equations solution [10,11] needs special probe signals design.
In present paper the alternative approach for the Volterra kernels identification is proposed using the direct collocation method.The results are compared with the conventional least squares method (LSM) widely employed for Volterra series identification problem in telecommunication domain.
The rest of the paper is structured as follows: The subsequent section provides the problem statement.Section 3 focuses on the collocation method.Section 4 carries out computational experiments with LSM, while section 5 discusses concluding remarks and future work.

Identification problem statement
Let us consider the following segment of the truncated Volterra series (1) for n = 2 Our current problem in this section is to determine the kernels K 1 (s) and K 2 (s 1 , s 2 ) by a known input and output pair x(t), y(t) .
In contrast to the linear case n = 1, when it is sufficient to specify a single pair x(t), y(t) to determine the kernel K 1 (s), in the nonlinear case n = 2, for the unique identification of the two-dimensional kernel K 2 (s 1 , s 2 ), it is necessary to specify a two-dimensional continuum of equalities.This means that problem (4) has an infinite set of solutions.
Remark A1 It should be noted that if we consider this problem as an integral equation with two unknown functions K 1 (s) and K 2 (s 1 , s 2 ), then this problem is essentially ill-posed.There are an infinite number of solutions, this problem is insufficiently defined.In this regard, no classical numerical methods designed for integral equations are applicable in this case.And as a result, there are no any attempts to solve the problem in this form in the literature.
Remark A2 A fundamentally different situation takes place in the problem of determining an unknown input signal x(t) with a known output signal y(t) after kernels identification.It is to be noted that in this case we have the problem of nonlinear Volterra integral equations' solution.Here readers may refer to sec.9 in book [11], papers [12], [13] and references therein regarding the Kantorovich main solutions and blow up phenomenon.
Within the framework of this paper, from a practical point of view, we will be satisfied with any pair of approximately found kernels K 1 (s) and K 2 (s 1 , s 2 ) that provides a sufficiently small residual norm Denote by B i (t), i = 0, 1, 2, . . ., the basis functions forming a complete orthogonal system of functions on the segment [0, T ].
We look for an approximate solution of the problem (3) in the form of segments of series of expansions according to the selected system of basis functions

Collocation method
Collocation-type methods are widely used in the discretization of various kinds of integro-functional equations [14].With sufficiently good accuracy and stability, they are also computationally less expensive in comparison with projection methods of the Galerkin type requiring additional integration [15].
In order to determine the unknown coefficients A i and C ij , we introduce a uniform grid of nodes where N + 1 is number of nodes.
Substitute ( 5) in (3) and then demand that the equalities be fulfilled at the points ( 6) Denote for a simplicity y(t k ) = y k , and transform the last equalities as follows As a system of basis functions B i (t), i = 0, 1, . .., we choose Chebyshev polynomials of the first kind Since these polynomials are orthogonal on the segment [−1, 1], we apply a linear mapping to the segment [0, T ].
The controlled norm of residual corresponding to the selected values of m, m 1 and m 2 takes the form Let us denote N = m + m 1 m 2 − 1. Number of equalities (number of nodes in the grid) equals to the number of unknown coefficients.
Thus, we have the following system of linear algebraic equations with respect to the unknown coefficients A i , i = 0, 1, . . ., m − 1 and C ij , i = 0, 1, . . ., m 1 − 1, j = 0, 1, . . ., m 2 − 1. Here 4 Least-square method Let us denote N > m + m 1 m 2 − 1.We have the situation where number of equalities is larger than number of unknown coefficients A i and C ij .Thus we have the overdetermined system of linear equations with respect to the unknown coefficients A i , i = 0, 1, . . ., m − 1 and C ij , i = 0, 1, . . ., m 1 − 1, j = 0, 1, . . ., m 2 − 1: where The system is inconsistent.Least-square method is used to find the approximate solution of the system.The point of the method is to find such coefficients A i and C ij such that the following criteria is minimized:

Numerical experiments
Let us illustrate the operation of the proposed identification methods on two pairs of model signals.
The Figure 4 shows the graphs of the input and output signal (16).The Table 1 demonstrates the dependence of the residual ε N on the values m = m 1 = m 2 for the uniform mesh All calculations were performed in the Maple system with parameter Digits:=30 (the number of digits that Maple uses when making calculations with software floating-point numbers).Also note that the integration during the formation of the system (11) was carried out analytically and did not introduce additional error in the calculation results.This is due to the fact that the input signal x(t) in most cases allows analytical calculation of the values (12).In the case of using input signals of a more complex structure, special approximation methods should be applied to the integrals taking into account the possible fast oscillation of x(t).For the simplicity we assume that m = m 1 = m 2 .The Table 2 demonstrates the dependence of the residual ε N on the parameters.All calculations for least-square method were performed in MATLAB.Overdetermined matrix is solved using lsqminnorm function.It also should be noted that all the integrations during calculation were carried out analytically and didn't introduce additional error in the results.(17) The Figure 11 shows the graphs of the input and output signals (17).Let us also discuss the stability of suggested numerical technique.Let the input data of the problem (17) be determined with some random error ε rand varying within the δ value, namely |ε rand | ⩽ δ.Table 4 shows the dependence of the averaged residual ε N on the δ value at a fixed m = 3 based on the results of 10 measurements.It can be seen from the results of the Table 4 that residual continuously depends on the limits of random measurement errors of the input and output signals.Thus, we can conclude about the stability of the suggested method.As for collocation method, let us check the stability of least-square method on this model.For testing stability 10 rounds of experiments were performed and average residual ε N was calculated.Also m = 3, k = (m + m 2 ) • 5 were fixed.

Conclusions
Two numerical approaches to solving the problem of identification of the Volterra model were proposed in the paper.
As can be seen from the presented results, both methods showed stable convergence (in the sense of the tendency of the residual to zero).However, from the point of view of the arithmetic complexity of calculations, the collocation method turns out to be less expensive.And this factor is more pronounced the more parameters of the model are to be determined.This is due to the need to calculate a significantly larger number of integrals proportional to the square of the number of measurements being processed.
Further development of research suggests an increase in the number of terms n in the model ( 1) to identify a more accurate functional relationship between the input and output signals.It is also planned to develop special methods for approximating integrals (12) for the case of using input signals of a more complex structure, including fast oscillating signals.

Table 1 :
Dependence of the residual ε N on the values m, m 1 , m 2 .

Table 2 :
Dependence of the residual ε N on the values m and k.

Table 3 :
Dependence of the residual ε N on the values m, m 1 , m 2 .

Table 4 :
Stability results for collocation

Table 6 :
Stability results for LSM