1. Introduction
The development of the theory of dynamical systems, taking into account the specifics of applied problems, aims to create new mathematical methods. This paper is devoted to the development of mathematical tools for studying inverse problems in the theory of dynamical systems. The work aims to develop a methodology and algorithms for identifying Volterra polynomials (finite segments of Volterra series) [1].
The Volterra integro-power series is well known in the theory of mathematical modeling of nonlinear dynamic systems of the “input–output” type. However, modern and classical studies in this area do not provide a universal mathematical apparatus for studying problems with restrictions on the dynamic characteristics of systems.
Reference [2] contains an extensive list of references on methods for identifying nonlinear objects using Volterra integral equations. References [3,4,5,6,7] are devoted to methods for constructing dynamic models using Volterra polynomials. Models based on the Volterra theory are used to describe stochastic systems [8], as well as for the structural identification of nonlinear dynamic systems [9]. A systematic approach to modeling nonlinear dynamic systems by formalizing the relationship between input and output was first implemented by Norbert Wiener [10], who applied the Volterra series to the analysis of nonlinear electronic circuits and developed efficient identification algorithms for the case of an input signal in the form of Gaussian white noise. Wiener's research was continued in the works of Marmarelis, Schetzen, Rugh, and other researchers (see, for example, the reviews in [11,12]). System responses to test signals in the form of ideal white noise are used to identify the Wiener kernels. In practice, such input actions are implemented with inevitable errors, which are compensated by choosing the optimal range of test disturbances [13]. When solving inverse quantum mechanical problems, researchers use wave functions [14] to construct Volterra integral models. The identification of Volterra kernels is then based on minimizing the root-mean-square deviation from the response of the tested dynamic system, an approach whose practical implementation is extremely complex [15].
In this regard, researchers strive to simplify the methods (see, for example, [16,17,18,19]). In particular, the authors of [18] treated the case where the Volterra kernels are assumed to be separable and certain a priori conditions are satisfied. Reference [16] considered a modified discrete analog of the cubic Volterra polynomial,
where the symmetric kernels are each defined only on one of the subdomains. To reduce computational costs, the authors of [16] proposed a transition from (4) to the relations (5) or (6), depending on the statistical properties of the input signals. In this case, one restores in (5) and (6) functions of one variable instead of determining in (4) functions of many variables. Moreover, instead of searching for the kernels on the entire domain of definition, researchers confine themselves to the values of the function at fixed values of the arguments. In particular, this approach was applied in [20] (p. 1387) and [21] (p. 1078). The critical review [22] (pp. 178–179) explained the difference between these problems in detail using the approaches described in [23,24] as an example.
As noted in [25], “for the presentation of information in the time domain, the expediency of using pulsed and stepped test signals is obvious”. A method based on the use of Dirac δ-functions was proposed in [26] and developed later in [27]. It suggests using a parametric family involving the Dirac δ-function as test actions for identifying the kernels. A discrete analog of this approach is the numerical algorithm proposed in [28]. Note that the technique based on (6) has a limited scope. An explanation for this can be found in [29] (p. 142): “… this simple idea is impulse-response analysis. Its basic weakness is that many physical processes do not allow pulse inputs… Moreover, such input could make the system exhibit nonlinear effect that would disturb the linearized behavior we have set out to model”. Readers can find a detailed review of identification methods based on impulse disturbances in [27,30].
Let us now turn to methods based on the application of Heaviside functions. Reference [31] considered an approach based on approximating a periodic test signal by a discretely given stepwise one with a constant quantization step; the initial continuous input signal is assumed to have a constant period. This technique was further developed in [32,33], in which a step signal, characterized by its amplitude (height) and a logical variable, was used as the test signal for identifying the kernels. In [34], a modification was made for a dynamical system with two inputs. Here, the identification process included a heuristic algorithm for dividing the system response into components due to the influence of each separate integral term of the quadratic Volterra model.
In this paper, we consider dynamic systems whose transient characteristics are presented in the time domain. The possibility of time scaling makes it possible to study fast processes typical of many technical (energy) systems. The method of finding the transient characteristics of the system is deterministic, so fewer data are required to formalize the mathematical model than with the probabilistic method. The initial data are collected during an active experiment, which implies the possibility of influencing the system with test input signals. In comparison with a passive experiment (observation), this approach reduces the time for collecting initial data and allows one to specify the type of test signal.
Reference [3] presented a method for identifying Volterra kernels using combinations of Heaviside functions with a deviating argument as test signals. Its advantage lies in the transition from the original problem to special multidimensional Volterra equations of the first kind with variable upper and lower integration limits, which admit explicit inversion formulas. The applicability of this technique to modeling the dynamics of real-life technical objects is limited by the complexity of forming piecewise constant test signals. Reference [35] considered the possibility of using test signals of the piecewise linear form (8) in the problem of identifying a two-dimensional continuum of unknowns from a linear Volterra equation of the first kind with a nonstationary kernel.
Figure 1 shows the form of the input signal (8).
The chosen modification of the input signals simplifies their formation in practice, and the resulting Volterra integral equations of the first kind, as before, have a unique solution in the class of continuous functions.
The identification method was developed with a view to its further application to the numerical modeling of the nonlinear dynamics of heat and electric power industry objects based on Volterra polynomials with a vector input.
The purpose of this work is, firstly, to exploit the reserve for increasing the accuracy of constructing an integral model, presented as a modified quadratic Volterra polynomial, by using piecewise linear test signals close to those acting on real-life dynamic systems, and secondly, to develop measurement noise-resistant algorithms for identifying functions of two variables.
The paper is organized as follows:
Section 2 describes the technique for building an integral model using piecewise linear test signals. It also presents an example illustrating the effect of increasing the accuracy of modeling the linear term by applying piecewise linear signals.
Section 3 contains a numerical algorithm for identifying the quadratic term of the Volterra series based on smoothing cubic splines.
Section 4 considers the implementation of the numerical solution algorithm using the quadrature method.
Section 5 suggests directions for future work.
Section 6 contains the main results.
2. Method for Constructing a Quadratic Volterra Polynomial
Let us consider a quadratic model containing a linear nonstationary component,
To identify the Volterra kernels, the authors of [36] used test signals of the form (10).
Figure 2 shows the form of the input signal (10) when the signal amplitude is equal to 1.
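To illustrate how a model of this type is evaluated numerically, the sketch below computes the response of a generic quadratic Volterra model by the trapezoidal rule. The kernels, input, and grid are illustrative stand-ins, not the specific kernels of (9).

```python
import numpy as np

def trap_weights(s):
    """Trapezoidal quadrature weights on a uniform grid s."""
    h = s[1] - s[0]
    w = np.full(s.size, h)
    w[0] = w[-1] = h / 2.0
    return w

def volterra_quadratic_response(K1, K2, x, t_grid):
    """y(t) = int_0^t K1(s) x(t-s) ds
            + int_0^t int_0^t K2(s1,s2) x(t-s1) x(t-s2) ds1 ds2,
    evaluated at every node of t_grid by the trapezoidal rule."""
    y = np.zeros_like(t_grid)
    for i in range(1, t_grid.size):
        s = t_grid[: i + 1]
        w = trap_weights(s)
        S1, S2 = np.meshgrid(s, s, indexing="ij")
        y[i] = w @ (K1(s) * x(t_grid[i] - s)) \
             + w @ (K2(S1, S2) * x(t_grid[i] - S1) * x(t_grid[i] - S2)) @ w
    return y

# Toy check: unit kernels and a Heaviside input give y(t) = t + t^2 exactly,
# since the trapezoidal rule is exact for constant integrands.
t = np.linspace(0.0, 1.0, 101)
y = volterra_quadratic_response(lambda s: np.ones_like(s),
                                lambda a, b: np.ones_like(a),
                                lambda tau: np.ones_like(tau), t)
```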
Substituting (10) into (9) leads to the system of equations (11), which implies the inversion formulas (12) and (13).
Let us carry out the procedure for identifying the Volterra kernel symmetric in its variables using Equations (13) and (15). Then the problem of identifying the remaining kernel from (9) reduces to solving Equation (16) with a known right-hand side. Applying test signals (8) in addition to (10), we obtain Equation (16) with a right-hand side of the form (17), into which the response of the dynamic object to signal (8) enters. Following [35,37], the inversion formula for (17) has the form (18).
Let us compare the effect of using test signals (8) and (10) when building an integral model (9).
The example below demonstrates the increase in simulation accuracy achieved by using test signals of the form (8). Let the “reference” dynamical system be represented by a cubic Volterra polynomial (19).
The technique for constructing quadratic and cubic Volterra polynomials, based on the use of piecewise constant test signals of type (10), has been successfully tested on dynamic systems of various physical nature, including a mathematical model of type (19), as well as in modeling the dynamics of a heat exchanger element and a wind power plant [38]. Note that (19) is a partial sum of the series for a function that has proven itself well in studies of the areas of applicability of identification algorithms for quadratic and cubic Volterra polynomials [38,39]. We apply the kernel identification procedure using test signals (10) and, instead of (9), obtain model (20), where the Volterra kernels were restored using Equations (12) and (13), respectively.
The combined model (9), using test signals (8) in addition to (10) for identification, has the form (21), where the kernel identification was performed using Equations (18) and (13), respectively. On the test signals considered, models (20) and (21) give residuals measured against the response of (19) to the corresponding signal.
Let us present an algorithm for constructing the polynomial (9) for modeling the response of the dynamic system represented in the form (19).
Step 1. Calculate the system responses by substituting (10), with the chosen amplitude, into the right-hand side of (19).
Step 2. Calculate, by (15), the values of the right-hand side of the integral equation.
Step 3. Apply Equation (13) to identify the quadratic kernel.
Step 4. Calculate the response by substituting (8), with the chosen amplitude, into the right-hand side of (19).
Step 5. Calculate the right-hand side of (17) from the quantities obtained in Steps 3 and 4.
Step 6. Apply Equation (18) to identify the remaining kernel.
Step 7. Substitute the kernels obtained in Steps 3 and 6 into the right-hand side of (9). This leads to (21).
The modeling accuracy was assessed by comparison with the reference response, with the “mean absolute error” coefficient chosen as the accuracy criterion. In Figure 3, black shows the regions where the corresponding accuracy inequality is fulfilled.
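The accuracy criterion itself is straightforward to implement; a minimal version follows (the array values are illustrative):

```python
import numpy as np

def mean_absolute_error(y_model, y_ref):
    """Mean absolute error between the model output and the reference response."""
    y_model, y_ref = np.asarray(y_model, float), np.asarray(y_ref, float)
    return float(np.mean(np.abs(y_model - y_ref)))

# Example: errors of 0.5, 0.0, 0.5 average to 1/3.
mae = mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.5])
```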
The computational experiment showed that the areas of efficiency of the integral models (20) and (21) depend on the length of the time segment, the amplitudes of the test signals used to identify the Volterra kernels, and the accuracy of the calculations.
Note that we assumed the quadratic term, the two-dimensional kernel, in Equation (18) to be known. Therefore, in the next section, we consider an algorithm for identifying this term using Equation (13).
3. Identification Algorithm for Quadratic Term
Unfortunately, the practical implementation of the inversion Equation (13) faces a fundamental difficulty: differentiation is an ill-posed operation [40]. One manifestation of this ill-posedness is large errors in the calculated derivative, even for very small errors in the specification of the differentiated function. Note also that the subtraction in (15) of two functions, each registered with error, increases the variance of the total error in the data. Thus, stable differentiation of noisy data is an urgent problem for the practical implementation of formula (13).
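The noise amplification described above is easy to reproduce numerically: with step h, finite differences of data perturbed at level ε carry derivative errors of order ε/h. A small sketch (the test function and noise level are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
h = 1e-3
x = np.arange(0.0, 1.0 + h / 2, h)
f = np.sin(x)

# Differentiating exact data: central differences are accurate to O(h^2).
err_clean = np.max(np.abs(np.gradient(f, h)[1:-1] - np.cos(x)[1:-1]))

# Differentiating data with noise of size eps: errors grow to order eps/h,
# i.e., the smaller the step, the worse the noisy derivative.
eps = 1e-4
f_noisy = f + eps * rng.standard_normal(x.size)
err_noisy = np.max(np.abs(np.gradient(f_noisy, h)[1:-1] - np.cos(x)[1:-1]))
```

Here `err_clean` is on the order of h², while `err_noisy` is several orders of magnitude larger than both the data noise and `err_clean`.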
Reference [41] constructed a stable identification algorithm on the basis of Equation (12) (a stable identification algorithm is one whose relative identification error is comparable to the relative error of the initial data). There, a smoothing cubic spline (SCS) of defect one was used for the stable calculation of the first derivative, with the smoothing parameter chosen from the condition of minimum root-mean-square smoothing error. The use of smoothing splines becomes much more complicated in the case of identifying the quadratic kernel. First, to calculate the second-order mixed derivative, we need to build a smoothing bicubic spline (SBS), which is a function of two variables. Second, the boundary conditions are now given not at the two endpoints of the SCS construction interval, but on the four straight lines bounding the rectangular construction domain. Third, due to the different “smoothness” of the function in its two variables, we now have to choose two smoothing parameters from the condition of minimum smoothing error. These difficulties were not resolved in the corresponding scientific publications and are addressed in this section.
Suppose that the values of the function are determined at the nodes of a rectangular grid. To take into account possible measurement errors (noise), the noisy measurements are represented as in (22), where the noise is random with zero mean and constant variance (equally accurate measurements). Note that the grid steps in the two variables need not be equal or uniform. It is required to calculate the values of the derivatives at the given nodes from these initial data.
For a stable calculation of these derivatives, we turn to the SCS [42], widely used in the processing of experimental data [43,44]. Suppose we have nodes on some interval at which the values of the function (signal) are measured, again with random measurement noise of zero mean and constant variance (equally accurate measurements). The smoothing cubic spline of defect one can be represented on each segment between adjacent nodes by a cubic polynomial of the form (23) [42].
Moreover, the spline must be twice continuously differentiable on the entire interval of its definition. Note that, in contrast to the interpolating spline (which passes through the data points), the smoothing cubic spline generally does not pass through these points, but passes more “smoothly” in their neighborhoods (depending on the smoothing parameter), thereby providing smoothing (filtering) of the measurement noise.
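For readers who wish to experiment, SciPy's `UnivariateSpline` provides a smoothing cubic spline of this kind, with the FITPACK parameter `s` playing the role of the smoothing parameter. The signal, noise level, and the heuristic choice s ≈ n·σ² below are illustrative, not the authors' settings:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 200)
sigma = 0.01
y = np.sin(2 * np.pi * x) + sigma * rng.standard_normal(x.size)

# s = 0 would give an interpolating spline that chases every noisy point;
# s ~ n * sigma^2 lets the spline pass "more smoothly" near the data.
spline = UnivariateSpline(x, y, k=3, s=x.size * sigma**2)
residual = float(np.sum((spline(x) - y) ** 2))

# A stable first-derivative estimate comes from differentiating the spline.
d_est = spline.derivative()(x)
d_true = 2 * np.pi * np.cos(2 * np.pi * x)
max_interior_err = float(np.max(np.abs(d_est[10:-10] - d_true[10:-10])))
```

The derivative of the smoothing spline stays close to the true derivative, whereas finite differences on the same noisy data would not.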
To uniquely determine the spline coefficients, boundary conditions are set at the endpoint nodes. The following conditions are most often used [42,44]:
conditions on zero second derivatives of the spline (natural boundary conditions, (24)), conditions on the first derivatives of the spline (25), as well as a combination of these conditions (for example, condition (25) on the left, condition (24) on the right). It was shown in [42] that the SCS constructed under these conditions provides a minimum to a functional whose weight factors reflect the accuracy of the j-th measurement (they are taken equal in the case of equally accurate measurements).
To calculate the spline coefficients (for a given smoothing parameter), one composes a system of linear algebraic equations with a five-diagonal matrix for an auxiliary vector (as a rule, the values of the second derivative of the spline at the nodes), through which all the spline coefficients are then found (for details, see [42,44]).
The smoothing parameter “controls” the smoothness of the spline, and the smoothing error (as well as the differentiation error) depends significantly on its value [44,45]. There is a parameter value (let us call it optimal) for which the smoothing error, in the accepted norm, is minimal [45]. Let us temporarily assume that we have found an acceptable (in terms of minimum smoothing error) value of the smoothing parameter; the choice of the parameter is discussed in the next section.
Remark 1. It follows from the form of the integrals (11) that the function takes nonzero values only for arguments satisfying the stated condition. For other values of the arguments, the function is equal to zero due to the technical infeasibility of the system for negative values of the arguments.
To eliminate the discontinuity of the first kind when constructing a smoothing spline, we propose to supplement the values of the function according to the following rule, and we work with the function supplemented in this way from here on.
We first focus on the algorithm for calculating the values of the derivative. It can be represented by the following steps:
Step 1. Set the boundary conditions, whose combination at the endpoints of the construction interval is determined on the basis of available a priori information about the function. If no reliable information is available, one should use the natural boundary conditions (24).
Step 2. For each fixed value of the second argument, form a dataset along the first argument, select the smoothing parameter, and build the SCS; its first derivative, read off from the corresponding spline coefficient in representation (23), estimates the first derivative of the function.
Step 3. For each fixed value of the second argument, again form a dataset from these first-derivative estimates, select the smoothing parameter, and build the SCS; its first derivative, again given by the spline coefficient in representation (23), estimates the second derivative.
Thus, we calculate estimates of the second derivative at all grid nodes.
Let us proceed to the construction (following the technique of [46]) of a bicubic smoothing spline for calculating the mixed derivative. We use the following algorithm:
Step 1. For each fixed value of the second argument, form a dataset along the first argument, select the smoothing parameter, and build the SCS, from which the first derivative is calculated as the corresponding spline coefficient in representation (23).
Step 2. For each fixed value of the first argument, form a dataset from these derivative estimates along the second argument, select the smoothing parameter, and build the SCS; its first derivative, again given by the spline coefficient in representation (23), is an estimate of the mixed derivative.
We repeat Step 1 over all grid lines in one variable and Step 2 over all grid lines in the other. After calculating these estimates, we find, using Equation (13), the estimate of the quadratic kernel values.
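The two passes above can be sketched with successive one-dimensional splines, here via SciPy's `UnivariateSpline`. The test surface and the zero smoothing parameters (which reduce the splines to interpolation, as appropriate for exact data) are illustrative:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

def mixed_derivative(F, x, y, sx=0.0, sy=0.0):
    """Estimate d2F/(dx dy) at the grid nodes by two spline passes:
    differentiate along x for each fixed y, then differentiate the
    result along y for each fixed x (sx, sy: smoothing parameters)."""
    Fx = np.empty_like(F)
    for j in range(F.shape[1]):        # pass 1: one spline per grid column
        Fx[:, j] = UnivariateSpline(x, F[:, j], k=3, s=sx).derivative()(x)
    Fxy = np.empty_like(F)
    for i in range(F.shape[0]):        # pass 2: one spline per grid row
        Fxy[i, :] = UnivariateSpline(y, Fx[i, :], k=3, s=sy).derivative()(y)
    return Fxy

x = np.linspace(0.0, 1.0, 60)
y = np.linspace(0.0, 1.0, 60)
X, Y = np.meshgrid(x, y, indexing="ij")
F = np.sin(np.pi * X) * np.sin(np.pi * Y)
D_true = np.pi**2 * np.cos(np.pi * X) * np.cos(np.pi * Y)  # exact mixed derivative
D_est = mixed_derivative(F, x, y)
max_err = float(np.max(np.abs(D_est - D_true)))
```

For noisy data, `sx` and `sy` would be chosen by the parameter-selection procedure discussed below rather than set to zero.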
Remark 2. The inversion Equation (13) determines the values of the quadratic kernel only on one side of the diagonal of its domain. The diagonal is the axis of symmetry of the kernel (this follows from the one-dimensionality of the input signal); therefore, to determine the values of the kernel on the other side, we propose a symmetrical supplement of the kernel values.
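The symmetrical supplement of Remark 2 amounts to mirroring the computed triangle of kernel values across the diagonal; a minimal sketch (the matrix layout is illustrative):

```python
import numpy as np

def symmetrize_kernel(K_tri):
    """Given kernel values on the lower triangle (diagonal included),
    supplement the upper triangle by the symmetry K(s, t) = K(t, s)."""
    K = np.tril(K_tri)
    return K + K.T - np.diag(np.diag(K))

K_half = np.tril(np.arange(9.0).reshape(3, 3))   # values known on one side only
K_full = symmetrize_kernel(K_half)
```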
Remark 3. Since the construction of an SCS in one variable requires a number of arithmetic operations approximately linear in the number of nodes [42], the proposed algorithms for calculating derivatives require a number of operations proportional to the grid size. Therefore, they have a high computational efficiency even for grids of large dimension.
Previously, the values of the selected smoothing parameters were assumed to be given. Therefore, the question arises of how to choose these parameters, since they significantly affect the error of smoothing and differentiation. If the variance of the measurement noise (see (22)) were reliably known (at least to within 5–8%), then a selection algorithm based on checking the optimality criterion of the linear filtering algorithm would make it possible to estimate, with acceptable accuracy (5–8%), the optimal smoothing parameter minimizing the root-mean-square smoothing error (see [44] (pp. 60–67), [45]). However, the situation of unknown noise variance is the most characteristic one in practical identification problems. Therefore, to choose the parameter in this case, we turn to the L-curve method used to choose the regularization parameter in algorithms for solving linear ill-posed problems (see, for example, [47,48]). In [49], a modification of the L-curve method was proposed for choosing the smoothing parameter.
Let us briefly describe the essence of this selection algorithm. We introduce two functionals (see [49]). The L-curve (whose shape resembles the outline of the Latin letter L) is then the parametric curve traced by their values as the smoothing parameter varies. It can be shown that the curvature of the L-curve is given by formula (27), and the smoothing parameter is taken as the value at which this curvature attains its maximum. To calculate the functional efficiently, a formula in terms of the SCS coefficients in representation (23), computed for a given parameter value, is proposed. To calculate the curvature via Equation (27), an approach using cubic interpolation splines to approximate the required dependences is proposed (for details, see [49]). An extensive computational experiment was also carried out there to answer the following question: How large is the loss in smoothing error when the L-curve parameter is used instead of the optimal one (which can only be determined in a computational experiment)? The experiment used functions that are “typical” output signals of a dynamic system under step inputs. The analysis of the results showed that the selection algorithm based on the L-curve method estimates the optimal smoothing parameter quite well: the increase in the smoothing error when using the L-curve parameter does not exceed 5–15% on average compared with the optimal parameter, whose calculation is impossible in practice. Therefore, to calculate all the required smoothing parameters, it is proposed to use the described L-curve selection algorithm.
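Numerically, the parameter search can be sketched as follows: sweep the smoothing parameter over a grid, record the two functionals, and take the parameter at the point of maximum curvature of the log-log curve. The functionals below (sum of squared residuals and the integral of the squared second derivative) are common L-curve choices and stand in for the specific functionals of [49]:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + 0.05 * rng.standard_normal(x.size)

s_grid = np.geomspace(1e-4, 10.0, 60)
rho, eta = [], []
for s in s_grid:
    spl = UnivariateSpline(x, y, k=3, s=s)
    rho.append(np.sum((spl(x) - y) ** 2))        # residual functional
    d2 = spl.derivative(2)(x)
    eta.append(np.sum(d2**2) * (x[1] - x[0]))    # roughness functional

# Curvature of the parametric curve (log rho, log eta); its maximum marks
# the "corner" of the L, which defines the chosen smoothing parameter.
u, v = np.log(rho), np.log(eta)
du, dv = np.gradient(u), np.gradient(v)
d2u, d2v = np.gradient(du), np.gradient(dv)
kappa = (du * d2v - dv * d2u) / (du**2 + dv**2) ** 1.5
s_star = float(s_grid[np.nanargmax(kappa)])
```

In [49], the curvature is instead evaluated via Equation (27) with spline-approximated dependences; the finite-difference curvature here is only a rough stand-in.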
To test the proposed algorithm for identifying the quadratic kernel, a numerical experiment was carried out, some results of which are presented in this paper. The test quadratic kernel is a function used to describe the dynamics of a certain type of heat exchanger [50]. Figure 4a shows the surface of this function, and Figure 4b shows its isolines. The boundary of the time interval and the numbers of grid nodes were fixed.
First, we determined the methodological error of the identification algorithm. To do this, we calculated the values of the function (15) at the grid nodes, which were interpreted as exact. These data, presented as a matrix, were the initial data for the proposed identification algorithm. Since the initial data were taken as exact, instead of the SCS we built interpolating cubic splines (including a bicubic spline) with boundary conditions (24). We calculated estimates of the derivatives on the basis of these splines and then constructed an estimate of the quadratic kernel using Equation (13) (see Remark 2).
Figure 5 shows the isolines of this estimate, whose relative identification error is measured in the Euclidean norm between the matrices composed of the values of the exact kernel and of its estimate, respectively. Approximately the same error was observed for other grid sizes. Therefore, we can conclude that the proposed identification algorithm has a low methodological error.
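The relative identification error used here is simply a ratio of Euclidean (Frobenius) matrix norms:

```python
import numpy as np

def relative_identification_error(K_exact, K_est):
    """Relative error ||K_est - K_exact|| / ||K_exact|| in the Euclidean
    (Frobenius) matrix norm."""
    K_exact, K_est = np.asarray(K_exact, float), np.asarray(K_est, float)
    return float(np.linalg.norm(K_est - K_exact) / np.linalg.norm(K_exact))

# Example: a uniform 1% overestimate gives a relative error of 0.01.
err = relative_identification_error(np.eye(3), 1.01 * np.eye(3))
```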
Let us consider the influence of measurement noise on the accuracy of identification. To do this, we distorted all elements of the “exact” matrix with normally distributed noise of a given relative level and used the matrix thus formed as the initial data for the identification algorithm described above. The smoothing parameter at all steps of calculating the derivatives was chosen by the L-curve method described above.
Figure 6 shows the isolines of the estimate built at a noise level of 0.02. The resulting relative identification error indicates the acceptable accuracy of quadratic term identification by the proposed algorithm.