
Unbiased Least-Squares Modelling

Marta Gatto and Fabio Marcuzzi

Department of Mathematics, “Tullio Levi Civita”, University of Padova, Via Trieste 63, 35131 Padova, Italy
* Author to whom correspondence should be addressed.
Mathematics 2020, 8(6), 982; https://doi.org/10.3390/math8060982
Submission received: 25 May 2020 / Revised: 9 June 2020 / Accepted: 11 June 2020 / Published: 16 June 2020
(This article belongs to the Special Issue Multivariate Approximation for solving ODE and PDE)

Abstract
In this paper we analyze the bias that arises in a general linear least-squares parameter estimation problem when deterministic variables have been left out of the model. We propose a method that substantially reduces this bias, under the hypothesis that some a-priori information on the magnitude of the modelled and unmodelled components of the system response is known. We call this method Unbiased Least-Squares (ULS) parameter estimation and present here its essential properties and some numerical results on an applied example.

1. Introduction

The well-known least-squares problem [1], very often used to estimate the parameters of a mathematical model, assumes an equivalence between a matrix-vector product $Ax$ on the left and a vector $b$ on the right-hand side: the matrix $A$ is produced by the true model equations evaluated at some operating conditions, the vector $x$ contains the unknown parameters, and the vector $b$ contains measurements corrupted by white Gaussian noise. This equivalence cannot be satisfied exactly, but the least-squares solution yields a minimum-variance, maximum-likelihood estimate of the parameters $x$, with a nice geometric interpretation: the resulting predictions $Ax$ are at the minimum Euclidean distance from the true measurements $b$, and the vector of residuals is orthogonal to the subspace of all possible predictions.
Unfortunately, each violation of these assumptions produces, in general, a bias in the estimates. Various modifications have been introduced in the literature to cope with some of them, mainly colored noise on $b$ and/or $A$ due to model error and/or colored measurement noise. The model error is often treated as an additive stochastic term in the model, e.g., in errors-in-variables formulations [2,3], with consequent solution methods such as Total Least-Squares [4] and Extended Least-Squares [5], to cite a few. All these techniques let the model be modified to describe, in some sense, the model error.
Here, instead, we assume that the model error depends on deterministic variables in a way that has not been included in the model, i.e., we suppose that we use a reduced model of the real system, as is often the case in applications. In this paper we propose a method to cope with the bias in the parameter estimates of the approximate model by exploiting the geometric properties of least-squares and using a small amount of additional a-priori information about the norms of the modelled and un-modelled components of the system response, available with some approximation in most applications. To remove the bias in the parameter estimates we perturb the right-hand side without modifying the reduced model, since we assume it describes accurately one part of the true model.

2. Model Problem

In applied mathematics, physical models are often available, usually rather precise at describing quantitatively the main phenomena, but not satisfactory at the level of detail required by the application at hand. Here we refer to models described by differential equations, with ordinary and/or partial derivatives, commonly used in engineering and applied sciences. We assume, therefore, that there are two models at hand: a true, unknown model $\mathcal{M}$ and an approximate, known model $\mathcal{M}_a$. These models are usually parametric and they must be tuned to describe a specific physical system, using a-priori knowledge about the application and experimental measurements. Model tuning, and in particular parameter estimation, is usually done with a prediction-error minimization criterion that makes the model response a good approximation of the dynamics shown by the measured variables used in the estimation process. Assuming that the true model $\mathcal{M}$ is linear in the parameters that must be estimated, the application of this criterion leads to a linear least-squares problem:
$$\bar{x} = \operatorname*{argmin}_{x \in \mathbb{R}^n} \| A x - \bar{f} \|^2, \qquad (1)$$
where, from here on, $\|\cdot\|$ is the Euclidean norm, $A \in \mathbb{R}^{m \times n}$ is supposed full rank ($\operatorname{rank}(A) = n$, $m \geq n$), $\bar{x} \in \mathbb{R}^{n \times 1}$, $Ax$ are the model response values and $\bar{f}$ is the vector of experimental measurements. Usually the measured data contain noise, i.e., we measure $f = \bar{f} + \epsilon$, with $\epsilon$ a certain kind of additive noise (e.g., white Gaussian). Since we are interested here in the algebraic and geometric aspects of the problem, we suppose $\epsilon = 0$ and set $f = \bar{f}$. Moreover, we assume ideally that $\bar{f} = A\bar{x}$ holds exactly. Let us consider also the estimation problem for the approximate model $\mathcal{M}_a$:
$$x^\parallel = \operatorname*{argmin}_{x \in \mathbb{R}^{n_a}} \| A_a x - \bar{f} \|^2, \qquad (2)$$
where $A_a \in \mathbb{R}^{m \times n_a}$, $x^\parallel \in \mathbb{R}^{n_a \times 1}$, with $n_a < n$. The notation $x^\parallel$ recalls that the least-squares solution satisfies $A_a x^\parallel = P_{\mathcal{A}_a}(f) =: f^\parallel$, where $f^\parallel$ is the orthogonal projection of $f$ on the subspace generated by $A_a$, and the residual $A_a x^\parallel - \bar{f}$ is orthogonal to this subspace. Let us suppose that $A_a$ corresponds to the first $n_a$ columns of $A$, which means that the approximate model $\mathcal{M}_a$ is exactly one part of the true model $\mathcal{M}$, i.e., $A = [A_a, A_u]$, and so the solution $\bar{x}$ of (1) can be decomposed in two parts such that
$$A \bar{x} = [A_a, A_u] \begin{bmatrix} \bar{x}_a \\ \bar{x}_u \end{bmatrix} = A_a \bar{x}_a + A_u \bar{x}_u = \bar{f}. \qquad (3)$$
This means that the model error corresponds to an additive term $A_u \bar{x}_u$ in the estimation problem.
Note that the columns of $A_a$ are linearly independent, since $A$ is supposed to be of full rank. We do not consider the case in which $A_a$ is rank-deficient, because it would mean that the model is not well parametrized. Moreover, some noise in the data is sufficient to make the matrix full rank.
For brevity, we will call $\mathcal{A}$ the subspace generated by the columns of $A$, and $\mathcal{A}_a$, $\mathcal{A}_u$ the subspaces generated by the columns of $A_a$ and $A_u$, respectively. Note that if $\mathcal{A}_a$ and $\mathcal{A}_u$ were orthogonal, decomposition (3) would be orthogonal. However, in the following we will consider the case in which the two subspaces are not orthogonal, as commonly happens in practice. Oblique projections, even if not as common as orthogonal ones, have a large literature, e.g., [6,7].
Now, it is well known and easy to demonstrate that, when we solve problem (2) and $\mathcal{A}_u$ is not orthogonal to $\mathcal{A}_a$, we get a biased solution, i.e., $x^\parallel \neq \bar{x}_a$:
Lemma 1.
Given $A \in \mathbb{R}^{m \times n}$ with $n \geq 2$ and $A = [A_a, A_u]$, and given $b \in \mathbb{R}^{m \times 1} \setminus \mathrm{Im}(A_a)$, call $x^\parallel$ the least-squares solution of (2) and $\bar{x} = [\bar{x}_a, \bar{x}_u]$ the solution of (1), decomposed as in (3). Then
(i) if $\mathcal{A}_u \perp \mathcal{A}_a$ then $x^\parallel = \bar{x}_a$;
(ii) if $\mathcal{A}_u \not\perp \mathcal{A}_a$ then $x^\parallel \neq \bar{x}_a$.
Proof. 
The least-squares problem $A_a x = f$ boils down to finding $x^\parallel$ such that $A_a x^\parallel = P_{\mathcal{A}_a}(f)$. Let us consider the unique decomposition of $f$ on $\mathcal{A}_a$ and $\mathcal{A}_a^\perp$ as $f = f^\parallel + f^\perp$, with $f^\parallel = P_{\mathcal{A}_a}(f)$ and $f^\perp = P_{\mathcal{A}_a^\perp}(f)$. Call $f = f_a + f_u$ the decomposition on $\mathcal{A}_a$ and $\mathcal{A}_u$; hence there exist two vectors $x_a \in \mathbb{R}^{n_a}$, $x_u \in \mathbb{R}^{n - n_a}$ such that $f_a = A_a x_a$ and $f_u = A_u x_u$. If $\mathcal{A}_u \perp \mathcal{A}_a$ then the two decompositions coincide, hence $f^\parallel = f_a$ and so $x^\parallel = \bar{x}_a$. Otherwise, by the definition of orthogonal projection ([6], third point of the definition at page 429), it must hold that $x^\parallel \neq \bar{x}_a$. □
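A minimal NumPy sketch of Lemma 1 on synthetic data (all matrices and values below are hypothetical, chosen only for illustration): fitting the reduced model recovers $\bar{x}_a$ only when $\mathcal{A}_u \perp \mathcal{A}_a$.

import numpy as np

rng = np.random.default_rng(0)
m, n_a = 50, 2
A_a = rng.standard_normal((m, n_a))

# Unmodelled column NOT orthogonal to span(A_a): a biased estimate is expected.
A_u = rng.standard_normal((m, 1))
x_bar_a, x_bar_u = np.array([1.0, -2.0]), np.array([0.5])
f = A_a @ x_bar_a + A_u @ x_bar_u                     # exact data from the true model
x_par, *_ = np.linalg.lstsq(A_a, f, rcond=None)
print("bias, A_u not orthogonal:", x_par - x_bar_a)   # nonzero, case (ii)

# Orthogonal case: remove from A_u its component in span(A_a).
Q, _ = np.linalg.qr(A_a)
A_u_perp = A_u - Q @ (Q.T @ A_u)
f2 = A_a @ x_bar_a + A_u_perp @ x_bar_u
x_par2, *_ = np.linalg.lstsq(A_a, f2, rcond=None)
print("bias, A_u orthogonal:", x_par2 - x_bar_a)      # ~ 0, case (i)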

3. Analysis of the Parameter Estimation Error

The aim of this paper is to propose a method that substantially decreases the bias of the solution of the approximated problem (2), using the smallest amount of additional information about the norms of the model error and of the modelled part of the response.
In this section we will introduce sufficient conditions to remove the bias and retrieve the true solution in a unique way, as summarized in Lemma 4. Let us start with a definition.
Definition 1
(Intensity Ratio). The intensity ratio $I_f$ between modelled and un-modelled dynamics is defined as
$$I_f = \frac{\| A_a x_a \|}{\| A_u x_u \|}.$$
In the following we assume that a good approximation of this intensity ratio is available and that its magnitude is sufficiently big, i.e., we have an approximate model that is quite accurate. This information about the model error will be used to reduce the bias, as shown in the following sections. Moreover, we will also consider the norm $N_f = \| A_a x_a \|$ (or, equivalently, the norm $\| A_u x_u \|$).

3.1. The Case of Exact Knowledge about $I_f$ and $N_f$

Here we assume, initially, to know the exact values of $I_f$ and $N_f$, i.e.,
$$N_f = \bar{N}_f = \| A_a \bar{x}_a \|, \qquad I_f = \bar{I}_f = \frac{\| A_a \bar{x}_a \|}{\| A_u \bar{x}_u \|}. \qquad (4)$$
This ideal setting is important to understand the problem before moving to more practical assumptions. First of all, let us show a nice geometric property that relates $x_a$ and $f_a$ under a condition like (4).
Lemma 2.
The problem of finding the set of $x_a \in \mathbb{R}^{n_a}$ that give constant, prescribed values of $I_f$ and $N_f$ is equivalent to that of finding the set of $f_a = A_a x_a \in \mathcal{A}_a$ of the decomposition $f = f_a + f_u$ (see the proof of Lemma 1) lying on the intersection of $\mathcal{A}_a$ and the boundaries of two $n$-dimensional balls in $\mathbb{R}^n$. In fact, it holds:
$$\begin{cases} N_f = \| A_a x_a \| \\[4pt] I_f = \dfrac{\| A_a x_a \|}{\| A_u x_u \|} \end{cases} \iff \begin{cases} f_a \in \partial B_n(0, N_f) \\[4pt] f_a \in \partial B_n(f^\parallel, T_f) \end{cases} \quad \text{with } T_f := \sqrt{\frac{N_f^2}{I_f^2} - \| f^\perp \|^2}. \qquad (5)$$
Proof. 
For every $x_a \in \mathbb{R}^{n_a}$ it holds
$$N_f = \| f_a \| = \| A_a x_a \|, \qquad I_f = \frac{\| f_a \|}{\| f_u \|} = \frac{N_f}{\sqrt{\| f_u^\perp \|^2 + \| f_u^\parallel \|^2}} = \frac{N_f}{\sqrt{\| f^\perp \|^2 + \| f^\parallel - A_a x_a \|^2}} = \frac{N_f}{\sqrt{\| f^\perp \|^2 + \| f^\parallel - f_a \|^2}}, \qquad (6)$$
which is equivalent to
$$\| f_a \| = N_f, \qquad \| f_a - f^\parallel \| = \sqrt{\frac{N_f^2}{I_f^2} - \| f^\perp \|^2} =: T_f, \qquad (7)$$
where we used the fact that $f_u = f_u^\perp + f_u^\parallel$ with $f_u^\perp := P_{\mathcal{A}_a^\perp}(f_u) = f^\perp$, $f_u^\parallel := P_{\mathcal{A}_a}(f_u) = A_a \delta x_a = f^\parallel - A_a x_a$, and $\delta x_a = (x^\parallel - x_a)$. Hence the equivalence (5) is proved. □
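The identity defining $T_f$ can be checked numerically. A small sketch on synthetic, hypothetical data (NumPy only): the value $\sqrt{N_f^2 / I_f^2 - \|f^\perp\|^2}$ must coincide with $\|f_a - f^\parallel\|$.

import numpy as np

rng = np.random.default_rng(1)
m, n_a, n_u = 40, 3, 2
A_a = rng.standard_normal((m, n_a))
A_u = rng.standard_normal((m, n_u))
x_a, x_u = rng.standard_normal(n_a), 0.1 * rng.standard_normal(n_u)

f_a, f_u = A_a @ x_a, A_u @ x_u
f = f_a + f_u
Q, _ = np.linalg.qr(A_a)              # orthonormal basis of the subspace A_a
f_par = Q @ (Q.T @ f)                 # the projection P_{A_a}(f)
f_perp = f - f_par                    # the orthogonal component

N_f = np.linalg.norm(f_a)
I_f = N_f / np.linalg.norm(f_u)
T_f = np.sqrt(N_f**2 / I_f**2 - np.linalg.norm(f_perp)**2)
print(T_f, np.linalg.norm(f_a - f_par))   # the two numbers coincide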
Given $I_f$ and $N_f$, we call feasible set of accurate model responses the set of all the $f_a$ that satisfy relations (5). Now we will see that Lemma 2 allows us to reformulate problem (2) as the problem of finding a feasible $f_a$ that, substituted for $\bar{f}$ in (2), gives as solution an unbiased estimate of $\bar{x}_a$. Indeed, it is easy to note that $A_a \bar{x}_a$ belongs to this feasible set. Moreover, since $f_a \in \mathcal{A}_a$, we can reduce the dimensionality of the problem and work on the subspace $\mathcal{A}_a$, which has dimension $n_a$, instead of the global space $\mathcal{A}$ of dimension $n$. To this aim, let us consider the matrix $U_a$ of the SVD decomposition of $A_a$, $A_a = U_a S_a V_a^T$, and complete its columns to an orthonormal basis of $\mathbb{R}^m$ to obtain a matrix $U$. Since the vectors $f_a, f^\parallel$ belong to the subspace $\mathcal{A}_a$, the vectors $\tilde{f}_a, \tilde{f}^\parallel$ defined such that $f_a = U \tilde{f}_a$ and $f^\parallel = U \tilde{f}^\parallel$ must have zeros in the last $m - n_a$ components. Since $U$ has orthonormal columns, it preserves norms, so $\| f^\parallel \| = \| \tilde{f}^\parallel \|$ and $\| f_a \| = \| \tilde{f}_a \|$. If we call $\hat{f}_a, \hat{f}^\parallel \in \mathbb{R}^{n_a}$ the first $n_a$ components of the vectors $\tilde{f}_a, \tilde{f}^\parallel$ respectively (which again have the same norms as the full vectors), we have
$$\hat{f}_a \in \partial B_{n_a}(0, N_f), \qquad \hat{f}_a \in \partial B_{n_a}(\hat{f}^\parallel, T_f). \qquad (8)$$
In this way the problem depends only on the dimension $n_a$ of the known subspace and does not depend on the dimensions $m \geq n_a$ and $n > n_a$. From (8) we can deduce the equation of the $(n_a - 2)$-dimensional boundary of an $(n_a - 1)$-ball to which the vector $f_a = A_a x_a$ must belong. In the following we discuss the various cases.

3.1.1. Case $n_a = 1$

In this case, we have one unique solution when both conditions on $I_f$ and $N_f$ are imposed. When only one of the two is imposed, two solutions are found, as shown in Figure 1a,c. Figure 1b shows the intensity ratio $I_f$.

3.1.2. Case $n_a = 2$

Consider the vectors $\hat{f}_a, \hat{f}^\parallel \in \mathbb{R}^{n_a = 2}$ as defined previously; in particular we are looking for $\hat{f}_a = [\xi_1, \xi_2] \in \mathbb{R}^2$. Hence, conditions (8) can be written as
$$\begin{cases} \xi_1^2 + \xi_2^2 = N_f^2 \\[2pt] (\xi_1 - \hat{f}^\parallel_{\xi_1})^2 + (\xi_2 - \hat{f}^\parallel_{\xi_2})^2 = T_f^2 \end{cases} \implies F : \; (\hat{f}^\parallel_{\xi_1})^2 - 2 \hat{f}^\parallel_{\xi_1} \xi_1 + (\hat{f}^\parallel_{\xi_2})^2 - 2 \hat{f}^\parallel_{\xi_2} \xi_2 = T_f^2 - N_f^2, \qquad (9)$$
where the equation on the right defines the $(n_a - 1) = 1$-dimensional subspace (line) $F$, obtained by subtracting the first equation from the second. This subspace has to be intersected with one of the initial circumferences to obtain the feasible vectors $\hat{f}_a$, as can be seen in Figure 2a and its projection on $\mathcal{A}_a$ in Figure 2b. The intersection of the two circumferences (5) can have a different number of solutions depending on the value of $(N_f - \| f^\parallel \|) - T_f$: when this value is strictly positive there are no solutions, which means that the estimates of $I_f$ and $N_f$ are not correct (we are not interested in this case because we suppose the two values to be sufficiently well estimated); when the value is strictly negative there are two solutions, which coincide when the value is zero.
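In the case $n_a = 2$ the feasible vectors can be computed explicitly with elementary geometry. A sketch (function and variable names are ours, with assumed input values) that intersects the circle $\|\hat{f}_a\| = N_f$ with the circle $\|\hat{f}_a - \hat{f}^\parallel\| = T_f$:

import numpy as np

def circle_intersections(f_hat_par, N_f, T_f):
    # Feasible f_hat_a of conditions (8) for n_a = 2 (0, 1 or 2 points).
    d = np.linalg.norm(f_hat_par)            # distance between the two centres
    a = (N_f**2 - T_f**2 + d**2) / (2 * d)   # abscissa of the chord along f_hat_par
    h2 = N_f**2 - a**2                       # squared half-length of the chord
    if h2 < 0:
        return []                            # inconsistent estimates of N_f, T_f
    e = f_hat_par / d                        # unit vector towards f_hat_par
    e_perp = np.array([-e[1], e[0]])         # its rotation by 90 degrees
    return [e * a + np.sqrt(h2) * e_perp, e * a - np.sqrt(h2) * e_perp]

print(circle_intersections(np.array([2.0, 0.0]), 1.5, 1.0))   # two solutions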
When there are two solutions, we do not have sufficient information to determine which of the two is the true one, i.e., the one that gives $f_a = A_a \bar{x}_a$: we cannot choose the one with minimum residual, nor the vector $f_a$ that forms the minimum angle with $f$, because both solutions share the same values of these two quantities. However, since we are supposing the linear system to originate from an input/output system, where the matrix $A_a$ is a function also of the input and $f$ collects the measurements of the output, we can perform two tests with different inputs. Since all the solution sets contain the true parameter vector, we can determine the true solution from their intersection, unless the solutions of the two tests are coincident. The condition for coincidence is expressed in Lemma 3.
Let us call $A_{a,i} \in \mathbb{R}^{m \times n_a}$ the matrix of test $i = 1, 2$, to which corresponds a vector $f_i$. The line on which the two feasible vectors $f_a$ of test $i$ lie is $F_i$, and $S_i = A_{a,i}^\dagger F_i$ is the line through the two solution points. To have two tests with non-coincident solutions, we need these two lines $S_1, S_2$ to have no more than one common point, which in the case $n_a = 2$ is equivalent to $S_1 \neq S_2$, i.e., $A_{a,1}^\dagger F_1 \neq A_{a,2}^\dagger F_2$, i.e., $F_1 \neq A_{a,1} A_{a,2}^\dagger F_2 =: F_{12}$. We represent the lines $F_i$ by means of their orthogonal vectors from the origin, $f_{ort,i} = l_{ort,i} \frac{f_i^\parallel}{\| f_i^\parallel \|}$. We introduce the matrices $C_a, C_f, C_{fp}$ such that $A_{a,2} = C_a A_{a,1}$, $f_2 = C_f f_1$, $f_2^\parallel = C_{fp} f_1^\parallel$, and the scalar $k_f$ such that $\| f_2^\parallel \| = k_f \| f_1^\parallel \|$.
Lemma 3.
Consider two tests $i = 1, 2$ from the same system with $n_a = 2$, with the above notation. Then it holds $F_1 = F_{12}$ if and only if $C_a = C_{fp}$.
Proof. 
From the relation $f_i^\parallel = P_{\mathcal{A}_{a,i}}(f_i) = A_{a,i}(A_{a,i}^T A_{a,i})^{-1} A_{a,i}^T f_i$, we have
$$f_2^\parallel = A_{a,2}(A_{a,2}^T A_{a,2})^{-1} A_{a,2}^T f_2 = C_a A_{a,1}(A_{a,1}^T C_a^T C_a A_{a,1})^{-1} A_{a,1}^T C_a^T C_f f_1. \qquad (10)$$
It holds $F_1 = F_{12} \iff f_{ort,1} = f_{ort,12} := A_{a,1} A_{a,2}^\dagger f_{ort,2}$; hence we will show this second equivalence. We note that $l_{ort,2} = k_f l_{ort,1}$ and calculate
$$f_{ort,12} = A_{a,1} A_{a,2}^\dagger f_{ort,2} = A_{a,1} A_{a,1}^\dagger C_a^\dagger \, l_{ort,2} \frac{f_2^\parallel}{\| f_2^\parallel \|} = A_{a,1} A_{a,1}^\dagger C_a^\dagger \, \frac{k_f l_{ort,1} C_{fp} f_1^\parallel}{k_f \| f_1^\parallel \|} = A_{a,1} A_{a,1}^\dagger C_a^\dagger C_{fp} f_{ort,1}. \qquad (11)$$
Now let us call $s_{ort,1}$ the vector such that $f_{ort,1} = A_{a,1} s_{ort,1}$; then, using the fact that $C_a = C_{fp}$, we obtain
$$f_{ort,12} = A_{a,1} A_{a,1}^\dagger C_a^\dagger C_{fp} A_{a,1} s_{ort,1} = A_{a,1} (A_{a,1}^\dagger A_{a,1}) s_{ort,1} = A_{a,1} s_{ort,1} \quad (\text{since } A_{a,1}^\dagger A_{a,1} = I_{n_a}). \qquad (12)$$
Hence we have $F_{12} = F_1 \iff A_{a,1} A_{a,1}^\dagger C_a^\dagger C_{fp} f_{ort,1} = f_{ort,1} \iff C_a^\dagger C_{fp} = I$, i.e., $C_a = C_{fp}$. □

3.1.3. Case $n_a \geq 3$

More generally, for the case $n_a \geq 3$, consider the vectors $\hat{f}_a, \hat{f}^\parallel \in \mathbb{R}^{n_a}$ as defined previously; in particular we are looking for $\hat{f}_a = [\xi_1, \ldots, \xi_{n_a}] \in \mathbb{R}^{n_a}$. Conditions (8) can be written as
$$\begin{cases} \sum_{i=1}^{n_a} \xi_i^2 = N_f^2 \\[2pt] \sum_{i=1}^{n_a} (\xi_i - \hat{f}^\parallel_{\xi_i})^2 = T_f^2 \end{cases} \implies F : \; \sum_{i=1}^{n_a} \left( (\hat{f}^\parallel_{\xi_i})^2 - 2 \hat{f}^\parallel_{\xi_i} \xi_i \right) = T_f^2 - N_f^2, \qquad (13)$$
where the two equations on the left are two $(n_a - 1)$-spheres, i.e., the boundaries of two $n_a$-dimensional balls. Analogously to the case $n_a = 2$, their intersection can be empty, one point, or the boundary of an $(n_a - 1)$-dimensional ball (with the same conditions on $(N_f - \| f^\parallel \|) - T_f$). The equation on the right of (13) is the $(n_a - 1)$-dimensional subspace $F$ on which the boundary of the $(n_a - 1)$-dimensional ball of the feasible vectors $f_a$ lies, and it is obtained by subtracting the first equation from the second. Figure 3a shows the graphical representation of the decomposition $f = f_a + f_u$ for the case $n_a = 3$, and Figure 3b the solution ellipsoids of 3 tests whose intersection is one point. Figure 4a shows the solution hyperellipsoids of 4 tests whose intersection is one point, in the case $n_a = 4$.
We note that, to obtain one unique solution $x_a$, we must intersect the solutions of at least two tests. Let us give a more precise idea of what happens in general. Given $i = 1, \ldots, n_a$ tests we call, as in the previous case, $f_{ort,i}$ the vector orthogonal to the $(n_a - 1)$-dimensional subspace $F_i$ that contains the feasible $f_a$, and $S_i = A_{a,i}^\dagger F_i$. We project this subspace on $\mathcal{A}_{a,1}$ and obtain $F_{1i} = A_{a,1} A_{a,i}^\dagger F_i$, which we describe through its orthogonal vector $f_{ort,1i} = A_{a,1} A_{a,i}^\dagger f_{ort,i}$. If the vectors $f_{ort,1}, f_{ort,12}, \ldots, f_{ort,1 n_a}$ are linearly independent, the $(n_a - 1)$-dimensional subspaces $F_1, F_{12}, \ldots, F_{1 n_a}$ intersect in one point. Figure 4b shows an example in which, in the case $n_a = 3$, the vectors $f_{ort,1}, f_{ort,12}, f_{ort,13}$ are not linearly independent. The three solution sets of this example intersect in two points; hence, for $n_a = 3$, three tests are not always sufficient to determine a unique solution.
Lemma 4.
For all $n_a > 1$, given $i = 1, \ldots, n_a$ tests, the condition that the $n_a$ hyperplanes $S_i = A_{a,i}^\dagger F_i$ previously defined have linearly independent normal vectors is sufficient to determine one unique intersection, i.e., one unique solution vector $\bar{x}_a$ that satisfies the system of conditions (4) for each test.
Proof. 
The intersection of $n_a$ independent hyperplanes in $\mathbb{R}^{n_a}$ is a point. Given a test $i$, let $S_i = A_{a,i}^\dagger F_i$ be the affine subspace of that test,
$$S_i = v_i + W_i = \{ v_i + w \in \mathbb{R}^{n_a} : w \cdot n_i = 0 \} = \{ x \in \mathbb{R}^{n_a} : n_i^T (x - v_i) = 0 \},$$
where $n_i$ is the normal vector of the linear subspace $W_i$ and $v_i$ the translation with respect to the origin.
The conditions on the $S_i$ relative to the $n_a$ tests correspond to a linear system $A x = b$, where $n_i$ is the $i$-th row of $A$ and each component of the vector $b$ is given by $b_i = n_i^T v_i$. The matrix $A$ has full rank because of the linear-independence condition on the vectors $n_i$; hence the solution of the linear system is unique.
The unique intersection is due to the hypothesis of full column rank of the matrices $A_{a,i}$: this condition implies that the matrices $A_{a,i}^\dagger$ map the surfaces $F_i$ to hyperplanes $S_i = A_{a,i}^\dagger F_i$. □
For example, with $n_a = 2$ (Lemma 3) this condition amounts to considering two tests with non-coincident lines $S_1, S_2$, i.e., two non-coincident $F_1, F_{12}$.

3.2. The Case of Approximate Knowledge of $I_f$ and $N_f$ Values

Let us consider $N$ tests and call $I_{f,i}$, $N_{f,i}$ and $T_{f,i}$ the values defined in Lemma 2, relative to test $i$. Since the two systems of conditions
$$\begin{cases} N_{f,i} = \| A_{a,i} x_a \| \\[2pt] I_{f,i} = \dfrac{\| A_{a,i} x_a \|}{\| f_i - A_{a,i} x_a \|} \end{cases} \qquad \text{and} \qquad \begin{cases} N_{f,i} = \| A_{a,i} x_a \| \\[2pt] T_{f,i} = \| f_i^\parallel - A_{a,i} x_a \| \end{cases} \qquad (14)$$
are equivalent, as shown in Lemma 2, we will work with the system on the right for its simplicity: the equation on $T_{f,i}$ represents a hyperellipsoid translated with respect to the origin.
In a real application, we can assume to know only an interval that contains the true values of $I_f$ and, analogously, an interval for the $N_f$ values. Supposing we know the bounds on $I_f$ and $N_f$, the bounds on $T_f$ can be easily computed. Calling these extreme values $N_f^{max}, N_f^{min}, T_f^{max}, T_f^{min}$, we will assume it always holds
$$N_f^{max} \geq \max_i (N_{f,i}), \quad N_f^{min} \leq \min_i (N_{f,i}), \quad T_f^{max} \geq \max_i (T_{f,i}), \quad T_f^{min} \leq \min_i (T_{f,i}), \qquad (15)$$
for each test $i = 1, \ldots, N$ of the considered set.
Condition (4) is now relaxed as follows: the true solution $\bar{x}_a$ satisfies
$$\| A_{a,i} \bar{x}_a \| \leq N_f^{max}, \quad \| A_{a,i} \bar{x}_a \| \geq N_f^{min}, \quad \| A_{a,i} \bar{x}_a - f_i^\parallel \| \leq T_f^{max}, \quad \| A_{a,i} \bar{x}_a - f_i^\parallel \| \geq T_f^{min}, \qquad (16)$$
for each test $i = 1, \ldots, N$ of the considered set.
Assuming the extremes to be non-coincident ($N_f^{min} \neq N_f^{max}$ and $T_f^{min} \neq T_f^{max}$), these conditions do not define a single point, i.e., the unique solution $\bar{x}_a$ (as in (4) of Section 3.1), but an entire closed region of the space, possibly not even connected, which contains infinitely many possible solutions $x \neq \bar{x}_a$.
Figure 5 shows two examples, with $n_a = 2$, of the conditions for a single test: on the left in the case of exact knowledge of the $N_{f,i}$ and $T_{f,i}$ values, and on the right with the knowledge of two intervals containing the true values.
Given a single test, the conditions (16) on a point $x$ can be easily characterized. Given the condition
$$\| f_a \| = \| A_a x_a \| = N_f, \qquad (17)$$
we write $x_a = \sum_i \chi_i v_i$, with $v_i$ the vectors of the orthogonal basis given by the columns of $V$ in the SVD decomposition $A_a = U S V^T$. Then
$$f_a = A_a x_a = U S V^T \Big( \sum_i \chi_i v_i \Big) = U S \Big( \sum_i \chi_i e_i \Big) = U \Big( \sum_i s_i \chi_i e_i \Big) = \sum_i s_i \chi_i u_i.$$
Since the norm condition $\| f_a \|^2 = \sum_i (s_i \chi_i)^2 = N_f^2$ holds, we obtain the equation of the hyperellipsoid for $x_a$ as
$$\sum_i (s_i \chi_i)^2 = \sum_i \frac{\chi_i^2}{(1/s_i)^2} = N_f^2.$$
The bounded conditions hence give the region between the two hyperellipsoids centered at the origin:
$$\left( N_f^{min} \right)^2 \leq \sum_i \frac{\chi_i^2}{(1/s_i)^2} \leq \left( N_f^{max} \right)^2. \qquad (18)$$
Analogously, the $I_f$ condition gives the region between the two translated hyperellipsoids:
$$\left( T_f^{min} \right)^2 \leq \sum_i \left( s_i \chi_i - \hat{f}^\parallel_i \right)^2 \leq \left( T_f^{max} \right)^2, \qquad (19)$$
where the $\hat{f}^\parallel_i$ are the components of the vector $\hat{f}^\parallel$ defined in Section 3.1.
Given a test $i$, each of the conditions (18) and (19) constrains $\bar{x}_a$ to lie inside a thick hyperellipsoid, i.e., the region between two concentric hyperellipsoids. The intersection of these two conditions for test $i$ is a zero-residual region that we call $Z_{r_i}$:
$$Z_{r_i} = \{ x \in \mathbb{R}^{n_a} \mid (18) \text{ and } (19) \text{ hold} \}. \qquad (20)$$
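Since $\|A_a x\|$ and $\|A_a x - f^\parallel\|$ can be evaluated directly, membership in $Z_{r_i}$ needs no explicit SVD. A sketch of the test (names are ours; the interval bounds are assumed given):

import numpy as np

def in_Zr(x, A_a, f_par, N_min, N_max, T_min, T_max):
    # True if x satisfies conditions (18) and (19) for this test,
    # i.e., lies in both thick hyperellipsoids.
    N = np.linalg.norm(A_a @ x)
    T = np.linalg.norm(A_a @ x - f_par)
    return (N_min <= N <= N_max) and (T_min <= T <= T_max)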
It is easy to verify that if $N_{f,i}$ equals the assumed $N_f^{min}$ or $N_f^{max}$, or $T_{f,i}$ equals the assumed $T_f^{min}$ or $T_f^{max}$, the true solution lies on a border of the region $Z_{r_i}$; if this holds for both $N_{f,i}$ and $T_{f,i}$, it lies on a vertex.
When more tests $i = 1, \ldots, N$ are put together, we have to consider the points that belong to the intersection of all these regions $Z_{r_i}$, i.e.,
$$I_{zr} = \bigcap_{i = 1, \ldots, N} Z_{r_i}. \qquad (21)$$
These points minimize, with zero residual, the following optimization problem:
$$\min_x \sum_{i=1}^N \min\left( 0, \| A_{a,i} x \| - N_f^{min} \right)^2 + \sum_{i=1}^N \max\left( 0, \| A_{a,i} x \| - N_f^{max} \right)^2 + \sum_{i=1}^N \min\left( 0, \| A_{a,i} x - f_i^\parallel \| - T_f^{min} \right)^2 + \sum_{i=1}^N \max\left( 0, \| A_{a,i} x - f_i^\parallel \| - T_f^{max} \right)^2. \qquad (22)$$
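A sketch of the corresponding penalty residuals, written for a generic nonlinear least-squares solver (the data layout of tests is an assumption of ours):

import numpy as np

def uls_residuals(x, tests, N_min, N_max, T_min, T_max):
    # tests: list of pairs (A_a_i, f_par_i); four penalties per test,
    # all zero exactly when x belongs to I_zr, as in (22).
    res = []
    for A_a, f_par in tests:
        N = np.linalg.norm(A_a @ x)
        T = np.linalg.norm(A_a @ x - f_par)
        res += [min(0.0, N - N_min),    # below the inner N-hyperellipsoid
                max(0.0, N - N_max),    # outside the outer N-hyperellipsoid
                min(0.0, T - T_min),    # below the inner T-hyperellipsoid
                max(0.0, T - T_max)]    # outside the outer T-hyperellipsoid
    return np.asarray(res)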
It is also easy to verify that, if the true solution lies on an edge/vertex of one of the regions $Z_{r_i}$, it lies on an edge/vertex of their intersection.
The intersected region $I_{zr}$ tends to shrink monotonically in a way that depends on the properties of the added tests. We are interested in studying the conditions that make it reduce to a point, or at least to a small region. A sufficient condition to obtain a point is given in Theorem 1.
Let us first consider the function that, given a point in the space $\mathbb{R}^{n_a}$, returns the squared norm of its image through the matrix $A_a$:
$$N_f^2(x) = \| A_a x \|_2^2 = \| U \Sigma V^T x \|_2^2 = \| \Sigma V^T x \|_2^2 = (\Sigma V^T x)^T (\Sigma V^T x) = x^T (V \Sigma^T \Sigma V^T) x = \left\| \begin{bmatrix} \sigma_1 v_1^T x \\ \sigma_2 v_2^T x \\ \vdots \end{bmatrix} \right\|_2^2 = \sigma_1^2 (v_1^T x)^2 + \sigma_2^2 (v_2^T x)^2 + \cdots, \qquad (23)$$
where the $v_i$ are the columns of $V$ and $x = [x^{(1)}, x^{(2)}, \ldots, x^{(n_a)}]$.
The direction of maximum increase of this function is given by its gradient
$$\nabla N_f^2(x) = 2 (V \Sigma^2 V^T) x = \begin{bmatrix} 2 \sigma_1^2 v_1^T x \, v_1^{(1)} + 2 \sigma_2^2 v_2^T x \, v_2^{(1)} + \cdots + 2 \sigma_{n_a}^2 v_{n_a}^T x \, v_{n_a}^{(1)} \\ 2 \sigma_1^2 v_1^T x \, v_1^{(2)} + 2 \sigma_2^2 v_2^T x \, v_2^{(2)} + \cdots + 2 \sigma_{n_a}^2 v_{n_a}^T x \, v_{n_a}^{(2)} \\ \vdots \end{bmatrix}. \qquad (24)$$
Analogously, define the function $T_f^2(x)$ as
$$T_f^2(x) = \| A_a x - f^\parallel \|_2^2 = \| U \Sigma V^T x - f^\parallel \|_2^2 = \| \Sigma V^T x - \hat{f}^\parallel \|_2^2 = (\Sigma V^T x - \hat{f}^\parallel)^T (\Sigma V^T x - \hat{f}^\parallel) = x^T (V \Sigma^2 V^T) x - 2 x^T V \Sigma \hat{f}^\parallel + (\hat{f}^\parallel)^T (\hat{f}^\parallel) = \left\| \begin{bmatrix} \sigma_1 v_1^T x \\ \sigma_2 v_2^T x \\ \vdots \end{bmatrix} - \hat{f}^\parallel \right\|_2^2, \qquad (25)$$
with gradient
$$\nabla T_f^2(x) = 2 (V \Sigma^2 V^T) x - 2 V \Sigma \hat{f}^\parallel = \begin{bmatrix} 2 \sigma_1^2 v_1^T x \, v_1^{(1)} + \cdots + 2 \sigma_{n_a}^2 v_{n_a}^T x \, v_{n_a}^{(1)} - 2 \sum_i \sigma_i \hat{f}^{\parallel (i)} v_i^{(1)} \\ \vdots \\ 2 \sigma_1^2 v_1^T x \, v_1^{(j)} + \cdots + 2 \sigma_{n_a}^2 v_{n_a}^T x \, v_{n_a}^{(j)} - 2 \sum_i \sigma_i \hat{f}^{\parallel (i)} v_i^{(j)} \end{bmatrix}. \qquad (26)$$
Definition 2 (Upward/Downward Outgoing Gradients). Take a test $i$ and the functions $N_f^2(x)$ and $T_f^2(x)$ as in (23) and (25), with the gradient vectors $\nabla N_{f,i}(x)$, $\nabla T_{f,i}(x)$ of these two functions as in (24) and (26). Given the two extreme values $N_f^{min/max}$ and $T_f^{min/max}$ for each test, let us define:
  • the downward outgoing gradients as the set of negated gradients calculated at points on the minimum hyperellipsoid,
$$\{ -\nabla N_{f,i}(x) \mid N_{f,i}(x) = N_f^{min} \} \quad \text{and} \quad \{ -\nabla T_{f,i}(x) \mid T_{f,i}(x) = T_f^{min} \};$$
    they point inward, leaving the thick-hyperellipsoid region through its inner boundary;
  • the upward outgoing gradients as the set of gradients calculated at points on the maximum hyperellipsoid,
$$\{ \nabla N_{f,i}(x) \mid N_{f,i}(x) = N_f^{max} \} \quad \text{and} \quad \{ \nabla T_{f,i}(x) \mid T_{f,i}(x) = T_f^{max} \};$$
    they point outward, leaving the region through its outer boundary.
Note that the upward/downward outgoing gradient of the function $N_f^2(x)$ (or $T_f^2(x)$) at a point $x$ is the normal vector to the tangent plane of the hyperellipsoid on which the point lies. Moreover, these vectors point outward of the region defined by Equation (18) (respectively, (19)). Figure 6 shows an example of some upward/downward outgoing gradients of the function $N_f^2(x)$.
Theorem 1.
Given $N$ tests with values $I_{f,i}$ and $N_{f,i}$ in the closed intervals $[I_f^{min}, I_f^{max}]$ and $[N_f^{min}, N_f^{max}]$, take the set of all the upward/downward outgoing gradients of the functions $N_{f,i}^2(x)$ and $T_{f,i}^2(x)$ calculated at the true solution $\bar{x}_a$, i.e.,
$$\bigcup_{i=1}^{N} \{ \nabla N_{f,i}(\bar{x}_a) \mid N_{f,i}(\bar{x}_a) = N_f^{max} \} \cup \{ -\nabla N_{f,i}(\bar{x}_a) \mid N_{f,i}(\bar{x}_a) = N_f^{min} \} \cup \{ \nabla T_{f,i}(\bar{x}_a) \mid T_{f,i}(\bar{x}_a) = T_f^{max} \} \cup \{ -\nabla T_{f,i}(\bar{x}_a) \mid T_{f,i}(\bar{x}_a) = T_f^{min} \}. \qquad (27)$$
If there is at least one outgoing gradient of this set in each orthant of $\mathbb{R}^{n_a}$, then the intersection region $I_{zr}$ of Equation (21) reduces to a point.
Proof. 
We want to show that, given any perturbation $\delta x$ of the true solution $\bar{x}_a$, there exists at least one condition among (18) and (19) that is not satisfied by the perturbed point $\bar{x}_a + \delta x$.
Any sufficiently small perturbation $\delta x$ in an orthant in which an upward/downward outgoing gradient lies (from now on, "Gradient") determines an increase/decrease in the value of the hyperellipsoid function relative to that Gradient, which makes the corresponding condition unsatisfied.
Hence, if the Gradient in the considered orthant is upward, it satisfies $N_{f,i}(\bar{x}_a) = N_f^{max}$ (or analogously with $T_{f,i}$), and for each perturbation $\delta x$ in the same orthant we obtain
$$N_{f,i}(\bar{x}_a + \delta x) > N_{f,i}(\bar{x}_a) = N_f^{max} \qquad (28)$$
(or analogously with $T_{f,i}$). In the same way, if the Gradient is downward we obtain
$$N_{f,i}(\bar{x}_a + \delta x) < N_{f,i}(\bar{x}_a) = N_f^{min} \qquad (29)$$
(or analogously with $T_{f,i}$).
When one orthant contains more than one Gradient, more than one condition will be unsatisfied by the perturbed point $\bar{x}_a + \delta x$ for a sufficiently small $\delta x$ in that orthant. □
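The orthant condition of Theorem 1 can be checked numerically from the gradients (24) and (26). A sketch under our reconstruction of those formulas (using $\nabla \|A_a x\|^2 = 2 A_a^T A_a x$, which avoids the explicit SVD; all names are ours):

import itertools
import numpy as np

def outgoing_gradients(x, tests, N_min, N_max, T_min, T_max, tol=1e-8):
    # Collect the upward/downward outgoing gradients active at x.
    grads = []
    for A_a, f_par in tests:
        r = A_a @ x
        gN, gT = 2 * A_a.T @ r, 2 * A_a.T @ (r - f_par)
        N, T = np.linalg.norm(r), np.linalg.norm(r - f_par)
        if abs(N - N_max) < tol: grads.append(gN)     # upward, outer boundary
        if abs(N - N_min) < tol: grads.append(-gN)    # downward, inner boundary
        if abs(T - T_max) < tol: grads.append(gT)
        if abs(T - T_min) < tol: grads.append(-gT)
    return grads

def covers_all_orthants(grads, n_a):
    # True if every orthant of R^{n_a} contains at least one gradient
    # (a zero component is compatible with both half-spaces).
    signs = [np.sign(g) for g in grads]
    return all(any(np.all(s * np.array(orth) >= 0) for s in signs)
               for orth in itertools.product((-1, 1), repeat=n_a))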

4. Problem Solution

The theory previously presented allows us to build a solution algorithm that can deal with different levels of a-priori information. We will start in Section 4.1 with the ideal case, i.e., with exact knowledge of $I_f$ and $N_f$. Then we generalize to a more practical setting, where we suppose to know an interval that contains the $T_f$ values of all the experiments considered and an interval for the $N_f$ values; hence, the estimated solution will satisfy Equations (18) and (19). In this case we describe an algorithm for computing an estimate of the solution, which we will test in Section 5 against a toy model.

4.1. Exact Knowledge of $I_f$ and $N_f$

When the information about $I_f$ and $N_f$ is exact, with the minimum number of experiments indicated in Section 3 we can find the unbiased parameter estimate as the intersection $I_{zr}$ of the zero-residual sets $Z_{r_i}$ corresponding to each experiment. In principle this could be done also following the proof of Lemma 4, but the computation of the $v_i$ vectors is quite cumbersome. Since this is an ideal case, we solve it by simply imposing the satisfaction of the various $N_f$ and $T_f$ conditions (Equation (14)) as an optimization problem:
$$\min_x F(x) \quad \text{with} \quad F(x) = \sum_{i=1}^N \left( \| A_{a,i} x \| - N_{f,i} \right)^2 + \sum_{i=1}^N \left( \| A_{a,i} x - f_i^\parallel \| - T_{f,i} \right)^2. \qquad (30)$$
The solution of this problem is unique when the tests are sufficiently many and satisfy the conditions of Lemma 4.
This nonlinear least-squares problem can be solved using a general nonlinear optimization algorithm, such as the Gauss–Newton or Levenberg–Marquardt methods [8].
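A sketch of this exact-knowledge estimator with scipy.optimize.least_squares, the library used for the experiments in Section 5 (the tests container and the starting point are assumptions of ours):

import numpy as np
from scipy.optimize import least_squares

def exact_residuals(x, tests):
    # tests: list of tuples (A_a_i, f_par_i, N_f_i, T_f_i) with exact values;
    # the residuals of problem (30), two per test.
    return np.array([r for A_a, f_par, N_f, T_f in tests
                       for r in (np.linalg.norm(A_a @ x) - N_f,
                                 np.linalg.norm(A_a @ x - f_par) - T_f)])

# x0 can be, e.g., the (biased) ordinary least-squares estimate:
# sol = least_squares(exact_residuals, x0, args=(tests,))
# x_hat = sol.x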

4.2. Approximate Knowledge of $I_f$ and $N_f$

In practice, as already pointed out in Section 3.2, it is more realistic to know the two intervals that contain all the $N_{f,i}$ and $I_{f,i}$ values for each test $i$. Then we know that the region $I_{zr}$ also contains the exact unbiased parameter solution $\bar{x}_a$, which we want at least to approximate. We introduce here an Unbiased Least-Squares (ULS) algorithm (Algorithm 1) for the computation of this estimate.
Algorithm 1 An Unbiased Least-Squares (ULS) algorithm.
1: Given a number $n_{tests}$ of available tests, indexed from 1 to $n_{tests}$, and two intervals, $[I_f^{min}, I_f^{max}]$ and $[N_f^{min}, N_f^{max}]$, containing the $I_f$ and $N_f$ values of all tests.
2: At each iteration consider the tests indexed by the interval $[1, i_t]$; initially set $i_t = n_a$.
3: while $i_t \leq n_{tests}$ do
4:     (1) compute a solution with zero residual of problem (22) with a nonlinear least-squares optimization algorithm;
5:     (2) estimate the size of the zero-residual region as described below in (31);
6:     (3) increment by one the number $i_t$ of tests.
7: end while
8: Accept the final solution if the estimated region diameter is sufficiently small.
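A compact sketch of Algorithm 1, built on the penalty residuals uls_residuals of Section 3.2 (the acceptance test on the region diameter is left to the caller; names are ours):

import numpy as np
from scipy.optimize import least_squares

def uls_estimate(tests, N_min, N_max, T_min, T_max, x0, n_a):
    x = np.asarray(x0, dtype=float)
    for i_t in range(n_a, len(tests) + 1):       # steps 3-7: grow the test set
        sol = least_squares(uls_residuals, x,
                            args=(tests[:i_t], N_min, N_max, T_min, T_max))
        x = sol.x                                # a zero-residual point of (22)
        # here the size of the zero-residual region would be estimated via (31)
    return x                                     # step 8: accept if the region is small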
In general, the zero-residual region $Z_{r_i}$ of each test contains the true parameter vector, while the iterates of the local optimization usually start from a point outside this region and converge to a point on its boundary.
The ULS estimate can converge to the true solution in two cases:
  • the true solution lies on the border of the region $I_{zr}$ and the estimate reaches the border at that point;
  • the region $I_{zr}$ reduces to a size smaller than the required accuracy, or reduces to a point.
The size of the intersection $I_{zr}$ of the zero-residual regions $Z_{r_i}$ is estimated in the following way. Let us define an index, which we call the region shrinkage estimate, as
$$\hat{s}(x) = \min\left\{ n \;\middle|\; \forall\, \delta \in P : \; \Delta_{I_{zr}}(x + \mu^{-n} \delta) > 0 \right\}, \qquad (31)$$
where we used $\mu = 1.5$ in the experiments below, $P = \{ \delta \in \mathbb{R}^{n_a} \mid \delta^{(i)} \in \{-1, 0, 1\},\ i = 1, \ldots, n_a \}$, and $\Delta_{I_{zr}}$ is the indicator function of the set $I_{zr}$: the larger $\hat{s}(x)$, the smaller the region around $x$.
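A possible coding of (31); since the quantifier and the sign of the exponent are hard to recover exactly, this sketch follows the reading above (all probes of size $\mu^{-n}$ around $x$ must stay inside $I_{zr}$, so a large $\hat{s}$ means a small region):

import itertools
import numpy as np

def shrinkage_estimate(x, in_Izr, mu=1.5, n_max=60):
    # in_Izr(y) -> bool is the indicator of I_zr, e.g., built by requiring
    # the in_Zr test of Section 3.2 for every available test.
    P = [np.array(d) for d in itertools.product((-1, 0, 1), repeat=len(x))]
    for n in range(n_max + 1):
        if all(in_Izr(x + mu**(-n) * d) for d in P):
            return n              # region radius around x is roughly mu**(-n)
    return n_max                  # the region around x is extremely small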

5. Numerical Examples

Let us consider a classical application example, the equations of a DC motor with a mechanical load, where the electrical variables are governed by the ordinary differential equation
$$L \dot{I}(t) = -K \omega(t) - R I(t) + V(t) - f_u(t), \qquad I(t_0) = I_0, \qquad (32)$$
where $I$ is the motor current, $\omega$ the motor angular speed, $V$ the applied voltage, and $f_u(t)$ a possible unmodelled component
$$f_u(t) = m_{err} \cos\left( n_{poles} \, \theta(t) \right), \qquad (33)$$
where $n_{poles}$ is the number of poles of the motor, i.e., the number of windings or magnets [9], $m_{err}$ the magnitude of the model error, and $\theta$ the angle, given by the system
$$\dot{\theta}(t) = \omega(t), \qquad \theta(t_0) = \theta_0. \qquad (34)$$
Note that the unknown component $f_u$ of this example can be seen as a difference of potential that is not described by the approximated model. We are interested in the estimation of the parameters $[L, K, R]$. In our tests the true values were the constants $[L = 0.0035, K = 0.14, R = 0.53]$.
We suppose to know the measurements of $I$ and $\omega$ at equally spaced times $t_0, \ldots, t_{\bar{N}}$ with step $h$, such that $t_k = t_0 + k h$ and $t_{k+1} = t_k + h$. Figure 7 shows the plots of the motor speed $\omega$ and of the unknown component $f_u$ for this experiment.
We compute the approximation $\hat{\dot{I}}(t_k)$ of the derivative of the current signal with the first-order finite difference formula
$$\hat{\dot{I}}(t_k) = \frac{I(t_k) - I(t_{k-1})}{h}, \quad \text{for } t_k = t_1, \ldots, t_{\bar{N}},$$
with step $h = 4 \times 10^{-4}$. The applied voltage is held constant at the value $V(t) = 30.0$.
To obtain a more accurate estimate, or to allow higher step sizes $h$, finite differences of higher order can be used, for example the fourth-order difference formula
$$\hat{\dot{I}}(t_k) = \frac{I(t_k - 2h) - 8 I(t_k - h) + 8 I(t_k + h) - I(t_k + 2h)}{12 h}, \quad \text{for } t_k = t_2, \ldots, t_{\bar{N} - 2}.$$
With the choice of the finite difference formula, we obtain the discretized equations
$$L \hat{\dot{I}}(t_k) = -K \omega(t_k) - R I(t_k) + V(t_k) - f_u(t_k), \quad \text{for } t_k = t_1, \ldots, t_{\bar{N}}. \qquad (35)$$
We will show a possible implementation of the method explained in the previous sections and the results we get with this toy-model example. The comparison is made against standard least-squares. In particular, we will show that when the information about $I_f$ and $N_f$ is exact we have an exact removal of the bias; when this information is only approximate, which is common in a real application, we will show how the bias asymptotically disappears as the number of experiments increases.
We build each test taking Equation (35) for $n$ samples in the range $t_1, \ldots, t_{\bar{N}}$, obtaining the linear system
$$\begin{bmatrix} \hat{\dot{I}}(t_k) & \omega(t_k) & I(t_k) \\ \hat{\dot{I}}(t_{k+1}) & \omega(t_{k+1}) & I(t_{k+1}) \\ \vdots & \vdots & \vdots \\ \hat{\dot{I}}(t_{k+n}) & \omega(t_{k+n}) & I(t_{k+n}) \end{bmatrix} \begin{bmatrix} L \\ K \\ R \end{bmatrix} + \begin{bmatrix} f_u(t_k) \\ f_u(t_{k+1}) \\ \vdots \\ f_u(t_{k+n}) \end{bmatrix} = \begin{bmatrix} V(t_k) \\ V(t_{k+1}) \\ \vdots \\ V(t_{k+n}) \end{bmatrix}, \qquad (36)$$
so that the first matrix in the equation is $A_a \in \mathbb{R}^{n \times n_a}$, with $n_a = 3$ the number of parameters to be estimated.
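A sketch of how one such test can be assembled from sampled signals (I, omega, V are assumed NumPy arrays sampled with step h; the first-order difference of this section is used, so k must be at least 1):

import numpy as np

def build_test(I, omega, V, h, k, n):
    # Rows t_k .. t_{k+n} of the discretized Equation (35); x = [L, K, R].
    idx = np.arange(k, k + n + 1)
    I_dot = (I[idx] - I[idx - 1]) / h                   # first-order difference
    A_a = np.column_stack((I_dot, omega[idx], I[idx]))  # A_a, here (n+1) x 3
    b = V[idx]                                          # right-hand side of (36)
    return A_a, b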
To measure the relative estimation error $\hat{e}_{rel}$ we will use the following formula, where $\hat{x}_a$ is the parameter estimate:
$$\hat{e}_{rel} = \frac{1}{n_a} \sum_{i=1}^{n_a} \frac{\| \hat{x}_a^{(i)} - \bar{x}_a^{(i)} \|_2}{\| \bar{x}_a^{(i)} \|_2}. \qquad (37)$$
Note that the tests that we built in the numerical experiments below are simply small chunks of consecutive data, taken from one single simulation for each experiment.
The results have been obtained with a Python code developed by the authors, using NumPy for linear algebra computations and scipy.optimize for the nonlinear least-squares optimization.

5.1. Exact Knowledge of $I_f$ and $N_f$

As analyzed in Section 4.1, the solution of the minimization problem (30) is computed with a local optimization algorithm. Here the results show an error $\hat{e}_{rel}$ with an order of magnitude of $10^{-7}$ in every test we made. Note that it is also possible to construct the solution geometrically, with exact results.

5.2. Approximate Knowledge of $I_f$ and $N_f$

When $I_f$ and $N_f$ are known only approximately, i.e., we know only an interval containing all the $I_f$ values and an interval containing all the $N_f$ values, we lose the unique intersection of Lemma 4, which would require only $n_a$ tests. Moreover, with a finite number of tests we cannot guarantee in general that the exact hypotheses of Theorem 1 are satisfied. As a consequence, various issues open up. Let us start by showing in Figure 8 that when all four conditions of (15) hold with equality, the true solution lies on the boundary of the region $I_{zr}$, as already mentioned in Section 3.2. If this happens, then under the conditions of Theorem 1 on the upward/downward outgoing gradients the region $I_{zr}$ is a point. When all four conditions of (15) hold with strict inequalities, the true solution lies inside the region $I_{zr}$ (Figure 8b). From a theoretical point of view this distinction is very important, since it determines whether the zero-residual region can or cannot be reduced to a single point. From a practical point of view it becomes less important, since we cannot guarantee that the available tests will reduce $I_{zr}$ exactly to a single point, and most of the time we will arrive at an approximate estimate. This can be more or less accurate, depending on the specific application, which is out of the scope of the present work.
To be more precise, when the conditions of Theorem 1 are not satisfied there is an entire region of the parameter space that satisfies problem (30) exactly, but only one point of this region is the true solution $\bar{x}_a$. As more tests are added and intersected, the zero-residual region $I_{zr}$ tends to shrink, simply because it must satisfy an increasing number of inequalities. In Figure 9 we can see four iterations taken from an example, precisely with 3, 5, 9 and 20 tests intersected and $m_{err} = 19$. With only three tests (Figure 9a) there is a big region $I_{zr}$ (described by the mesh of small dots), and here the true solution (thick point) and the current estimate (star) happen to stay on opposite sides of the region. With five tests (Figure 9b) the region has shrunk considerably and the estimate is reaching the boundary (in the plot it is still half-way), and even more so with nine tests (Figure 9c). Convergence arrives here before the region collapses to a single point because, accidentally, the estimate has approached the region boundary at the same point where the true solution is located.
In general, the zero-residual region Z r i (20) of each test contains the true solution, while the estimate arrives from outside the region and stops when it bumps the border of the intersection region I z r (21). For this reason we have convergence when the region that contains the true solution is reduced to a single point, and the current estimate x ^ a does not lie in a disconnected sub-region of I z r different from the one in which the true solution lies. Figure 10 shows an example of an intersection region I z r which is the union of two closed disconnected regions: this case creates a local minimum in problem (30).
In Figure 11 we see the differences $N_f^{max} - N_f^{min}$ and $T_f^{max} - T_f^{min}$ vs. $m_{err}$. The differences are bigger for higher values of the model error; this seems to be the cause of a more frequent creation of local minima.
Figure 12 synthesizes the main results that we have obtained with this new approach. Globally it shows a great reduction of the bias contained in the standard least-squares estimates; indeed, we had to use a logarithmic scale to highlight the differences in the behaviour of the proposed method while varying $m_{err}$. In particular,
  • with considerable levels of modelling error, say $m_{err}$ between 2 and 12, the parameter estimation error $\hat{e}_{rel}$ is at least one order of magnitude smaller than that of least-squares; this is accompanied by high levels of shrinkage of the zero-residual region (Figure 12b);
  • with higher levels of $m_{err}$, we see a low shrinkage of the zero-residual region and consequently an estimate whose error oscillates strongly, depending on where the optimization algorithm has brought it into contact with the zero-residual region;
  • at $m_{err} = 18$ we see the presence of a local minimum, due to the splitting of the zero-residual region as in Figure 10: the shrinkage at the true solution is estimated to be very high, while at the estimated solution it is quite low, since the latter is attached to a disconnected, wider sub-region;
  • the shrinking of the zero-residual region is related to the distribution of the outgoing gradients, as stated by Theorem 1: in Figure 12d we see that in the experiment with $m_{err} = 18$ they occupy only three of the eight orthants, while in the best results of the other experiments the gradients distribute themselves over almost all orthants (not shown).
It is evident from these results that for lower values of the modelling error $m_{err}$ it is much easier to produce tests that reduce the zero-residual region to a quite small subset of $\mathbb{R}^{n_a}$, while for high values of $m_{err}$ it is much more difficult, and the region $I_{zr}$ can even split into pieces, thus creating local minima. It is also evident that a simple estimate of the $I_{zr}$ region size, like (31), can reliably assess the quality of the estimate produced by the approach proposed here, as summarized in Figure 12c.

6. Conclusions

In this paper we have analyzed the bias that commonly arises in parameter estimation problems when the model lacks some deterministic part of the system. This result is useful in applications where an accurate estimation of parameters is important, e.g., in physical (grey-box) modelling, typically arising in the model-based design of multi-physical systems; see, e.g., the motivations that the authors experienced in the design of digital twins of controlled systems [10,11,12] for virtual prototyping, among a huge literature.
At this point, the method should be tested in a variety of applications, since the ULS approach proposed here is not applicable as a black box the way least-squares is: indeed, it requires some additional a-priori information. Moreover, since the computational complexity of the method presented here is significant, efficient computational methods must be considered and will be a major issue in future investigations.
Another aspect worth deepening is the possibility to design tests that contribute optimally to the reduction of the zero-residual region.

Author Contributions

Conceptualization, methodology, validation, formal analysis, investigation, software, resources, data curation, writing—original draft preparation, writing—review and editing, visualization: M.G. and F.M.; supervision, project administration, funding acquisition: F.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the project DOR1983079/19 from the University of Padova and by the doctoral grant “Calcolo ad alte prestazioni per il Model Based Design” from Electrolux Italia s.p.a.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Björck, A. Numerical Methods for Least Squares Problems; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 1996.
  2. Van Huffel, S.; Markovsky, I.; Vaccaro, R.J.; Söderström, T. Total least squares and errors-in-variables modeling. Signal Process. 2007, 87, 2281–2282.
  3. Söderström, T.; Soverini, U.; Mahata, K. Perspectives on errors-in-variables estimation for dynamic systems. Signal Process. 2002, 82, 1139–1154.
  4. Van Huffel, S.; Vandewalle, J. The Total Least Squares Problem: Computational Aspects and Analysis; Frontiers in Applied Mathematics 9; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 1991.
  5. Peck, C.C.; Beal, S.L.; Sheiner, L.B.; Nichols, A.I. Extended least squares nonlinear regression: A possible solution to the “choice of weights” problem in analysis of individual pharmacokinetic data. J. Pharmacokinet. Biopharm. 1984, 12, 545–558.
  6. Meyer, C.D. Matrix Analysis and Applied Linear Algebra; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2000.
  7. Hansen, P.C. Oblique projections and standard-form transformations for discrete inverse problems. Numer. Linear Algebra Appl. 2013, 20, 250–258.
  8. Nocedal, J.; Wright, S. Numerical Optimization; Springer: Berlin, Germany, 1999.
  9. Krause, P.C. Analysis of Electric Machinery; McGraw Hill: New York, NY, USA, 1986.
  10. Beghi, A.; Marcuzzi, F.; Rampazzo, M.; Virgulin, M. Enhancing the Simulation-Centric Design of Cyber-Physical and Multi-physics Systems through Co-simulation. In Proceedings of the 2014 17th Euromicro Conference on Digital System Design, Verona, Italy, 27–29 August 2014; pp. 687–690.
  11. Beghi, A.; Marcuzzi, F.; Rampazzo, M. A Virtual Laboratory for the Prototyping of Cyber-Physical Systems. IFAC-PapersOnLine 2016, 49, 63–68.
  12. Beghi, A.; Marcuzzi, F.; Martin, P.; Tinazzi, F.; Zigliotto, M. Virtual prototyping of embedded control software in mechatronic systems: A case study. Mechatronics 2017, 43, 99–111.
Figure 1. Case $n_a = 1$. (a): case $n_a = 1$, $m = n = 2$, solutions with the condition on $N_f$; in the figure: the true decomposition obtained by imposing both conditions (blue), the orthogonal decomposition (red), and another possible decomposition (green) that satisfies the same norm condition $N_f$ but a different $I_f$. (b): case $n_a = 1$, intensity ratio value w.r.t. the norm of the vector $A_a x_a$; given a fixed value of the intensity ratio there can be two solutions, i.e., two possible decompositions of $f$ as a sum of two vectors with the same intensity ratio. (c): case $n_a = 1$, $m = n = 2$, solutions with the condition on $I_f$; in the figure: the true decomposition obtained by imposing both conditions (blue), the orthogonal decomposition (red), and another possible decomposition (green) with the same intensity ratio $I_f$ but a different $N_f$.
Figure 2. Case $n_a = 2$. (a): case $n_a = 2$, $m = n = 3$, with $A_a x_a = [A_a^{(1)} \; A_a^{(2)}][x_a^{(1)} \; x_a^{(2)}]^T$; in the figure: the true decomposition (blue), the orthogonal decomposition (red), and another possible decomposition among the infinitely many (green). (b): case $n_a = 2$, $m = n = 3$, projection of the two circumferences on the subspace $\mathcal{A}_a$, and projections of the possible decompositions of $f$ (red, blue and green).
Figure 3. Case $n_a = 3$. (a): case $n_a = 3$, $m = n = 4$, $n - n_a = 1$; in the picture, $f^\parallel$ is the projection of $f$ on $\mathcal{A}_a$. The decompositions that satisfy the conditions on $I_f$ and $N_f$ are those with $f_a$ lying on the red circumference on the left. The spheres determined by the conditions are shown in yellow for the vector $f_a$ and in blue for the vector $f_u$. Two feasible decompositions are shown in blue and green. (b): case $n_a = 3$, intersection of three hyperellipsoids, the solution sets $x_a$ of three different tests, in the space $\mathbb{R}^{n_a = 3}$.
Figure 4. Case $n_a \geq 3$. (a): case $n_a = 4$, intersection of four hyperellipsoids, the solution sets $x_a$ of four different tests, in the space $\mathbb{R}^{n_a = 4}$. (b): case $n_a = 3$, example of three tests for which the solution sets have an intersection bigger than one single point; the three $(n_a - 1)$-dimensional subspaces $F_1, F_{12}, F_{13}$ in the space generated by $A_{a,1}$ intersect in a line and their three orthogonal vectors are not linearly independent.
Figure 5. Examples of the exact and approximated conditions on a test with $n_a = 2$. In the left figure the two black ellipsoids are the two constraints of the right system of (14), while in the right figure the two pairs of concentric ellipsoids are the borders of the thick ellipsoids defined by (16), and the blue region $Z_{r_i}$ is the intersection of (18) and (19). The black dot in both figures is the true solution. (a): exact conditions on $N_f$ and $T_f$; (b): approximated conditions on $N_f$ and $T_f$.
Figure 6. Some upward/downward outgoing gradients: the internal blue ones are downward outgoing gradients calculated at points $x$ on the internal ellipsoid with $N_{f,i}(x) = N_f^{min}$, while the external red ones are upward outgoing gradients calculated at points $x$ on the external ellipsoid with $N_{f,i}(x) = N_f^{max}$.
Figure 7. The plots of (a) $\omega(t)$ and (b) $f_u(t)$ in the experiment.
Figure 8. Two examples of (zero-residual) intersection regions $I_{zr} \subset \mathbb{R}^3$ with different locations of the true solution: inside the region or on its border. For graphical reasons the region has been discretized and the dots are the grid nodes; the bigger ball (thick point) is the true solution. (a): the true solution (ball) is on the border of $I_{zr}$; (b): the true solution (ball) is internal to $I_{zr}$.
Figure 9. The intersection region $I_{zr} \subset \mathbb{R}^3$ for different numbers of tests involved. For graphical reasons the region has been discretized and the dots are the grid nodes; the bigger ball is the true solution and the star is the current estimate in the experiment. (a) 3 tests; (b) 5 tests; (c) 9 tests; (d) 20 tests.
Figure 10. The intersection region $I_{zr} \subset \mathbb{R}^3$ for different numbers of tests involved. On the left, a few tests have created a single connected region while, on the right, adding more tests has split it into two subregions. For graphical reasons the region has been discretized and the dots are the grid nodes; the bigger ball is the true solution and the star is the current estimate in the experiment. (a) A (portion of a) connected region $I_{zr}$; (b) a region $I_{zr}$ split into two disconnected subregions.
Figure 11. The three plots show the values assumed by the extreme values (15) as a function of $m_{err}$. (a): $\{ I_f^{min}, I_f^{max} \}$ vs. $m_{err}$; (b): $\{ N_f^{min}, N_f^{max} \}$ vs. $m_{err}$; (c): $\{ T_f^{min}, T_f^{max} \}$ vs. $m_{err}$.
Figure 12. The plots summarize the results obtained by the ULS approach to parameter estimation on the model problem explained at the beginning of this section. (a): the relative estimation error (37) vs. $m_{err}$; (b): the $I_{zr}$ region shrinkage estimate (31) vs. $m_{err}$; (c): the relative estimation error (37) vs. the estimate of the $I_{zr}$ region shrinkage, considering the experiments with $m_{err} \in [2, 20]$; (d): a three-dimensional view of the outgoing gradients at the last iteration of the experiment with $m_{err} = 18$.
