Weighted Total Least Squares (WTLS) Solutions for Straight Line Fitting to 3D Point Data

In this contribution the fitting of a straight line to 3D point data is considered, with Cartesian coordinates xi, yi, zi as observations subject to random errors. A direct solution for the case of equally weighted and uncorrelated coordinate components was already presented almost forty years ago. For more general weighting cases, iterative algorithms, e.g., by means of an iteratively linearized Gauss–Helmert (GH) model, have been proposed in the literature. In this investigation, a new direct solution for the case of pointwise weights is derived. In the terminology of total least squares (TLS), this solution is a direct weighted total least squares (WTLS) approach. For the most general weighting case, considering a full dispersion matrix of the observations that can even be singular to some extent, a new iterative solution based on the ordinary iteration method is developed. The latter is a new iterative WTLS algorithm, since no linearization of the problem by Taylor series is performed at any step. Using a numerical example it is demonstrated how the newly developed WTLS approaches can be applied for 3D straight line fitting considering different weighting cases. The solutions are compared with results from the literature and with those obtained from an iteratively linearized GH model.


Introduction
Modern geodetic instruments, such as terrestrial laser scanners, provide the user directly with 3D coordinates in a Cartesian coordinate system. However, in most cases these 3D point data are not the final result. For an analysis of the recorded data or for a representation using computer-aided design (CAD), a line, curve or surface approximation with a continuous mathematical function is required.
In this contribution the fitting of a spatial straight line is discussed considering the coordinate components x i , y i , z i of each point P i as observations subject to random errors, which results in a nonlinear adjustment problem. An elegant direct least squares solution for the case of equally weighted and uncorrelated observations has already been presented in 1982 by Jovičić et al. [1]. Unfortunately, this article was very rarely cited in subsequent publications, which is probably due to the fact that it was written in Croatian language. Similar least squares solutions, direct as well, have been published by Kahn [2] and Drixler ([3], pp. 46-47) some years later. In these contributions, it was shown that the problem of fitting a straight line to 3D point data can be transformed into an eigenvalue problem. An iterative least squares solution for fitting a straight line to equally weighted and uncorrelated 3D points has been presented by Späth [4], by minimizing the sum of squared orthogonal distances of the observed points to the requested straight line.
For more general weighting schemes iterative least squares solutions have been presented by Kupferer [5] and Snow and Schaffrin [6]. In both contributions various nonlinear functional models were introduced and tested and the Gauss-Newton approach has been employed for an iterative least squares solution, which involves the linearization of the functional model to solve the adjustment problem within the Gauss-Markov (GM) or the Gauss-Helmert (GH) models. The linearization of originally nonlinear functional models is a very popular approach in adjustment calculation, where the solution is then determined by iteratively linearized GM or GH models, see e.g., the textbooks by Ghilani [7], Niemeier [8] or Perovic [9]. Pitfalls to be avoided in the iterative adjustment of nonlinear problems have been pointed out by Pope [10] and Lenzmann and Lenzmann [11].
Fitting a straight line to 3D point data can also be considered as an adjustment problem of type total least squares (TLS) for an errors-in-variables (EIV) model, as already pointed out by Snow and Schaffrin [6]. For some weighting cases, problems expressed within the EIV model can have a direct solution using singular value decomposition (SVD). This solution is known under the name Total Least Squares (TLS) and has been firstly proposed by Golub and Van Loan [12], Van Huffel and Vandewalle [13] or Van Huffel and Vandewalle ( [14], p. 33 ff.). The TLS solution is obtained by computing the roots of a polynomial, i.e., by solving the characteristic equation of the eigenvalues, which is identical in the case of fitting a straight line to 3D point data to the solutions of Jovičić et al. [1], Kahn [2] and Drixler ([3], pp. 46-47). This is something that has been observed by Malissiovas et al. [15], where a relationship has been presented between TLS and the direct least squares solution of the same adjustment problem, while postulating uncorrelated and equally weighted observations.
To involve more general weight matrices in the adjustment procedure, iterative algorithms have been presented in the TLS literature without linearizing the underlying problem by Taylor series at any step of the solution process. These are algorithmic approaches known as weighted total least squares (WTLS), presented e.g., by Schaffrin and Wieser [16], Shen et al. [17] or Amiri and Jazaeri [18]. A good overview of such algorithms, as well as alternative solution strategies can be found in the dissertation theses of Snow [19] and Malissiovas [20]. An attempt to find a WTLS solution for straight line fitting to 3D point data was made by Guo et al. [21].
To avoid confusion, it is to clarify that the terms TLS and WTLS refer to algorithmic approaches for obtaining a least squares solution, which is either direct or iterative but without linearizing the problem by Taylor series at any step. This follows the statement of Snow ([19], p. 7), that "the terms total least-squares (TLS) and TLS solution [. . . ] will mean the least-squares solution within the EIV model without linearization". Of course, a solution within the GH model is more general in the sense that it can be utilized to solve any nonlinear adjustment problem, while TLS and WTLS algorithms can treat only a certain class of nonlinear adjustment problems. This has been firstly discussed by Neitzel and Petrovic [22] and Neitzel [23], who showed that the TLS estimate within an EIV model can be identified as a special case of the method of least squares within an iteratively linearized GH model.
To the extent of our knowledge, a WTLS algorithm for fitting a straight line to 3D point data has not been presented yet. Therefore, in this study we derive two novel WTLS algorithms for the discussed adjustment problem considering two different cases of stochastic models: (i) pointwise weights, i.e., coordinate components with same precision for each point and no correlations between them, (ii) general weights, i.e., correlated coordinate components of individual precision including singular dispersion matrices.
The adjustment problem resulting from case (i) can still be solved directly, i.e., a direct WTLS solution, presented in Section 2.1 of this paper. For case (ii) an iterative solution without linearizing the problem by Taylor series is derived in Section 2.2, i.e., an iterative WTLS solution. Both solutions are based on the work of Malissiovas [20], where similar algorithms have been presented for the solution of other typical geodetic tasks, such as straight line fitting to 2D point data, plane fitting to 3D point data and 2D similarity coordinate transformation. The WTLS solution for straight line fitting to 3D point data will be derived from a geodetic point of view by means of introducing residuals for all observations, formulating appropriate condition and constraint equations, setting up and solving the resulting normal equation systems.

Straight Line Fitting to 3D Point Data
A straight line in 3D space can be expressed by as explained e.g., in the handbook of mathematics by Bronhstein et al. ( [24], p. 217). This equation defines a line that passes through a point with coordinates x 0 , y 0 and z 0 and is parallel to a direction vector with components a, b and c. Since the number of unknown parameters is six, two additional constraints between the unknown parameters have to be taken into account for a solution, as Snow and Schaffrin [6] have clearly explained. Proper constraints will be selected at a later point of the derivations.
Considering an overdetermined configuration for which we want to obtain a least squares solution, we introduce for all points P i residuals v xi , v y i , v zi for the observations x i , y i , z i assuming that they are normally distributed with v ∼ (0, σ 2 0 Q LL ). Here the 1 × 3n vector v contains the residuals v xi , v y i , v zi , Q LL is the 3n × 3n corresponding cofactor matrix and σ 2 0 the theoretical variance factor. Based on (1), the nonlinear conditions equations can be formulated for each point P i , with i = 1, . . . , n and n being the number of observed points. A functional model for this problem can be expressed by two of these three nonlinear condition equations per observed point. Using all three condition equations for solving the problem would lead to a singular normal matrix due to linearly dependent normal equations.

Direct Total Least Squares Solution for Equally Weighted and Uncorrelated Observations
Considering all coordinate components x i , y i , z i for all points P i as equally weighted and uncorrelated observations with a least squares solution can be derived by minimizing the objective function A direct least squares solution of this problem, respectively a TLS solution, has been presented by Malissiovas et al. [15] and Malissiovas [20]. According to these investigations, equally weighted residuals correspond to the normal distances as deviation measures between the observed 3D point cloud and the straight line to be computed. Thus, the objective function can be written equivalently as An expression for the squared normal distances can be found in the handbook of Bronhstein et al. ( [24], p. 218). An appropriate parameterization of the problem involves the substitution of the unknown parameters y 0 , x 0 and z 0 with the coordinates of the center of mass of the observed 3D point data. A proof for this parameter replacement has been given by Jovičić et al. [1] and concerns only the case of equally weighted and uncorrelated observations. A solution for the line parameters can be computed by minimizing the objective function (6), under the constraint or by searching for stationary points of the Lagrange function with k denoting the Lagrange multiplier. The line parameters can be computed directly from the normal equations, either by solving a characteristic cubic equation or an eigenvalue problem. A detailed explanation of this approach was given by Malissiovas ([20], p. 74 ff.). In the following subsections we will give up the restriction of equally weighted and uncorrelated coordinate components and derive least squares solutions considering more general weighting schemes for the observations.

Direct Weighted Total Least Squares Solution
In this case we consider the coordinate components x i , y i , z i of each point P i to be uncorrelated and of equal precision with yielding the corresponding (pointwise) weights A least squares solution of the problem can be found by minimizing the objective function Taking into account the relation of the squared residuals with the normal distances of Equation (5), it is possible to express the objective function as Using the constraint (8), the expression of the normal distances can be written as A further simplification of the problem is possible, by replacing the unknown parameters y 0 , x 0 and z 0 with the coordinates of the weighted center of mass of the 3D point data It can be proven that this parameterization is allowed and possible, following the same line of thinking as in the study of Jovičić et al. [1]. Therefore, for this weighted case we can write the normal distances as with the coordinates reduced to the weighted center of mass of the observed 3D point data Searching for stationary points of the Lagrange function leads to the system of normal equations ∂K ∂c and Equations (19) to (21) are a homogeneous system of equations which are linear in the unknown line parameters a, b and c, and has a nontrivial solution if with the elements of the determinant Equation (23) is a cubic characteristic equation with the unknown parameter k and a solution is given by Bronhstein et al. ( [24], p. 63). The unknown line parameters a, b and c can be computed either by substituting k min into Equations (19)-(21) under the constraint (22) or by transforming the equation system and solving an eigenvalue problem, i.e., the direct WTLS solution.

Iterative Weighted Total Least Squares Solution
In this case, we consider the fact that the coordinate components x i , y i , z i of each point P i are not original observations. They are either (i) derived from single point determinations, e.g., using polar elementary measurements of slope distance ρ i , direction φ i and tilt angle θ i , e.g., from measurements with a terrestrial laser scanner, so that the coordinates result from the functional relationship Using the standard deviations σ ρ i , σ φ i and σ θ i of the elementary measurements, a 3 × 3 variance-covariance matrix can be provided for the coordinate components x i , y i , z i of each point P i by variance-covariance propagation based on (25). Covariances among the n different points do not occur in this case. or (ii) derived from a least squares adjustment of an overdetermined geodetic network. From the solution within the GM or GH model, the 3n × 3n variance-covariance matrix of the coordinate components x i , y i , z i of all points P i can be obtained. This matrix is a full matrix, considering also covariances among the n different points. It may even have a rank deficiency in case the coordinates were determined by a free network adjustment.
The variances and covariances from (i) or (ii) can then be assembled in a dispersion matrix for the coordinate components as "derived observations". In case of a non-singular dispersion matrix, it is possible to compute the respective weight matrix Starting with the functional model (2), we choose to work with the nonlinear condition equations Two additional constraints must be taken into account in this case. For example, Snow and Schaffrin [6] proposed the constraints A further development of an algorithmic solution using these constraints is possible. However, we choose to parameterize the functional model so that an additional constraint is unnecessary and the equations become simpler. Therefore, we choose to fix one coordinate of the point on the line as well as one of the unknown components of the direction vector This simplification of the functional model has been used by Borovička et al. [25] for a similar practical example, this of estimating the trajectory of a meteor. As already mentioned in that work, the resulting direction vector of the straight line can be transformed to a unit vector afterwards, i.e., a solution using the constraint a 2 + b 2 + c 2 = 1. The simplified functional model can be written in vector notation as with the vectors x c , y c and z c containing the coordinates of the observed points and vectors v x , v y and v z the respective residuals. Vector e is a vector of ones, with length equal to the number of observed points n.
A weighted least squares solution of this problem can be derived by minimizing the objective function or by searching for stationary points of the Lagrange function with two distinct vectors k 1 and k 2 of Lagrange multipliers. Taking the partial derivatives with respect to all unknowns and setting them to zero results in the system of normal equations To derive explicit expressions for the residual vectors, we write Equations (35)-(37) as    P xx P xy P xz P yx P yy P yz P zx P zy P zz which leads to the solution for the residual vectors Inserting these expressions for the residual vectors into Equations (38) and (39) yields with the auxiliary matrices W 1 , W 2 and W 3 as Using Equations (49), (50) and (40)-(43), the nonlinear equation system can be set up. To express this system of equations into a block matrix form and to obtain a symmetrical matrix of normal equations in the following, it is advantageous to add the terms −a (v z − ez) and −b (v z − ez) to both sides of the first two equations, leading to the system of equations Arranging the unknowns in a vector the equation system (55) can be written as with the matrix of normal equations constructed using the matrices and The right-hand side of the equation system (57) reads It is important to point out that matrix N and vector n contain the unknown parameters a, b and v z in their entries. Therefore, to express and solve these normal equations that live in the "nonlinear world" with the help of vectors and matrices (that only exist in the "linear world"), appropriate approximate values a 0 , b 0 and v 0 z have to be introduced for all auxiliary matrices required for the setup of matrix N and vector n. The solution for the vector of unknowns can be computed bŷ without the need of a linearization by Taylor series at any step of the calculation process. The WTLS solution for the line parameters can be computed following the ordinary iteration method, as it is explained for example by Bronhstein ( [24], p. 896). Therefore, the solutionsâ,b andṽ z , after stripping them of their random character, are to be substituted as new approximate values as long as necessary until a sensibly chosen break-off condition is met. For example, the maximum absolute difference between two consecutive solutions must be smaller than a predefined threshold , which in this problem can be formulated as with |·| denoting the absolute value. The predicted residual vectorsṽ x ,ṽ y ,ṽ z can be computed from Equations (46)-(48). The iterative procedure for the presented WTLS solution can be found in Algorithm 1.

Algorithm 1 Iterative WTLS solution
Choose approximate values for a 0 , b 0 and v 0 z . Define parameter c = 1. Set threshold for the break-off condition of the iteration process. Set parameter d a = d b = ∞, for entering the iteration process.
with ||·|| denoting the Euclidean norm. The derived parameters a n , b n and c n refer to the normalized components of a unit vector, that is parallel to the requested line, with a 2 n + b 2 n + c 2 n = 1 (65)

WTLS Solution with Singular Dispersion Matrices
The algorithmic approach presented in Section 2.3 can also cover cases when dispersion matrices are singular. Such a solution depends on the inversion of matrix N (58), which depends on the rank deficiency of matrix W (59). Following the argumentation of Malissiovas ([20], p. 112), a criterion that ensures a unique solution of the problem can be described in this case by with rank of W ≤ 2n, with 2n = number of condition equations, since for each of the observed n points two condition equations from Equation (2) are taken into account; -rank of A = m, with m = number of unknown parameters.
In cases of singular dispersion matrices, the rank of matrix W will be smaller than 2n. A unique solution will exist if the rank of the augmented matrix [W | A] is equal to the number of condition equations 2n of the problem. It is important to mention that the developed idea is based on the Neitzel-Schaffrin criterion, which has been firstly proposed by Neitzel and Schaffrin [26,27], particularly for a solution of an adjustment problem within the GH model when singular dispersion matrices must be employed.

A Posteriori Error Estimation
In this section we want to determine the variance-covariance matrix of the estimated parameters. The following derivations can be used for computing the a posteriori stochastic information for all weighted cases discussed in this investigation, i.e., the direct and the iterative WTLS solutions. Therefore, we will employ the fundamental idea of variance-covariance propagation. This is a standard procedure explained by many authors, like for example in the textbooks of Wells and Krakiwsky ( [28], p. 20), Mikhail ([29], p. 76 ff.) or Niemeier ([8], p. 51 ff.) and has been further employed in the GM and the GH model, so that the a posteriori stochastic results can be computed using directly the developed matrices from each model. A detailed explanation is given by Malissiovas ([20], p. 31 ff.).
As we have already mentioned in the introduction of this article, a TLS, respectively a WTLS solution, can be regarded as a special case of a least squares solution within the GH model. From the presented WTLS algorithm we observe that the derived matrix of normal Equation (58) is equal to the matrix if it was computed within the GH model. Therefore, it is possible to compute and extract the dispersion matrix for the unknown parameters from The a posteriori variance factor isσ with vector v holding all residuals and r denoting the redundancy of the adjustment problem. In case of a singular dispersion matrix, it is not possible to compute the weight matrix P, as in Equation (27). Therefore, we can make use of the solution for the residual vectors from Equation (45) and insert them in Equation (69) to obtainσ The variance-covariance matrix of the unknown parameters can be then derived by To derive the variance-covariance matrix of the normalized vector components (64), we can explicitly write the equations a n =â â 2 +b 2 + c 2 , b n =b â 2 +b 2 + c 2 , c n = c with c = 1. Following the standard procedure of the variance-covariance propagation in nonlinear cases, we can write the Jacobian matrix Taking into account the variances and covariances of the line parametersâ andb from (71) we can compute the variance-covariance matrix of the normalized components Σ a n b n c n = FΣâbF T =    σ 2 a n σ a n b n σ a n c n σ b n a n σ 2 b n σ b n c n σ c n a n σ c n b n σ 2

Numerical Examples
In this section we present the solutions for fitting a straight line to 3D point data using the TLS approach from Section 2.1 and the newly developed WTLS approaches from Sections 2.2 and 2.3. The utilized dataset for this investigaiton consists of n = 50 points and originates from the work of Petras and Podlubny [30]. It has been utilized also by Snow and Schaffrin, see Table A1 in [6], for a solution within the GH model, which will be used here for validating the results of the presented WTLS solutions. Three different stochastic models will be imposed in the following:

1.
equal weights, i.e., coordinate components x i , y i , z i as equally weighted and uncorrelated observations, 2.
pointwise weights, i.e., coordinate components with same precision for each point and without correlations, 3.
general weights, i.e., correlated coordinate components of individual precision including singular dispersion matrices.

Equal Weights
For the first case under investigation, we consider all coordinate components x i , y i , z i as equally weighted and uncorrelated observations, yielding the weights as shown in (3). The least squares solution of this problem within the GH model, presented by Snow and Schaffrin [6], is listed in Table 1. Table 1. Least squares solution within the Gauss-Helmert (GH) model of Snow and Schaffrin [6].

Line Orientation Components Standard Deviation
Parameterb y = a n 0.219309 0.077523 Parameterb  Table 2. Numerically equal results have been derived by using the direct WTLS approach of Section 2.2 by setting all weights equal to one. Table 2. Direct TLS solution from Section 2.1.

Line Orientation Components Standard Deviation
Parameter a n 0. Comparing the solution with the one presented in Table 1, it can be concluded that the numerical results for the parameters coincide within the specified decimal places. Regarding the small difference in the numerical value for the variance factor, it is to be noted that the value in Table 2 was confirmed by two independent computations.
Furthermore, a point on the line can be easily computed using the equations of the functional model (2), as long as the direction vector parallel to the requested line is known. Alternatively, all the adjusted points will lie on the requested straight line, which can be simply computed by adding the computed residuals to the measured coordinates.

Pointwise Weights
For the second weighted case under investigation, we consider the coordinate components x i , y i , z i of each point P i to be uncorrelated and of equal precision. From the standard deviations listed in Table 3, the corresponding pointwise weights can be obtained from (11). The direct WTLS solution for the line orientation components is shown in Table 4. The presented results are numerically equal to the iterative WTLS solution using the algorithmic approach of Section 2.3, as well as the solution within the GH model.

General Weights
For the last weighted case in this investigation, we impose the most general case, i.e., correlated coordinate components with individual precision resulting in a singular dispersion matrix. To obtain such a matrix for our numerical investigations, we firstly solved the adjustment problem within the GH model with an identity matrix as dispersion matrix. From the resulting 150 × 150 dispersion matrix of the residuals computed as e.g., presented by Malissiovas ([20], p. 46), we take the variances and covariances between the individual point coordinates, but not among the points, i.e., the diagonal elements of each sub-matrix in (78) and arrange them in a new 150 × 150 matrix with "diag()" denoting that only the diagonal elements are taken into account. This is an example of pointwise variances and covariances, described as case (i) in Section 2.3, but now yielding a singular dispersion matrix for the observations with rank (Q LL ) = 100.
Before deriving an iterative WTLS solution for this weighted case, we must check if the criterion (66) for a unique solution of the adjustment problem is fulfilled. Therefore, we computed the 100 × 100 matrix W with rank (W) = 100, and the 100 × 4 matrix A with rank (A) = 4.
The criterion ensures that a unique solution exists when using the presented singular dispersion matrix, while since n = 50 observed points are used in this example, cf. (66). As for all iterative procedures, appropriate starting values for the unknowns must be provided. However, they can be obtained easily by first generating a direct solution with a simplified stochastic model. The iterative WTLS solution for the direction vector of the requested straight line is presented in Table 5. The presented WTLS solution has been found to be numerically equal to the least squares solution within the GH model. Detailed numerical investigations of the convergence behaviour, e.g., in comparison to an adjustment within the GH model, are beyond the scope of this article. However, in many numerical examples it could be observed that the iterative WTLS approach showed a faster convergence rate compared to an adjustment within an iteratively linearized GH model.

Conclusions
For the problem of straight line fitting to 3D point data, two novel WTLS algorithms for two individual weighting schemes have been presented in this study: -Direct WTLS solution for the case of pointwise weights, i.e., coordinate components with same precision for each point and without correlations, -Iterative WTLS solution for the case of general weights, i.e., correlated coordinate components of individual precision including singular dispersion matrices. This algorithm works without linearizing the problem by Taylor series at any step of the solution process.
Both approaches are based on the work of Malissiovas [20], where similar algorithms have been presented for adjustment problems that belong to the same class, i.e., nonlinear adjustments that can be expressed within the EIV model. The approach presented in Section 2.1 provides a direct TLS solution assuming equally weighted and uncorrelated coordinate components. The fact that this assumption is inappropriate, e.g. for the evaluation of laser scanning data, has often been accepted in the past to provide a direct solution for large data sets. With the newly developed approach in Section 2.2 it is now possible to compute a direct WTLS solution at least for a more realistic stochastic model, namely pointwise weighting schemes.
If more general weight matrices must be taken into account in the stochastic model, including correlations or singular dispersion matrices, the presented algorithm of Section 2.3 can be utilized for an iterative solution without linearizing the problem by Taylor series at any step, following the algorithmic idea of WTLS. A criterion that ensures a unique solution of the problem when employing singular dispersion matrices has also been presented, which is based on the original ideas of Neitzel and Schaffrin [26,27], for a solution within the GH model.
Numerical examples have been presented in Section 3 for testing the presented WTLS algorithms. The utilized dataset of the observed 3D point data has also been employed by Snow and Schaffrin [6] for a solution of the problem within the GH model and originates from the study of Petras and Podlubny [30]. The results of the presented algorithms have been compared in all cases with existing solutions or the solutions coming from existing algorithms and have been found to be numerically equal.

Conflicts of Interest:
The authors declare no conflict of interest.