Deformation Analysis Using B-Spline Surface with Correlated Terrestrial Laser Scanner Observations—A Bridge Under Load

: The choice of an appropriate metric is mandatory to perform deformation analysis between two point clouds (PC)—the distance has to be trustworthy and, simultaneously, robust against measurement noise, which may be correlated and heteroscedastic. The Hausdor ﬀ distance (HD) or its averaged derivation (AHD) are widely used to compute local distances between two PC and are implemented in nearly all commercial software. Unfortunately, they are a ﬀ ected by measurement noise, particularly when correlations are present. In this contribution, we focus on terrestrial laser scanner (TLS) observations and assess the impact of neglecting correlations on the distance computation when a mathematical approximation is performed. The results of the simulations are extended to real observations from a bridge under load. Highly accurate laser tracker (LT) measurements were available for this experiment: they allow the comparison of the HD and AHD between two raw PC or between their mathematical approximations regarding reference values. Based on these results, we determine which distance is better suited in the case of heteroscedastic and correlated TLS observations for local deformation analysis. Finally, we set up a novel bootstrap testing procedure for this distance when the PC are approximated with B-spline surfaces.


Introduction
Computing the distance between two objects is an important task in domains such as shape registration [1], shape approximation and simplification [2] or pattern recognition [3]. In the field of engineering geodesy, the distance between objects recorded at different times allows the estimation of deformation magnitudes [4] and their associated risks (see, e.g., Reference [5] for bridges, Reference [6] for dams and Reference [7] for risk management).
The raw point clouds (PC) from a static or kinematic terrestrial laser scanner (TLS) can be analyzed in commercial software. Provided that a registration of the PC is performed (e.g., Reference [8]), maps of deformation magnitudes are formed by building the difference between the PC recorded at two different epochs and allows visualization of the corresponding strength of deformation. The metric to compute the distance between PC is usually based on cloud to cloud (C2C), cloud to mesh (C2M) or mesh to mesh (M2M) strategies. The Multiscale Model to Model Cloud Comparison (M3C2), implemented in CloudCompare, is a possibility for assessing signed distances by smoothing the PC in a predefined zone [9]. The reader can refer to Reference [10] for a description of the different 1.
choice of the mathematical approximation of the PC. In the field of geodesy, the regression B-spline approximation, as introduced by Reference [20], allows great flexibility to model raw TLS observations: no predetermined geometric primitives, such as circles, planes or cylinders, restrict the fitting [21]. Other strategies exist, such as penalized splines [22] or patches splines [23]. They seem less suitable for applications with noisy and scattered observations from TLS PC: please refer to Reference [24] for a short review of the different methods); 2.
choice of the distance [25]. The distance chosen has to fulfil certain conditions, such as being robust against noise and outliers to ensure its trustworthiness, particularly when the objects are close to each other [26]. Furthermore, it should correspond to the problem under consideration, that is, shape recognition or image comparisons may require another definition than object matching applications [27]. When a complex object is modelled, maps that allow for a visualization of pointwise deformation magnitudes to detect changes are more relevant than a global measure of distance [28]. Distances based on the maximum norm of parametric representations may not estimate the real distance correctly [29] and cannot be applied to piecewise algebraic spline curves [30]. An alternative is the widely used Hausdorff distance (HD) to estimate either the distance between two raw PC or their B-spline approximations [31]. Unfortunately, the traditional HD only provides a global measure of the distance and is known to be sensitive to outliers. Alternatives were proposed, including the Hausdorff quantile [32], for close objects [26], for the specific case of B-spline curves [33], spatially coherent matching [34] or the averaged Hausdorff distance (AHD; [27]).
Parametric approximations of PC and mathematical definitions of distances are often associated with an increase of complexity. Fortunately, the additional effort involved to perform a deformation analysis with mathematical approximations of TLS PC rather than with the raw observations is worthwhile. In this contribution, we aim to convince a practitioner from it by answering the following three scientific questions: (1) Is a mathematical approximation of the noisy TLS PC beneficial for a trustworthy distance computation? (2) How does correlated noise affect the distance between mathematical surfaces? Which metric is better suited in the case of correlated observations? (3) Which specific statistical test has to be applied when testing for deformation based on a distance between mathematical approximations of TLS PC?
We will build our answers for (1) and (2) on both simulations and real data analyses and focus on local approximation. In a first step, we will simulate the PC of TLS raw observations in polar co-ordinates. They are known to be both heteroscedastic ( [12,35]) and correlated ( [15,36]). Thus, we will extend the stochastic model of TLS measurements using a separable covariance function to simulate correlated range observations ( [37,38]). We will use Monte Carlo (MC) simulations to generate random Remote Sens. 2020, 12, 829 3 of 23 vectors for the correlated noise by means of a Cholesky factorization. This allows us to analyze the impact of correlations on the HD-based distances in a general case, as well as to determine which mathematical distance is the most trustworthy in the case of correlations.
In a second step, we will confirm the simulation results using small surfaces of real data from a bridge under load. We will draw a parallel between the correlations and density reduction of the PC via gridding and show how gridding positively affects the distances computed with a mathematical approximation of the raw TLS observations. Reference deformation magnitudes are available from a pointwise laser tracker (LT) for the sake of comparison.
The estimation of the deformation magnitude itself is meaningless without assessing its significance, that is, answering the question (3): "can the null-hypothesis that no deformation occurs be rejected or not?". In this contribution, we propose a novel and specific test strategy based on the HD or AHD between mathematical approximations. Because the distribution of the test statistics is not tractable, we will combine MC simulations with a bootstrapping approach to validate a rigorous test procedure for testing deformation.
The remainder of the contribution is as follows: in Section 2, we introduce the mathematical approximation of curves and surfaces using B-splines basis functions, focusing on the regression B-splines. The HD and AHD will be presented. In Section 3, the theoretical derivations will be applied and validated by means of simulations of PC with different correlated noises for a null-deformation case. Section 4 is dedicated to a real case study of surface fitting using TLS observations from a bridge under load. A comparison between the deformation magnitudes obtained by gridding the raw PC and the values provided by the LT will highlight the potential of the AHD, provided that a local approximation is performed with a reduction in point density. The specific bootstrap testing procedure for the AHD is described in Appendix B.

B-Spline Curves
We start with the fundamental problem of having m observed points that we wish to approximate by a smooth curve thanks to an efficient and numerically stable method ( [20,39]). In this contribution, we will make use of the B-splines basis: it offers a powerful local control thanks to the control points (CP). We defined the corresponding curve C(x) as: where p i is the ith CP from a total of n. B i,d = B i,d,t the B-spline function of degree d depending on the non-decreasing sequence of real numbers , called knots. x is an independent variable, which is varied in the interval [t i , t i+1 ). B i,d is given by the recurrence relation [39]: Observations in real applications are the results of noisy measurements. We solve the approximation problem in least-square (LS) sense with the following minimization problem [20]: with C(x) being an element of S d,t , the linear space of all linear combinations of B-splines defined by S d,t = span B 1,d , . . . , B n,d . min means minimum, is the observation vector with x 1 < . . . < x m and w i the ith corresponding weight. In a more general matrix form, (2) is equivalent to finding the vector of length n of CP p = [p 1 , . . . , p n ] that solves the linear weighted LS problem: with · being the usual Euclidean distance. The elements of the matrix A ∈ R m,n are a i, The solution of (3) gives the estimated CP vectorp 0 by means of with · T for transpose. We have E p 0 = p, E(·) being the expectation operator. Since the true Σ is unknown, the feasible LS estimatorp =

The Parametrization of the Point Cloud
Parametric B-spline curves take values in R 2 and are defined by letting the CP be points in R 2 instead of real numbers. Intuitively, the parameter provides a measure of the time to travel along the curve and can be adapted to the data. Different methods can be used (e.g., uniform, cord length or centripetal parametrization; [40]). They all have shortcomings, which should not be underestimated for complicated geometries [41]. An exhaustive description of the parametrization is beyond the scope of the present paper.

The Number of CP
The CP build a control polygon: this is a rough scheme of the curve itself. Moving one CP influences the curve locally and not globally. The number of CP to estimate is linked to the length of the knot vector and can be iteratively adjusted. An optimal CP number should avoid the fitting of measurement noise. Information criteria (IC; [42]) are an alternative to heuristic methods and provide a useful tool to assess this optimal number. Two criteria are widely used-the Akaike information criterion (AIC), which minimizes the Kullback-Leibler divergence of the assumed model from the true, data-generating model or the Bayesian information criterion (BIC). The latter is based on the inherent assumption that the true model exists [43]. They are defined in their common versions as: where l p is the log-likelihood of the estimated parameters. Using this formulation, a minimum is searched.

B-Spline Surfaces
We construct parametric B-spline surfaces as tensor product surfaces depending on the B-spline functions. The approximation method for curves described in the previous section can be generalized to R 3 , provided that a suitable parametrisation (u, v) for the discrete data has been chosen. The parametric B-spline surface S (u,v) is expressed as are the knot vectors associated with the B-spline functions assumed to be of the same degree d, for the sake of simplicity. p ij is the CP vector in R 3 and n, r are the number of CP to estimate in the u and v direction, respectively. We define z as the value of the surface S (u,v) at (x, y). Without a lack of generality, we will skip the subscript (u, v) from now. The LS approximation method can be used for fitting surfaces to scattered and noisy data in R 3 . Due to the definition of the surface by means of a tensor product, the minimization problem is directly related to the univariate one and is only a generalization of the methodology described in Section 2.1.
We will restrict ourselves to cubic B-splines, that is, d = 3. They are considered to be optimal for approximating smooth objects without sharp edges and corners. The observations are often gridded in advance to avoid the problem of solving a large system of equations. In this contribution, both gridded real observations, non-gridded real data and simulated observations from a TLS will be used.

Deformation Analysis
We apply the previous theoretical development to TLS observations starting with two raw PC of the same object recorded at different epochs. The polar observations are transformed into Cartesian and approximated with B-spline surfaces following the methodology presented in 2.2. For the purpose of deformation analysis, we compute the distance-also called the "deformation magnitude"-between these two approximated surfaces S 1 and S 2 . In this section, we briefly discuss different methods of distance computation.

Suboptimal Intuitive Approaches
We introduce two intuitive approaches to compute the distance between surfaces:

•
The first one makes use a gridded PC and is defined as the difference between the z co-ordinates of S 1 and S 2 , where · is the Euclidian norm. z 1 i,j and z 2 i,j are the values of S 1 and S 2 at grid points x 1 i , y 1 i and x 2 i , y 2 i , respectively. We note that S 1 and S 2 may have been computed with different optimal numbers of CP, i.e., we may have n 1 n 2 , m 1 m 2 . Due to the gridding, D grid should only be used when the deformation can be assumed to be unidirectional (i.e., in the z-direction).
• A second idea is to define the distance at the parameter level between the estimated vectors of . Clearly, D CP is meaningless when n 1 n 2 , m 1 m 2 , since the size of the two control polygons differs.

The Hausdorff Distance
In order to overcome the drawbacks of the two intuitive aforementioned approaches, we propose to quantify the deformation magnitude between two parametric B-spline surfaces by computing Remote Sens. 2020, 12, 829 6 of 23 their HD (see, e.g., References [29][30][31]). First, we define the distance D po 1 , S 2 between a point po 1 = x po 1 y po 1 z po 1 T belonging to S 1 , and S 2 as D po 1 , S 2 = min po 2 ∈S 2 po 1 − po 2 , where po 2 = x po 2 y po 2 z po 2 T is any point of S 2 . From this definition, the HD between S 1 and S 2 is obtained by taking the maximum: It is convenient to introduce the symmetrical HD as D s (S 1 , S 2 ) = max(D(S 1 , S 2 ), D(S 2 , S 1 )). The computation of a one-sided distance leads potentially to an underestimated value [44]. Ref [30] show that the HD is related to the computation of binormal lines for parametric surfaces. These lines are defined as a normal line at both po 1,0 at parameter (u 0 , v 0 ) and po 2,0 . Thus, after having detected the so-called antipodal points po 1,0 and po 2,0 , the minimum of the distance can be easily computed. This distance is, therefore, independent of the number of CP used to approximate each data set.
We will denote by S 1 x j HD , y j HD the value z 1 j HD at the point j HD where the HD occurs in S 1 and similarly by S 2 x i HD , y i HD at point i HD of the occurring HD in S 2 . Note that i HD may differ from j HD .

The Averaged Hausdorff Distance
The maximum distance involved in (7) to compute the HD can be distorted by the noise of the observations and will not accurately reflect the global deformation between two objects. Rather than an HD, we propose to estimate the AHD, that is, an averaged value of the HD [24]. The AHD is based on the mean value of the distances defined as D mean (S 1 , S 2 ) = mean Our choice is justified by the fact that the AHD is known to be less sensitive to observation noise [27].

Note
Nearly all predefined distances in standard software for raw PC processing are based on the HD or the AHD, either with or without simplistically local mathematical approximations based on planes [45]. B-splines surface approximation allow a more general and detailed description of the PC than the local strategies used in conventional software.

Simulations
In this section, we will analyze the HD and AHD between two simulated noisy PC and their approximations. We wish to answer the first two questions raised in the introduction in a controlled framework. To that aim, we will investigate if a mathematical approximation is beneficial for a trustworthy distance computation and how the correlated noise affects the distance computation.
Our simulations are based on the generation of two "no deformation" sets of raw correlated and uncorrelated reference observations in polar co-ordinates. We describe the specific correlation model used to generate correlated TLS observations in Appendix A.

Generating Noisy Surfaces
We generate sample points from a given mathematical reference surface S simu . This latter is assumed to correspond to the probability density function of a two-dimensional normal distribution with a mean of µ = [ 5 5 ] T , T states for transpose and VCM of Σ s = 0.2 0 0 0.2 . The choice of

Σ
The set-up of noise Σ is mandatory to generate the noisy surfaces. We simulate three kinds of VCM with an increased degree of complexity: (i) Simple VCM:  (iii) Complex VCM degree 2: noise TC = Σ Σ , assuming, in addition to (ii), also temporally correlated polar observations. These matrices are further described for a better understanding in the next sections.

Case (ii)
We follow [38] build the diagonal VCM  We generate two realizations of the same surface S simu having the same stochastic properties. The methodology can be summarized as follows:

Note
Nearly all predefined distances in standard software for raw PC processing are based on the HD or the AHD, either with or without simplistically local mathematical approximations based on planes [46]. B-splines surface approximation allow a more general and detailed description of the PC than the local strategies used in conventional software.

Simulations
In this section, we will analyze the HD and AHD between two simulated noisy PC and their approximations. We wish to answer the first two questions raised in the introduction in a controlled framework. To that aim, we will investigate if a mathematical approximation is beneficial for a trustworthy distance computation and how the correlated noise affects the distance computation.
Our simulations are based on the generation of two "no deformation" sets of raw correlated and uncorrelated reference observations in polar co-ordinates. We describe the specific correlation model used to generate correlated TLS observations in Appendix 1.

Generating Noisy Surfaces
We generate sample points from a given mathematical reference surface simu S . This latter is assumed to correspond to the probability density function of a two-dimensional normal distribution with a mean of T states for transpose and VCM of The choice of the reference surface is justified to avoid oscillations of the approximation due to sharp edges or variations [47]. The x-and y-co-ordinates are considered to be uniformly sampled with a resolution of 0.5 in the interval   1 10 . The corresponding surface is shown in Figure 1 left.
We generate two realizations of the same surface simu S having the same stochastic properties.
The methodology can be summarized as follows: We add noise to the predefined grid surface points of G G [47]. We generate the vector We add noise to the predefined grid surface points of S simu to obtain a pointwise sorted gridded surface S simu1 : S simu1 = S simu + G T W noise,1 . N 1 = G T W noise,1 is called a realization of the noise, where G T is the lower triangular matrix of the Cholesky factorization of the noise VCM Σ noise , with Σ noise = G T G [47]. We generate the vector W noise,1 of the same size as S simu1 with the Matlab's random number generator through randn(0, 1), which is the realization of a normal distributed vector with a mean of 0 and a variance of 1. The surface S simu2 is built similarly using a second white noise vector W noise,2 . We call N 2 = G T W noise,2 the second realization of the noise.

Generating the Reference Noise VCM Σ noise
The set-up of Σ noise is mandatory to generate the noisy surfaces. We simulate three kinds of VCM with an increased degree of complexity: The Identity matrix I is scaled by a factor σ 2 ρ defined in the next section. (ii) Complex VCM degree 1: Σ noise = Σ MAC , assuming heteroscedasticity of the raw polar observations and mathematical correlations (MAC) due to the transformation to Cartesian coordinates in the B-spline approximation. (iii) Complex VCM degree 2: Σ noise = Σ TC , assuming, in addition to (ii), also temporally correlated polar observations. These matrices are further described for a better understanding in the next sections.

Case (ii)
We follow [37] build the diagonal VCM Σ noise,polar of the polar observations Σ noise,polar . We call ρ the range and VA and HA the vertical and horizontal angles, respectively. We assume a constant standard deviation (STD) of σ HA = σ VA = 2.5 mgon for both normally distributed angles. Two STDs σ ρ for the range are chosen to build Σ noise,ρ = σ 2 ρ I following the manufacturer's specifications of a Zoller+Fröhlich Imager 5006H TLS: (1) σ ρ,1 = 0.7 mm, which corresponds to an object observed at a close distance (<10 m) and (2) σ ρ,2 = 7 mm, for an object scanned at a distance greater than 25 m or under unfavorable scanning conditions. Starting from Σ noise,polar , we, furthermore, make use of the error propagation law to compute the VCM of the transformed Cartesian observations, that is, Σ noise = Σ MAC . This step is justified by the need to use Cartesian observations to compute the B-spline surface (Section 2). The same two range variances are used to scale the Identity matrix of case (i) for the sake of comparison between models. In these simulations and in the following case study, we will assume the range variance σ 2 ρ to be known; rough estimations of the range variance are available in real cases using the intensity model or manufacturer's specifications.

Case (iii)
As many effects (atmospheric, surface or sensor-based) can potentially act on correlating the range measurements, the assumption of heteroscedasticity only made in case (ii) is fairly unrealistic. Assessing the correlation structure of the TLS range with a general model is a complex task. An empirically-based method was proposed in Reference [36]: the residuals of a LS adjustment of a scanned plane were fitted by an exponential function. This function is known to have a substantial limitation in most geostatistical studies due to the small degree of smoothness of the covariance function [38]. Additionally, methods using an empirical fitting of the autocovariance function have severe drawbacks in the case of fractional noise. In this contribution, we follow [16], who model the correlation structure of Global Navigation Satellite System observations with a general Matérn covariance function [48]. An analogy drawn between TLS and Global Navigation Satellite System observations makes the application of this flexible function to describe the structure of TLS range correlation plausible [17]. The two parameters-smoothness and range-involved in the Matérn model are presented in Appendix A.
Our Assumptions

•
We model the correlation of the range as being temporal, that is, time-dependent. Range measurements are a measure of time [49]: any spatial effects stemming from the reflected surface can be included in the variance factor. This latter could exemplarily follow the physically plausible intensity model, as proposed in Reference [12].

•
The covariance function proposed is said to be separable, that is, it separates the temporal from the spatial effects [50]. We will here assume a temporal spacing of 1 s between the simulated observations.
Building the VCM We build the fully populated VCM Σ noise,ρ by associating the time label ti at which the measurement was made to each range observation (see Figure 1 right). In a first approach, we neglect the time taken by the sensor to go back to the second column and consider the first point of the second column to be equally spaced regarding the observations of the first column. Including this short time offset acts to decrease the correlations, that is, makes the results obtained closer to case (ii).
Finally, we build the fully populated pointwise sorted VCM of the range measurements from the vector of correlations. This VCM has a Toeplitz structure and is scaled so that the variance of the range ρ corresponds to the two cases described previously (see (ii)). Similar to case (ii), the VCM of the raw TLS observations is transformed by accounting for mathematical correlations. This leads to a fully populated VCM of the Cartesian co-ordinates Σ noise = Σ TC .  2] for which the correlations prevail for larger lags. We intentionally consider mean-squared differentiability of the Matérn covariance function at the origin (near ti = 0) by taking ν > 1 [38]. Taking ν < 1 imposes is a strong limitation, since the correlation length decreases sharply at the origin leading to sparse VCM and inverses. This effect decreases the impact of fully populated matrices in the LS adjustment and statistical tests [51]. The study of the temporal correlation structure of TLS range measurements is beyond the scope of this paper and will be done in a next contribution.

Approximated VCM in the LS Adjustment
The assumption that the true VCM of the raw observations is known is fairly unrealistic in real application. Thus, we propose to additionally assess the impact of an approximated VCM on the distance computation derived from the regression B-splines approximation. We gradually mis-specified the true VCM, as presented in Table 1. This Table has to be read as follows: for case (ii) the true VCM is Σ = Σ MAC and is simplified using the scaled identity matrixΣ = σ 2 ρ I. For case (iii) we simulate two steps of simplification: firstly, we neglect the temporal correlations (Σ = Σ MAC ) and, secondly, the mathematical correlations (Σ = σ 2 ρ I). or σ 2 ρ,2 .

Determining the Optimal Number of CP Using Information Criteria
We simulate a total of four PC for cases (i) and (ii) and four PC for case (iii) with two different correlation structures and range variances. We compute 100 runs of each simulation with an MC approach. One run here corresponds to the generation of two epochs simultaneously.
The mathematical modelization of the simulated PC is performed using the regression B-spline surface approximation developed in Section 2. The parametrization is made with the chord length method, which gives satisfactory results for regular and rectangular-shaped PC. The knot vector is determined using the method of Piegl and Tiller [40]. The optimal number of CP in the two directions is iteratively determined for each of the eight cases with the BIC and AIC approaches (see Section 2.1.2). Since the AIC gave the same results as BIC, the results are not shown for the sake of brevity.
The results given by the BIC are presented in Table 2 and are identical for each MC run. Correlated and Gaussian noise vectors lead to a different optimal number of CP.

Results
The means over all MC runs of D s (S simu1 , S simu2 ) and D s (S 1 , S 2 ) are calculated, as well as D s_ave (S simu1 , S simu2 ) and D s_ave (S 1 , S 2 ). These values correspond to the distance between the approximated surfaces or the distance between the raw PC. Additionally, the STDs of the series obtained from the 100 MC runs are given. The stochastic models used to approximate the surfaces are varied according to Table 1.
In Table 3, we intentionally choose to present only the results from the extreme case (iii), corresponding to a high correlation level and a high range variance. This is justified for the sake of the brevity and clarity of this contribution. The other results are deduced from this particular one and are summarized in a text form in the following paragraphs. We expect the HD and AHD between the mathematical approximations to be as close as possible to 0, since the simulated PC corresponds to a "non-deformation" case. Any discrepancy can be assigned to the LS solution itself when noisy observations with the wrong VCM are approximated. Additionally, the chosen distance may be inappropriate. The results of Table 3, as well as the one of the other simulations -described in text form-, are interpreted in this light.

Use of a Correct VCM
When we use the correct VCM to approximate the PC with the regression B-spline surface (Table 3, second line), the distances (HD or AHD) are close to 0. For case (iii) and σ ρ = 0.007 m, it reaches 0.0029 m for the AHD but a higher value of 0.0084 m for the HD. We find additionally 0.0011 versus 0.0012 m for σ ρ = 0.0007 m (STD 3 × 10 4 m) for AHD and HD, respectively. For case (ii) which corresponds to σ ρ = 0.007 m and a reference Σ noise = Σ MAC , the AHD reaches 0.0046 m and the HD 0.0232 m (STD 3 × 10 4 and 4 × 10 3 m respectively). This is a stronger difference compared to case (iii). For case (i) −σ ρ = 0.007 m, Σ noise = I-, the AHD reaches 0.1097 m and the HD 0.3534 m (STD 3 × 10 3 for both). Thus, we clearly see that the AHD gives values that are closer to 0 than the HD for all cases under consideration.
Use of An Approximated VCM As described in Section 3.3, we use approximated VCMs in the LS and distance computation ( Table 3, third and fourth line). In the latter case, we can distinguish that: • under correlated noise, the approximated VCM used in the LS computation affects the determination of both the HD and the AHD strongly: the difference ratio reaches for the case iii) more than 75% for the HD and 200% for the AHD. This result was found to hold true for all cases under consideration, that is, independently of the correlation structure and the variance factor. Thus, a correct stochastic model is unavoidable for a trustworthy distance. Exemplarily for case (iii) with [α, ν] = [0.01, 2] and σ ρ,1 = 0.007 m, the ratio of the difference between the approximated and the reference distance to the reference forΣ = Σ MAC orΣ = σ 2 ρ I reaches 200% for the AHD (Table 3). Decreasing the correlation length decreases the ratio: for case iii) and [α, ν] = [1,2], this latter is found only 7% smaller than the reference value when the VCM is mis-specified (Σ = Σ MAC orΣ = σ 2 ρ I). This result is found to be independent of the σ ρ chosen. For σ ρ,2 = 0.0007 m, the same ratio is 10% smaller: a small range variance impacts the distance computed with a mis-specified VCM less strongly.

•
When the observations are only MC, simplifying the stochastic model by neglecting the mathematical correlations, that is, takingΣ = σ 2 ρ I, did not affect the HD or the AHD significantly for σ ρ,2 = 0.0007 m. By increasing the range STD to σ ρ,1 = 0.007 m, the ratio for the AHD was increased by 15%. This result highlights the importance of accounting for mathematical correlations under unfavorable scanning conditions, that is, high range variance.
Independently of the case under consideration and the approximated VCM used, the AHD was always about four times smaller than the HD and, thus, closer to the expected 0 value. The AHD was less influenced by a wrong stochastic model than the HD, except for case iii) and σ ρ,2 = 0.0007 m. We attribute this finding to the averaging effect of the AHD.

Impact of the Simplified Stochastic Model on the HD and AHD: PC
When the distances are computed based on the raw observations, we still expect the AHD and the HD to be as close as possible to 0. Discrepancies are due to the noise introduced to generate the simulated PC and depend on the distance chosen.
The HD and AHD based on simulated PC have both higher values and STD compared with the values obtained with a mathematical modelization (see Section 3.5.1). This result is particularly significant when the PC noise is correlated. Indeed, a difference of up to 135% for the AHD could be obtained for case iii) and σ ρ,1 = 0.007 m ("PC," last line in Table 3). For case iii) and σ ρ,2 = 0.0007 m, we find a difference of 85% for the AHD and more than 400% for the HD. For case (ii), differences of 80% and 30% for the AHD and HD, respectively, are reached for both σ ρ,1 and σ ρ,2 . For case (i), we notice a difference of 22% versus 75% for the AHD and the HD, respectively. This finding highlights the filtering effect of surface approximations on the underlying PC noise. It is, thus, particularly advantageous to approximate the PC mathematically for distance computation in the case of correlations.
We note more generally that correlations also had a positive effect on the distance computation with raw observations. In case (iii), a decrease of both its value and STD regarding case (ii) or (i) was identified. Exemplarily, we find for σ ρ,2 = 0. When the raw observations are used, correlations act implicitly as a reduction of the available information, that is, similar to a point density reduction [9]. In real applications, they are related to the gridding of the raw observations, which reduces the number of observations of the PC. This implication is further developed in Section 4 with real observations from a bridge under load.

Statistical Testing for Deformation
In Appendix B, we propose a rigorous statistical test for the significance of the distance between mathematical approximations of TLS observations. We applied this derivation to the simulated PC. The H 0 that no deformation occurs was fortunately strongly supported for all cases under consideration. The test values varied between 0.3 and 1, whereas the smallest ones were obtained for the correlated cases (iii) under the assumption thatΣ = σ 2 ρ I. This highlights once more the importance of an adequate stochastic model, particularly in the presence of correlations, although the absolute pv values should only be overinterpreted [51].

Conclusions of the Simulations
Using the results of the simulations, we provide a first answer to the following questions: • How does correlated noise affect the distance? Which distance is better suited in the case of noisy observations?
Based on the results of the simulations and when the raw observations are used (PC), correlated noise affects the distance computation positively. It has a similar effect as a reduction of the observations available. When a mathematical approximation of the PC is performed, the best stochastic model should be used in the LS adjustment to assess a trustworthy distance. The impact becomes less pronounced by decreasing range variance and correlation length. The AHD is better suited than the HD for computing the distance between raw or approximated PC.

•
Why should we use a mathematical approximation of the noisy PC?
A mathematical approximation of the PC using, for example, B-spline surfaces is beneficial to assess a distance as close as possible to the reference value. Moreover, it allows the derivation of a rigorous statistical testing procedure based on the distance chosen, as developed in Appendix B and validated within a simulation framework.

Case Study
The previous simulations, for which the noise structure was known and controlled, have highlighted the impact of correlations on the HD and AHD. In this section, we propose to apply these derivations to a real case study.
We will analyze the HD and AHD computed with and without mathematical modelization of a real PC. We will compare the values with a reference one obtained with a more precise sensor: a pointwise LT. The rigorous statistical test procedure for deformation will be further applied.

A Bridge Under Load
We use real data from a bridge under load to assess the advantages of a mathematical modelization regarding processing the raw observations to estimate and test magnitudes of deformation.
The data set corresponds to a historic masonry arch bridge over the river Aller near Verden in Germany. Deformations were artificially generated by increasing load weights on specific parts of the bridge to simulate the impact induced, for example, by car traffic [52]. In the scope of the load testing the standard load of 1.0 MN (100 t) was defined and further loadings up to five-times the standard load were realized. Thus, a maximum load of approximately 6.0 MN was defined, produced by four hydraulic cylinders mounted on the arch. The TLS profiles were captured using a Zoller + Fröhlich Imager 5006H at a sampling rate of 500,000 points per second. In a pre-processing step, objects such as prisms were removed from the PC to achieve a clean dataset: this filtering with respect to objects on the arch surface eliminated interfering objects, that is, other sensor installations like prisms for the laser tracker and strain gauges. The first evaluation step of the 3D point clouds in post-processing was the referencing of the 3D point clouds in the coordinate system of the structure. The corresponding results are shown in Reference [52], Table 1. The mean standard deviation of the 3D points was 0.2 and maximum of 0.4 mm, which shows the quality of the referencing and guarantees at the same time a stable laser scanner position during the load test. In this contribution, we intentionally focus on the deformation between the reference PC without load (called E00 for epoch 0) and the PC corresponding to the maximum deformation occurring at the 5th epoch (E55). Further details about the experiment can be found in Reference [5], with a comparison for all load steps of the LT deformation magnitudes and the M3C2 distance. Figure 2 is a photograph of the bridge, with a localization of the two LT points under consideration. The TLS was positioned approximately in the middle of the bridge so that the parts under load could be optimally scanned at a short distance, that is, from 5 m in the up-direction. We aim to compare the HD and AHD with the deformation magnitude obtained with a highly accurate LT. As this latter measures a pointwise distance, we selected two small surfaces (quadratic patches) of 25 × 25 cm from the whole PC in the direct neighborhood of the two LT points L8 and L13. The zone around L8 was scanned with a less favorable geometry than the patch around L13 regarding We aim to compare the HD and AHD with the deformation magnitude obtained with a highly accurate LT. As this latter measures a pointwise distance, we selected two small surfaces (quadratic patches) of 25 × 25 cm from the whole PC in the direct neighborhood of the two LT points L8 and L13. The zone around L8 was scanned with a less favorable geometry than the patch around L13 regarding point density, incidence angle, footprint size and range (see Figure 2).
These two points were chosen intentionally due to: • their comparable and small deformation magnitudes of approximately 4 mm between step E00 and 55 around the reference LT point and • the two different scanning geometries.
Please note that we do not intend to make a systematic investigation of the impact of the geometry on the quantities of interest here and hence, no further indication will be given. Our goal in this contribution is to compare the HD and AHD with the LT values and validate a procedure for testing deformation. Further investigations are left to the next and other specific contributions.

Mathematical Modelling
A parameterization was carried out using a uniform method, which is justified by their relatively smooth and uncomplicated geometries, in order to mathematically approximate the small patches with B-spline surfaces. We chose an equidistant knot vector for the same reason. A B-spline approximation is preferred instead of a Gauss-Helmert Model [53], since the surfaces are not exactly planar and cannot be exactly approximated with an inclined plane.
Three strategies were adopted for the surface fitting to simplify the computation and reduce the point densities of the PC: • In a pre-processing step, the extracted PC were gridded, that is, the X-and Y-axis were each divided into ten steps. For each of the 100 cells, the means of the X, Y, Z values were computed to reduce the number of observations. The value of 10 was chosen as the highest one leading to the occurrence of at least one point in each cell.

•
The PC extracted were gridded similarly to (i) but the X-and Y-axis were divided into 5, which corresponds to 25 cells.

•
The whole PC were used without gridding, that is, no reduction of the PC point density was performed.
The BIC was used to compute the optimal number of CP. For (i) and (ii), the estimation of 4 CP in both directions was found to be optimal for the patches, whereas for (iii), 6 CP were estimated in both directions for L8 and L13, respectively.
The different mathematical surfaces obtained with and without gridding are shown in Figure 3. Figure 3 right highlights how the gridding of the PC (case (i)) affects the fitting by smoothing or filtering the PC. Figure 3 left shows more details of the surface as all available scanned points are used (case (iii)). values were computed to reduce the number of observations. The value of 10 was chosen as the highest one leading to the occurrence of at least one point in each cell. • The PC extracted were gridded similarly to (i) but the X-and Y-axis were divided into 5, which corresponds to 25 cells. • The whole PC were used without gridding, that is, no reduction of the PC point density was performed. The BIC was used to compute the optimal number of CP. For (i) and (ii), the estimation of 4 CP in both directions was found to be optimal for the patches, whereas for (iii), 6 CP were estimated in both directions for L8 and L13, respectively.
The different mathematical surfaces obtained with and without gridding are shown in Figure 3. Figure 3 right highlights how the gridding of the PC (case (i)) affects the fitting by smoothing or filtering the PC. Figure 3 left shows more details of the surface as all available scanned points are used (case (iii)). When a gridding of the PC is performed, the temporal correlations are lost: an averaging of the values within one cell is performed and the time matching becomes meaningless. We, therefore, make

Note on the Stochastic Model:
When a gridding of the PC is performed, the temporal correlations are lost: an averaging of the values within one cell is performed and the time matching becomes meaningless. We, therefore, make use of the simplified stochastic model corresponding to case (ii) and account only for heteroscedasticity and mathematical correlations in the LS adjustment. The intensity model is used to compute the range variance [37]. As expected from the scanning geometry and because the TLS was situated under the middle of the bridge (Figure 2), we obtained a large mean intensity of 1,557,500 Inc for L13, leading to σ L13 ρ ≈ 0.5 mm, whereas for L8, the mean of the intensity reached 358,900 Inc corresponding to σ L8 ρ ≈ 1 mm. For case (iii) (B-spline approximation without gridding), we chose intentionally to neglect correlations to compute the mathematical approximation of the PC. This is justified by the computational burden associated with fully populated VCM and relatively low impact on the distance for the range variance under consideration (less than the submm level, see 3.6.3).

Computation of the HD and AHD
The HD or AHD computed with the three gridding strategies are not expected to give similar values. Due to the smaller reduction of the PC point density, HD and AHD from the B-spline approximation (i) will be closer to the values obtained with the PC (case (iii)). Because the HD is a local distance measure, a stronger influence on unexpected local details is anticipated, particularly when few points are condensed in a grid cell. In the simulation section, we stated that correlations were acting to reduce the number of observations available and affected the distance computation with raw observations positively. This effect is similar to a gridding of the PC and allows us to conjecture that an optimal gridding exists leading to the reference value of the distance. The reference value corresponds here to the pointwise LT distance.
The mathematical approximations of the PC will lead to a distance closer to the reference one when optimal gridding of the raw PC is performed.
The corresponding results are presented in Table 4 and confirm these expectations. Both HD and AHD values are compared with the LT values (last column) for the four cases under consideration. In the case of gridded observations (Table 4, first line), the LT deformation magnitudes are closer to the AHD than to the HD. The HD (Table 4, third column) is higher than the AHD (Table 4, second column) in all cases.
A reduction of the PC to 16 values pro cell (L8 case (i)), Table 4, first line) leads logically to an AHD closer to the value obtained without mathematical approximation (PC). It is nearly 0.2 mm over the value given by the LT (4.29 mm versus 4.07 mm). We link this effect to the lower noise reduction regarding (ii).
A high point averaging corresponding to 300 PC points in a cell (L13 case (ii)) is associated with a low AHD. This latter is smaller by 0.23 mm compared with case (i) ( Table 4, third line). However, the difference is below the noise variance of the range and should not be overinterpreted. Similarly, we found an underestimated value of 3.8 mm for point L8 by averaging to 600 points per cell (not presented in Table 4). Thus, the loss of information due to a strong PC density reduction affects the AHD negatively when compared with the LT deformation magnitude.

No Gridding, Case (iii)
When the whole PC is used for surface fitting rather than a gridded version, the AHD are higher by up to 0.5 mm for L8 and 0.3 mm for L13 compared with the optimal values obtained with an approximated gridded PC (Table 4, second column). These values are below the noise variance of the range but show the effect of the PC smoothing on distance computation (e.g., Figure 3).
From Table 4 (third column), the HD for case (i) and (iii) are higher than for case (ii). This gives an additional argument in favor of the AHD, that is, the averaging decreases the impact of potential local artefacts when compared with LT pointwise deformation magnitude.
In Table 4, last column, we added the results found with the usual method M3C2 [10]. The results show an underestimation of the distance with 3.20 mm versus the LT value of 4.07 mm for L8 and 4.70 mm versus 4.96 for L13.

Testing for Deformation
Even if the deformation magnitude is obvious regarding the estimated noise STD for both L8 and L13, we aimed to validate the testing methodology presented in Appendix B. We, thus, make use of the bootstrapping approach to derive the p-values of the a priori test statistics T HD and T AHD under the stochastic modelΣ MAC with the estimated σ ρ . Following the simulations, we use K BS = 99 samples to test for the significance of deformation. The bootstrap sample generated under H 0 "no deformation" is defined as the average of the two surfaces E00 and E55 for the two points under consideration. No evidence for H 0 "no deformation" could be identified, as the p-values reached for the AHD were approximately 0 and were far below the critical value α test of 0.05. We can conclude that the deformation magnitude based on the AHD is statistically significant.

Discussion
In this case study, we found a number of points, around 60-70 optimal in each cell (L8 case (i)) and L13 case (ii)), to ensure an AHD close to the deformation magnitude obtained with the LT. This finding is far-reaching when comparing mathematical approximations of the TLS PC and LT values is intended: • an optimal grid setting for a good correspondence between the deformation magnitudes computed from two different sensors exists: a higher point density may lead to different point correspondences in the two epochs, particularly in the case of a small deformation. The optimal size of the cell depends on the point density inside one cell and could be assigned by means of calibration based on sensors comparison (LT and TLS).
• We further pointed out that the AHD is less influenced by a suboptimal fitting, that is, inappropriate parametrization, knot vector or number of CP and is more trustworthy for local deformation analysis than the HD. This finding confirms the results from the previous simulations: the AHD is more appropriate than a maximum value (the HD) for the sake of comparison with LT values. This is due to the averaging of the AHD when a local deformation analysis is performed. A statistical test of significance of deformation should be based on this distance.
We noticed that the point density reduction affects the distance computed with B-spline approximations positively, up to a given stage where not enough information is available for a correct fitting. This confirms our conjecture that optimal gridding of the raw PC exists for which the AHD corresponds to the reference value, that is, an implicit account for correlations: Using standard setting, we found an underestimation of the deformation with the M3C2 method. This finding is coherent with our results about the point density inside one cell. Consequently, we would strongly recommend performing local fitting when magnitudes have to be precisely estimated. Consistency regarding the point density and distance computation is mandatory for the sake of comparison between deformation magnitudes obtained with different sensors.

Conclusions
The potential of TLS-based deformation analysis is high due to the fast and simple data acquisition, the high point density and the possibility of scanning whole areas of interest. B-spline surfaces can approximate the PC mathematically for rigorous statistical testing of deformation and to filter the noise of TLS observations.
A numerical evaluation of the magnitude of deformation can be obtained by computing a distance between the PC or their approximated counterparts. In this contribution, we focused on the HD and the AHD. The latter was shown to be a powerful alternative to the HD in the case of correlated observations. Three questions were answered: • A mathematical approximation of the noisy TLS PC is beneficial for a trustworthy distance computation: B-spline surface approximation from scattered PC acts as filtering the correlated and heteroscedastic noise from TLS observations. The AHD computed was closer to the reference one for both simulated and real data analysis when a B-spline surface fitting was performed. Additionally, a pre-gridding of the raw PC for a real scenario affected the distance computation positively by further reducing the observations available. • Rigorous statistical test for deformation can only be performed based on parametric surfaces. That is one of the most significant advantages of mathematical approximation. Because the distribution of the test statistics for deformation based on the AHD is not tractable: we proposed and validated a novel bootstrap approach for the test decision. • Correlated noise affects the distance computation between PC for both raw and approximated observations. In the case of an approximation of the PC with regression B-spline surfaces, an optimal stochastic model in the LS adjustment is mandatory to reach the optimal value of the distance: both mathematical and temporal correlations should be accounted for.
In a real application, the impact of the noise on the distance can be decreased by optimal gridding of the raw observations, similar to the account of correlations. Consequently, a calibration using a highly accurate sensor could be performed in advance. The size of the cell depends on the point density within the surface under consideration. Further analysis will be performed to fix the optimal grid size by means of calibration. We will also validate the proposed correlation model by analyzing the residuals of the B-spline surface approximation.
Author Contributions: G.K.: conceptualization, methodology, investigation, analysis and writing, H.A. and B.K.: conceptualization, methodology (bootstrap simulations), review. All authors have read and agreed to the published version of the manuscript.
Funding: The publication of this article was funded by the Open Access fund of Leibniz Universität Hannover.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A The Matérn Model
In this contribution, we use the Matérn covariance function [47] to compute Σ TC . The main parameters of this model-which can be extended to account for anisotropy and nonstationarity [50]-are briefly presented here.
In its simplest and spatial form, the Matérn covariance function C matern is defined by C matern (r) = φ(αr) ν K ν (αr), where r > 0 is the Euclidean distance between two points in space and ν is the smoothness parameter related to the mean-squared differentiability of the field at the origin. K ν denotes the modified Bessel function of the second kind with order ν and α is a range parameter that controls how quickly C matern decreases as r increases. The function is usually normalized to 1 with the scaling parameter φ and can be easily scaled to any other variance, for example, using the intensity model of Reference [12] as proposed in this contribution. In this contribution, the r is replaced by the time ti to obtain a temporal covariance function. Figure A1 left displays C matern scaled to 1 (i.e., the correlation function) for different choices of the shape parameter ν. The parameter clearly specifies the rate of decay of the covariance function at the origin and is, thus, related to the high-frequency content in the spectral domain. When Toeplitz VCM are built with this covariance function, their inverse will become more fully populated as ν increases and, consecutively, the impact of accounting for correlations in the LS adjustment will be stronger [16]. The following cases are well-known for a 1D field and should be mentioned: • ν = 1 2 corresponds to the exponential covariance function, that is, a strong decay at the origin • ν = 1 to the Markov process of first order • ν = ∞ is the squared exponential covariance function, which corresponds to a physically less plausible infinitely differentiable random field at the origin. The case ν = 4 of Figure A1 left highlights the meaning of this assumption, that is, a low decaying C matern at the origin, leading potentially to some numerical problems when corresponding VCM have to be inverted. Figure A1 right shows the different correlation functions obtained by varying the range parameter α. As can be seen, α is linked with the speed at which the covariance function decays to 0. Please note that other parametrizations of the Matérn function are possible [38]. The parameters, including the variance, can be estimated with the maximum likelihood or cross-validation methods, eventually by fixing one parameter to reduce the computational burden [54].
Remote Sens. 2020, 12, x FOR PEER REVIEW 20 of 26 and  is the smoothness parameter related to the mean-squared differentiability of the field at the origin. K  denotes the modified Bessel function of the second kind with order  and  is a range parameter that controls how quickly decreases as r increases. The function is usually normalized to 1 with the scaling parameter  and can be easily scaled to any other variance, for example, using the intensity model of Reference [12] as proposed in this contribution. In this contribution, the r is replaced by the time ti to obtain a temporal covariance function. Figure A1 left displays scaled to 1 (i.e., the correlation function) for different choices of the shape parameter . The parameter clearly specifies the rate of decay of the covariance function at the origin and is, thus, related to the high-frequency content in the spectral domain. When Toeplitz VCM are built with this covariance function, their inverse will become more fully populated as  increases and, consecutively, the impact of accounting for correlations in the LS adjustment will be stronger [16]. The following cases are well-known for a 1D field and should be mentioned: corresponds to the exponential covariance function, that is, a strong decay at the origin • 1  = to the Markov process of first order •  =  is the squared exponential covariance function, which corresponds to a physically less plausible infinitely differentiable random field at the origin. The case 4  = of Figure 4 left highlights the meaning of this assumption, that is, a low decaying matern C at the origin, leading potentially to some numerical problems when corresponding VCM have to be inverted. Figure A1 right shows the different correlation functions obtained by varying the range parameter  . As can be seen, is linked with the speed at which the covariance function decays to 0. Please note that other parametrizations of the Matérn function are possible [39]. The parameters, including the variance, can be estimated with the maximum likelihood or cross-validation methods, eventually by fixing one parameter to reduce the computational burden [55].

Appendix 2: Bootstrap Statistical Test for Deformation
A parametric surface modelization allows the significance of the deformation magnitude to be statistically and rigorously assessed. In order to test the significance of the HD and AHD, we define the null and the alternative hypothesis of the test by: H 0 : E{D s } = 0 vs. H 1 : E{D s } 0 and H 0 : E D s_ave = 0 vs. H 1 : E D s_ave 0, that is, the null hypothesis states that no deformation happened. E{} is the expectation operator. As we aim to measure the deviations from H 0 as a distance measure, we follow [42] and choose the HD test statistic This a priori test statistic is similar to the congruency statistic used, for example, to test deformation in geodetic networks [4]. It is directly derived from Reference [46]. In this latter contribution, gridded surface points were used to compute the test statistic. The proposed test statistic (9) is more general, as the points of the two surfaces under consideration are the ones where the HD occurs. As has been mentioned previously, the HD distance defined by (7) is based on the closest distance between two points at different epochs, so that the point i HD may differ from j HD .
We define Σ ∆∆ as the VCM of the estimated surface differences. Using the error propagation law and neglecting cross-correlations, we have Σ ∆∆ = A T 1 Σ −1 i HD at the HD points on the two surfaces, whereas Σ noise,1 , Σ noise,2 are the VCM specified in Section 3.
T AHD is defined similarly as the mean of the weighted sum of the square of the surface difference vector for each set of corresponding (i.e., closest) points on the two surfaces.