On Estimating the Hurst Parameter from Least-Squares Residuals. Case Study: Correlated Terrestrial Laser Scanner Range Noise

Abstract: Many signals appear fractal and exhibit self-similarity over a large range of their power spectral densities. They can be described by so-called Hermite processes, among which the first-order one is called fractional Brownian motion (fBm) and has a wide range of applications. The fractional Gaussian noise (fGn) series is the series of successive differences between elements of a fBm series; it is stationary and completely characterized by two parameters: the variance and the Hurst coefficient (H). From physical considerations, the fGn could be used to model the noise of observations coming from sensors working with, e.g., phase differences: due to the high recording rate, temporal correlations are expected to have long-range dependency (LRD), decaying hyperbolically rather than exponentially. For the rigorous testing of deformations detected with terrestrial laser scanners (TLS), the correct determination of the correlation structure of the observations is mandatory. In this study, we show that the residuals from surface approximations with regression B-splines from simulated TLS data allow the estimation of the Hurst parameter of a known correlated input noise. We derive a simple procedure to filter the residuals in the presence of additional white noise or low frequencies. Our methodology can be applied to any kind of residuals, where the presence of additional noise and/or biases due to short samples or inaccurate functional modeling makes the estimation of the Hurst coefficient with usual methods, such as maximum likelihood estimators, imprecise. We demonstrate the feasibility of our proposal with real observations from a white plate scanned by a TLS.


Introduction
Terrestrial laser scanners (TLS) capture a large amount of 3D points rapidly, with high precision and spatial resolution [1]. These scanners are used for applications as diverse as modeling architectural and engineering structures, and high-resolution mapping of terrain, vegetation, and other landscape features. The recorded point clouds can be processed and analyzed with dedicated software. In engineering geodesy, this processing allows for the computation of deformation magnitudes. Unfortunately, the computed magnitudes are negatively affected when noisy and scattered point clouds (PC) are used. Additionally, no rigorous statistical test for deformation can be performed with the raw PC [2].
These drawbacks can be circumvented by approximating the PC with mathematical surfaces [3]. Besides norms such as L1 or L∞ [4], a widely used criterion is the sum of squares of the orthogonal distances from the data points to the parametric surface. Regression B-splines, for example, enjoy special attention for approximating point clouds from TLS: B-spline basis functions have a closed-form expression, are polynomial, and are thus particularly easy to compute (see [5] for one of the first articles related to that topic in geodesy). The setup of specific statistical tests with confidence intervals is based on the estimated parameters, or on the approximated surface points. The congruence test, for instance, can be used to test for deformation [2] and is known to be the most powerful test in Gauss-Markov models with normally distributed random deviations and a correctly specified stochastic model. The setting of a realistic variance covariance matrix (VCM) of the raw observations of the TLS is done prior to this test [6].
As for every sensor recording millions of points in a few minutes, the measurements of TLS are expected to be temporally correlated. Physically, the range or distance measurements are phase differences, so that a power-law spectral density of the correlated range noise is highly plausible ( [7], [8]). This correlation structure was empirically proven in a few recent real case analyses, see, e.g., [9] or [10]. The authors approximated simple scanned objects with a Gauss-Helmert model [11], assuming pre-defined geometric primitives such as circle, ellipsoid, and plane. The correlation parameters were estimated by fitting the residuals of the approximation with an autoregressive function of the first order (AR(1)). This methodology has, however, drawbacks: (i) the exponential covariance function restricts the description of the correlation structure to short-range dependency and may not be a physically adequate modeling; (ii) empirically estimated covariance functions can be problematic: two autocovariance functions sharing a common principal irregular term will not yield asymptotically the same best linear predictor (see [12], chapter 3, for some examples), using the popular Gaussian function can lead to overoptimistic predictions, and, additionally, antipersistent data are difficult to distinguish from uncorrelated data, as the correlations can be mistaken for noise fluctuations around zero; (iii) the methodology could be made more general: it is based on a calibrated object scanned in a controlled environment.
In this contribution, we propose to address these drawbacks and to derive a general methodology to assess the correlation structure of the TLS range measurements. We build on the physical expectation that the TLS range noise should have a long-range dependency (LRD) and a heavy-tailed distribution. Our proposal to extract the correlation structure is applicable to every kind of object, without being restricted to predefined objects or calibration scenarios. It is extendable to other kinds of observations, such as residuals from a geodetic coordinate time series [13]. We choose to model the noise of the TLS range with a stationary LRD noise: the fractional Gaussian noise (fGn), which is entirely defined by the Hurst parameter (abbreviated by H) and the variance. It has the main advantage that the autocorrelation function can be easily estimated without computational burden [14]. Fractal time series or signals such as the fGn have been found in many domains, including biology [15], medicine (EEG [16,17]), finance (stock market analysis [18]), geology, and traffic analysis [19]. Various statistical techniques have been proposed to estimate H, and each has shortcomings and advantages: they may perform better in the presence of noise, for short samples, or for H close to a given value, may have slow convergence, etc. (see, e.g., [20][21][22][23][24]). There exist three families of estimation methods: the time domain (e.g., the Rescaled Range R/S estimator [25], the detrended fluctuation analysis method [26]), the frequency domain (the periodogram [27], the Whittle estimator [28]), and the wavelet space [29], which was shown to provide an unbiased, efficient, and robust estimator.
We will use the residuals of the B-spline approximation of the TLS point clouds to assess the correlation structure of the TLS range. We conjecture that, although (i) additional white noise and (ii) possible model misspecification could introduce additional frequencies in the residuals, the residuals still contain enough information to estimate H, provided that an adequate filtering is performed. Besides the simulated observations from a TLS, we will evaluate our methodology and compare the performance of three different estimators for H using real observations from a white plate.
In a first step, we disregard the correlations of the polar angles; a methodology similar to the one presented in this study could be used for that purpose.
The remainder of this paper is structured as follows: the first section provides a brief summary of the mathematical concepts of least-squares and stochastic modeling. The second section introduces the concepts of fGn, Hurst exponent, and filtering. The third section describes the results of simulations for two specific cases: a plane and a Gaussian curve. We conclude with a real case study and some recommendations.

Functional Model
Free-form curves and surface fittings are flexible tools to approximate PC without being restricted by the use of geometric primitives, such as circles, planes, or cylinders. Possible applications of surface approximation include the testing of deformation [30] or the reduction of a huge amount of points to a simpler form.
In this contribution, we make use of B-spline surfaces. Their properties and advantages over other functions, such as control and flexibility, are exemplarily described in [31]. Readers interested in more details on how spline fitting works should refer, e.g., to [32,33], and more specifically for geodetic applications to [5] or [31]. Such surfaces satisfy the strong convex hull property and have a fine and local shape control, so that they were shown to be adequate for approximating noisy and scattered PC (see, e.g., [34]). For the sake of brevity, we briefly introduce the main concept, focusing on the least-squares (LS) approach to determine the model parameters, called control points (CP).
We start with n_obs polar observations from a TLS, expressed in vector form l_POLAR of size (3 n_obs). The observations consist of two angles, HA and VA, and one range ρ, to which a VCM Σ_ll_POLAR is associated. This matrix describes the variances and possible correlations between the observations [11] and is the focus of our contribution.

First Step: From Polar to Cartesian
The first step of the approximation is the transformation of the PC coordinate vector from polar, l_POLAR, into Cartesian, l_CART.
The VCM has to be transformed by the error propagation law, so that the VCM Σ_ll_CART reads

Σ_ll_CART = F Σ_ll_POLAR F^T,    (1)

where the matrix F contains the derivatives of the point coordinates with respect to the range and angles, i.e., the Jacobian of the polar-to-Cartesian transformation, evaluated for each point i.
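As a minimal numerical sketch of Equation (1) for a single point, assuming the common convention x = ρ cos(VA)cos(HA), y = ρ cos(VA)sin(HA), z = ρ sin(VA) (the angle convention of a real scanner may differ), the per-point propagation can be written as:

```python
import numpy as np

def polar_to_cartesian_vcm(rho, ha, va, sigma_polar):
    """Propagate a 3x3 polar VCM (order: HA, VA, rho) to Cartesian (x, y, z).

    Assumes x = rho*cos(va)*cos(ha), y = rho*cos(va)*sin(ha), z = rho*sin(va);
    the convention of an actual TLS may differ.
    """
    # Jacobian F_i: rows = (x, y, z), columns = (HA, VA, rho)
    F = np.array([
        [-rho * np.cos(va) * np.sin(ha), -rho * np.sin(va) * np.cos(ha), np.cos(va) * np.cos(ha)],
        [ rho * np.cos(va) * np.cos(ha), -rho * np.sin(va) * np.sin(ha), np.cos(va) * np.sin(ha)],
        [ 0.0,                            rho * np.cos(va),              np.sin(va)],
    ])
    return F @ sigma_polar @ F.T  # error propagation law, Equation (1)

# Illustrative values: range 7 m, angle noise 1e-4, range noise 5e-3
sigma_polar = np.diag([1e-4, 1e-4, 5e-3]) ** 2
S_cart = polar_to_cartesian_vcm(7.0, 0.3, 0.0, sigma_polar)
```

The resulting Cartesian VCM is fully populated even for a diagonal polar VCM, which is why correlations between coordinates cannot be neglected after the transformation.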

Second Step: The Approximation
The Cartesian PC can be approximated mathematically by means of a linear combination of basis functions, such as B-splines. In its parametric formulation, the B-spline surface s(t, f) is a tensor product surface and can be expressed as

s(t, f) = Σ_{i=0}^{n} Σ_{j=0}^{m} N_{i,p}(t) N_{j,q}(f) P_{i,j},

where (t, f) ∈ [0, 1] × [0, 1] are the parameters in the two directions, so that a B-spline surface maps the unit square to a rectangular surface patch, and the P_{i,j} are the control points. The basis functions N_{i,p} and N_{j,q} are composite curves of degree p and q polynomials, respectively, with joining points at knots in the intervals [u_i, u_{i+p+1}] and [v_j, v_{j+q+1}]. They can be evaluated by means of a recurrence relationship [32]. To summarize, the surface is defined by:
• a set of n + 1 CP in the direction t and m + 1 CP in the direction f,
• a knot vector of h + 1 knots in the t-direction, U = [u_0, . . . , u_h],
• a knot vector of k + 1 knots in the f-direction, V = [v_0, . . . , v_k],
• the degree p of the basis functions in the t-direction, and the degree q in the f-direction.
In this contribution, we take a degree p = q = 3 for the B-spline functions (cubic B-splines). The determination of an optimal knot vector is solved using the knot placement technique described in [33]. The Cartesian point cloud is parametrized in advance with the equidistant parametrization, which is justified by the simple structure of the objects under consideration in this contribution.
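The recurrence relationship [32] (Cox-de Boor recursion) for evaluating a single basis function can be sketched as follows; the clamped cubic knot vector is illustrative:

```python
def bspline_basis(i, p, u, knots):
    """Evaluate the B-spline basis function N_{i,p}(u) by the Cox-de Boor recursion."""
    if p == 0:
        # Degree-0 basis: indicator of the half-open knot span
        return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
    left = right = 0.0
    # The convention 0/0 := 0 is handled by skipping zero-length spans
    if knots[i + p] != knots[i]:
        left = (u - knots[i]) / (knots[i + p] - knots[i]) * bspline_basis(i, p - 1, u, knots)
    if knots[i + p + 1] != knots[i + 1]:
        right = (knots[i + p + 1] - u) / (knots[i + p + 1] - knots[i + 1]) * bspline_basis(i + 1, p - 1, u, knots)
    return left + right

# Clamped cubic knot vector on [0, 1]: the four basis functions reduce to
# the cubic Bernstein polynomials
U = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
vals = [bspline_basis(i, 3, 0.5, U) for i in range(4)]
```

At u = 0.5, the four values are 0.125, 0.375, 0.375, 0.125 and sum to one, illustrating the partition-of-unity property underlying the strong convex hull property mentioned above.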

Third Step: LS Solution
Approximating a PC with a B-spline surface means finding the coordinates of the CP so that the distance of the data points to the approximated surface is minimized. This step can be performed by solving the LS problem, in which the minimum in the LS sense of the zero-mean error term v is searched:

min_p v^T Σ_ll_CART^(−1) v,  with l_CART = A p + v,

where p is the vector of the CP coordinates to be estimated, of size (3(n + 1)(m + 1)), and A, of size (3 n_obs, 3(n + 1)(m + 1)), is called the design or mass matrix. It contains the evaluation of the B-spline functions at the parameters. Interested readers should refer to [35] for the description of the design matrix.
The estimated coordinates of the control points are expressed by the unbiased generalized LS estimator (GLSE [11]):

p̂_GLSE = (A^T Σ_ll_CART^(−1) A)^(−1) A^T Σ_ll_CART^(−1) l_CART.

If the VCM Σ_ll_CART is the identity matrix (equal variance for all coordinates), the ordinary LS estimator (OLSE) is obtained:

p̂_OLSE = (A^T A)^(−1) A^T l_CART.

We further note that the LS estimator is unbiased, E[p̂_GLSE] = E[p̂_OLSE] = p, so that p̂ can be computed either with the OLSE or the GLSE solution; thanks to the unbiasedness of the LS estimator, the expectation E of the estimated coordinates of the CP is not affected by the choice of Σ_ll_CART. However, the OLSE is no longer the most efficient estimator within the class of linear unbiased estimators when Σ_ll_CART deviates from the true (and unknown) VCM. Consequently, hypothesis tests such as the global test, outlier tests, or congruence tests become invalid [36]. This is one of the main reasons why assessing the correlation structure of the raw measurements is an active research topic.
The number of control points has an impact on the LS solution and on the fitted surface. It can be either fixed a priori or iteratively adjusted in the context of model selection [37]. As the impact of model misspecifications is interesting for our purpose, we will make use of the first strategy.
The LS residuals v̂ of the approximation read

v̂ = l_CART − A p̂ = (I − A (A^T Σ_ll_CART^(−1) A)^(−1) A^T Σ_ll_CART^(−1)) l_CART,

with I being the identity matrix. We further define the a posteriori variance factor as

σ̂_0² = v̂^T Σ_ll_CART^(−1) v̂ / (3 n_obs − 3(n + 1)(m + 1)).

This factor can be used to judge the goodness of fit of the LS adjustment by means of a global test [36]. The a priori VCM of the estimates is given by

Σ_p̂p̂ = (A^T Σ_ll_CART^(−1) A)^(−1).

The Polar Residuals
In this contribution, we propose to extract the correlation structure of the TLS range from the LS residuals of the B-spline approximation. We address the drawback raised in the introduction by being independent of calibrated objects, i.e., our methodology should be applicable in every environment.
As mentioned in Section 2.2.1, the LS adjustment gives access to the Cartesian residuals. To assess the noise of the raw TLS observations (range), we transform the Cartesian residuals into polar ones, v̂_POLAR = [v̂_HA, v̂_VA, v̂_r]. The latter have a VCM Σ_v̂v̂_POLAR obtained similarly to Equation (1), using the error propagation law "backwards". This matrix depends on the original matrix Σ_ll_POLAR.
Our assumption is that we can derive the correlation structure of the original observations from v̂_POLAR.
We will further focus on v̂_r and conjecture that this final vector still contains enough information to give us access to the approximate correlation structure of Σ_ll_r, defined as the VCM of the range observations.

Variance
Raw observations from a TLS are the three polar coordinates of the recorded points. They consist of a range and two angles in the vertical and horizontal direction. These observations are known to have different noise properties ( [38][39][40]): the noise of the angles is widely assumed to be Gaussian, with a variance taken from manufacturer datasheets. The noise of the range measurements has a slightly different structure. Its variance can be considered constant for a given configuration; the manufacturer datasheets provide different values depending on, e.g., the approximate distance to the scanned object and/or the properties of the surface (roughness, color). Alternatively, the variance can be modeled as following a point-wise power-law intensity model [41,42].
In this contribution, we simulate different point clouds with noise variances close to what is expected in a real case experiment (the numerical values are given in Section 4), where σ_HA = σ_VA are the standard deviations for the HA and VA, and σ_ρ for the range, respectively. We intentionally chose a case where range and angles have different variances to simulate a more general scenario.

Correlation Structure for Range Measurements
In a first approximation, the range measurements are considered to be uncorrelated, i.e., one observation recorded at time t is not dependent on the observation recorded at t + τ, τ being the interval between two measurements. τ is also called the time lag; it is related to the scanning rate of the observations and depends on the setting. Exemplarily, the resolution for a TLS Z+F 5016 can be varied from preview to ultrahigh up to extremely high, and from low to high quality: these choices impact the scanning rate, and thus the scanning time of an object. The assumption of uncorrelatedness is overoptimistic: range measurements are based on phase differences, which are inherently influenced by, e.g., the propagation of the signal through a random medium, but also by the point spacing on the surface.
In this contribution, the correlation structure of the range will be modeled as a fractional Gaussian noise (fGn). This assumption is justified by the physically based expectation of the author that the range noise is stationary and that its power spectrum follows a power law [7]. The validation of this model with real data is briefly shown in Section 4.5. More extensive work using TLS observations will be performed in a next step based on the proposed methodology.
fGn has the beneficial property of being characterized by its variance and a single parameter called the Hurst exponent H. We shortly introduce the concept of fGn here; interested readers can refer to [14] for more information.

What is a fGn?
To define a fGn, understanding the LRD concept is mandatory. This property of a process is linked with the slow decay of the autocorrelation ρ to zero, so slow that it is a non-summable function, i.e., the average value of its partial sums does not converge, see [43]. More precisely,

ρ(τ) ≈ c τ^(−δ), τ → ∞,    (6)

with τ the time lag, c a positive constant, and 0 < δ < 1. As τ increases, the dependence between the observations stays strong, which implies a fat-tailed autocorrelation function. Exemplarily, for a stationary process with δ = 0.3, the autocorrelation at lag 100 will stay at 0.15, whereas for a Markovian process the autocorrelation would be practically zero for lags ten times shorter. This important property is the reason why such processes are said to have a "long-term memory", and it is one of the major reasons why we wish to model the correlation structure of TLS range observations with such a process. Intuitively, the high rate of measurements induces a long dependency between the observations: the autocorrelation may decay quickly at the origin (e.g., between the first and the second observation) but stays much higher than 0 for long lags; the autocorrelation will be similar between the first observation and the 100th or the 200th. For a stationary process, the LRD can be related to a parameter called the Hurst exponent H, defined as a measure of self-similarity. A stochastic process X_H(t) is self-similar if X_H(t) has the same distribution as λ^(−H) X_H(λt), where λ is a scale parameter. Concretely, the process will appear statistically identical under rescaling of the time axis by a given factor, with X_H(t) ∝ t^H; it lacks any characteristic time scale. This characteristic allows interpretation of H as a measure of the strength of dependence between the time points, or, more loosely, of how much space the signal "fills in".
A self-similar process with stationary increments X_H(t + 1) − X_H(t) has an autocorrelation C_H(τ) given by

C_H(τ) = 1/2 (|τ + 1|^(2H) − 2|τ|^(2H) + |τ − 1|^(2H)),    (7)

so that for τ → ∞, C_H(τ) → H(2H − 1) τ^(2H−2), meaning that the process has a long-range dependency, see Equation (6) and [44]. From these definitions, one can define the Hermite process of first order, called the fractional Brownian motion (fBm, [14]), as a generalization of a Brownian motion, for which H = 1/2. It is a non-stationary process with stationary increments and possesses the long-term memory, also called persistency or positive correlations, when H > 1/2. When H < 1/2, the process has short-term memory, or similarly anti-persistency or negative correlations; the autocorrelation decays fast enough that its sums converge to a finite value. Both processes are described by a fractal dimension D, which is related to the Hurst exponent by D = 2 − H for a fBm [44].
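As a small numerical check (a sketch for illustration, not part of the original study), the autocorrelation of Equation (7) can be compared with its asymptotic hyperbolic form:

```python
import numpy as np

def fgn_autocorr(tau, H):
    """Autocorrelation of a unit-variance fGn, Equation (7)."""
    tau = np.abs(np.asarray(tau, dtype=float))
    return 0.5 * (np.abs(tau + 1) ** (2 * H) - 2 * tau ** (2 * H) + np.abs(tau - 1) ** (2 * H))

H = 0.9
taus = np.array([100.0, 500.0, 1000.0])
exact = fgn_autocorr(taus, H)
asympt = H * (2 * H - 1) * taus ** (2 * H - 2)  # hyperbolic decay for large lags
```

For H = 0.9 the two expressions already agree to better than 0.1% at lag 100, and the autocorrelation stays clearly positive at lag 1000: the "long-term memory" discussed above.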
Successive increments ς_H of a fBm are called fGn:

ς_H(t) = X_H(t + 1) − X_H(t).

A fGn is, thus, a zero-mean stationary process, defined as the stationary increment of a fBm. The fGn is fully characterized by the Hurst exponent and the variance σ²_ς_H. The corresponding distribution is completely specified by its autocovariance function, given by Equation (7).
H can be related to the power-law spectrum P(f) ∝ 1/f^β, with f and β being the frequency and the power-law exponent of the process, respectively. Exemplarily, β = 0 corresponds to a white noise, β = 1 to a pink noise, and β = 2 to Brownian noise. For a fBm, H is related to β by H = (β − 1)/2, with 1 < β < 3, and for a fGn, H = (β + 1)/2, with −1 < β < 1 [45]. Using real observations, it is important to check whether the noise is a fGn or a fBm: using the Matlab built-in function to estimate the Hurst exponent can lead to a misinterpretation of the results when this is not accounted for.
The difference between a fBm and a fGn can be visualized in Figure 1, where fBm (Figure 1, top right) and fGn (Figure 1, top left) are simulated versus time for different H. They are shown with their corresponding power spectral densities (PSD), which decay linearly in a log-log plot (Figure 1, bottom right). We visualize the aforementioned "fills in" property of the process: a small H "fills in" significantly more space than H = 0.9, corresponding to a higher fractal dimension D.

Generation of fGn
In this study, we focus on fractal stationary noise, i.e., fGn. From the previous section, and per the definition of the fGn, it can be generated by differencing a fBm. Matlab (2018) provides a function called wfbm, which returns a fBm for a given Hurst parameter. This function uses the wavelet method from [46], which may bias the estimation of the Hurst parameter towards the wavelet method (described in the following section).
Alternatively, we propose to use the function ffGn, freely available in the Matlab file exchange section. The ffGn function has the main advantage of being based on the circulant embedding method for persistent noise, resulting in an exact reproduction of the autocovariance [47]. To test the function, we assessed the standard deviation with which the Hurst parameter can be reproduced. We generated 10,000 realizations of short time series of lengths (1) 400 and (2) 1000. Focusing in this contribution on persistent fGn, the Hurst parameter was varied in the range [0.5, 1] in steps of 0.05. H was estimated using the three methods presented in the following sections. For all three methods, the standard deviation was found to be between 0.01 for case (1) and 0.005 for case (2), highlighting the good performance and stability of the chosen function for noise generation.
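The circulant embedding idea can be sketched as follows (a minimal illustration of the principle following the standard Davies-Harte construction, not the ffGn implementation itself; for persistent fGn the circulant eigenvalues are nonnegative, so the square root below is well defined):

```python
import numpy as np

def fgn_autocov(k, H):
    """Autocovariance of a unit-variance fGn, cf. Equation (7)."""
    k = np.abs(np.asarray(k, dtype=float))
    return 0.5 * (np.abs(k + 1) ** (2 * H) - 2 * k ** (2 * H) + np.abs(k - 1) ** (2 * H))

def generate_fgn(n, H, rng):
    """Draw n samples of a unit-variance fGn by circulant embedding."""
    r = fgn_autocov(np.arange(n), H)
    row = np.concatenate([r, r[-2:0:-1]])  # circulant embedding of size 2n - 2
    lam = np.fft.fft(row).real             # eigenvalues of the circulant matrix
    lam = np.clip(lam, 0.0, None)          # guard against tiny negative rounding
    m = row.size
    z = np.sqrt(lam / m) * (rng.standard_normal(m) + 1j * rng.standard_normal(m))
    return np.fft.fft(z).real[:n]          # real part has the target autocovariance

rng = np.random.default_rng(5)
x = generate_fgn(2000, H=0.7, rng=rng)
```

Because the target autocovariance enters the construction exactly through the FFT of the embedded Toeplitz row, the generated series reproduces the fGn correlation structure without approximation, which is the property highlighted above.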

How to Estimate the Hurst Parameter
Many methods have been proposed to estimate the Hurst exponent. They can be classified into three families: estimation in the time domain, in the frequency domain, and in the wavelet domain. Intuitively, whereas the first family investigates the power-law relationship between a specific statistical property of the time series and a time aggregation of it, the two latter examine whether the spectrum or energy of the time series follows a power-law behavior. Within these three families, different estimators have been proposed and constantly improved (see [48] and the references therein). We do not aim to review them all; this is exemplarily done in [49].
In this contribution, we focus on three estimators, one from each class: the generalized Hurst estimator, the Whittle estimator, and the wavelet estimator. The Whittle estimator was chosen for its capability to perform well when the number of observations is reduced; this case can occur when the TLS scanned lines are short, due to, e.g., the measurement configuration and the scanned object. The selection of the generalized Hurst estimator is justified by the fact that the noise of the angles or functional misspecifications may affect the power spectral density of the B-spline residuals at low frequency; this estimator focuses on the middle-high frequency part of the PSD and can be adapted with additional parameters. The wavelet estimator is said to be the least biased Hurst exponent estimator, provided that a large number of observations is available (asymptotic behavior).
These challenges of estimating the correlation structure accurately are similar to the ones of the geodetic coordinate time series analysis (see [13] for further references on that topic). The chosen estimators have to account for these specificities. In the following, we will shortly review the three methods under consideration. A good understanding of their properties is mandatory to derive a meaningful methodology to extract an unknown H from B-spline residuals.

Generalized Hurst Exponent (GHE)
The generalized Hurst exponent was introduced in [50] and used for financial market analysis in [51] and [52]. It is a generalization of the approach proposed in [25].
The generalized Hurst exponent measures the LRD in the time domain. It is evaluated by using the qth-order moments of the distribution of the increments,

K_q(τ) = ⟨|X(t + τ) − X(t)|^q⟩,

where ⟨·⟩ stands for the average operator over time t. τ is varied between 1 and τ_max (usually taken as 20). Acting on τ_max allows accounting for the specificity of the observations, for example, to force the estimator to focus on high frequencies, as described in [24]. H_q is related to K_q(τ) through the scaling behavior K_q(τ) ∝ τ^(q H_q), which allows the computation of H_q as an average over a set of values corresponding to different τ. If H_q is not constant when varying q, the process is called multifractal, whereas H_q = H characterizes a monofractal process [53]. In this contribution, and because we are not interested in the behavior of financial time series to predict the evolution of specific markets, we only estimate H_1, which describes the scaling behavior of the absolute values of the increments. H_1 reaches the value 1/2 for a Gaussian noise. H_2 would correspond to an estimation in the frequency domain.
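A sketch of the H_1 estimation follows; the single log-log fit and the τ_max value are illustrative simplifications of the averaging procedure described above:

```python
import numpy as np

def generalized_hurst(x, q=1, tau_max=20):
    """Estimate H_q from the scaling K_q(tau) ~ tau^(q*H_q).

    x is the integrated process (a fBm-like series); for a fGn, pass its
    cumulative sum.
    """
    taus = np.arange(1, tau_max + 1)
    # qth-order moment of the increments for each lag tau
    K = np.array([np.mean(np.abs(x[tau:] - x[:-tau]) ** q) for tau in taus])
    # Slope of log K_q(tau) versus log tau gives q * H_q
    slope = np.polyfit(np.log(taus), np.log(K), 1)[0]
    return slope / q

rng = np.random.default_rng(1)
bm = np.cumsum(rng.standard_normal(10000))  # ordinary random walk
h1 = generalized_hurst(bm, q=1)             # expect a value near 1/2
```

For the ordinary random walk (Gaussian increments), the estimate falls close to the value 1/2 stated above.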

Whittle Likelihood Estimator (WhiE) Method
The Maximum Likelihood Estimator (MLE) is not a graphical method but a purely numerical one. Thus, more than just asymptotic self-similarity is assumed [53]; the MLE requires at least an assumption about the form of the LRD (such as a noise coming from a fBm or an autoregressive integrated moving average, ARIMA). If this assumption holds, it is often considered to be the best obtainable estimator: the estimates are asymptotically unbiased, and the estimator is asymptotically efficient and fast to compute. Unfortunately, the MLE performs poorly if the assumption is incorrect or for short samples [54]. Exact maximum likelihood inference can be performed for Gaussian data [55] by evaluating the log-likelihood

l(H) = −(1/2) log|C_H| − (1/2) X_H^T C_H^(−1) X_H,    (10)

where X_H denotes the column vector of length n of observations and C_H is a fully populated VCM, whose components are given using Equation (7). |C_H| is here the determinant of the matrix. By maximizing the likelihood function, one obtains an optimal choice for H: Ĥ = argmax(l(H)), with 0 < H < 1. To approximate Equation (10), matrix inversions are necessary. They can be avoided using the Whittle estimator [28], which aims to provide faster estimation with only a slight inaccuracy. In that case, the Whittle likelihood in its discretized form is given by

Σ_{ω∈Ω} [ log f(ω, H) + I(ω)/f(ω, H) ],

which is minimized with respect to H, with Ω the set of discrete Fourier frequencies, f(ω, H) the continuous-time process spectral density, and I(ω) the periodogram, I(ω) ∝ |Σ_{j=1}^{N} X_{H,j} e^(−ijω)|². The same notation as in [54] was adopted.
The Whittle estimator assumes a priori that the power spectrum of the underlying process of the dataset is known. Moreover, to be applicable to a fGn, the mean of the time series has to be subtracted beforehand [56]. As aforementioned, the Whittle estimator should only be used if a time series has already been shown by other methods to be consistent with a specific process, e.g., a fGn. Thus, it is not an adequate method to detect LRD.
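A minimal sketch of a Whittle-type estimation follows, using the asymptotic spectral density f(ω, H) ∝ ω^(1−2H) instead of the exact fGn spectrum (a simplification often called the local Whittle estimator, not the exact Whittle estimator of the study); the grid search stands in for a proper optimizer:

```python
import numpy as np

def local_whittle(x, m=None):
    """Local Whittle estimate of H from low-frequency periodogram ordinates."""
    n = len(x)
    x = np.asarray(x, dtype=float) - np.mean(x)  # fGn assumption: remove the mean
    if m is None:
        m = n // 4                               # number of low frequencies used
    I = np.abs(np.fft.fft(x)[1:m + 1]) ** 2 / n  # periodogram (constant factors cancel)
    w = 2 * np.pi * np.arange(1, m + 1) / n      # discrete Fourier frequencies

    def contrast(H):
        # Profile out the scale of f ~ g * w^(1 - 2H)
        g = np.mean(I * w ** (2 * H - 1))
        return np.log(g) - (2 * H - 1) * np.mean(np.log(w))

    grid = np.linspace(0.01, 0.99, 197)
    return grid[np.argmin([contrast(H) for H in grid])]

rng = np.random.default_rng(2)
h = local_whittle(rng.standard_normal(4096))  # white noise: expect H near 1/2
```

Note that the white-noise input here is only a sanity check; applied to a series that is not consistent with the assumed spectral shape, the estimate is meaningless, which is the caveat stated above.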

Wavelet Estimator (WE)
Since H describes the level of statistical self-similarity of a time series or spatial process, the exponent can be found by averaging the squared values of the wavelet coefficients at each scale j,

E_j = (1/n_j) Σ_k d_X(j, k)²,

where the d_X(j, k) are the detail coefficients of the discrete wavelet transform of the signal X,

X(t) = approx_J(t) + Σ_{j=1}^{J} Σ_k d_X(j, k) ψ_{j,k}(t),

with n_j the number of coefficients at scale j, J the number of decomposition levels, ψ_{j,k} the dilated and translated mother wavelet, and approx_J the approximation component, not of interest for our purpose. E_j at scale j can be shown to obey the scaling law E_j ∝ 2^(jα). The Hurst exponent is obtained by fitting a line to the linear part of log_2 E_j versus j in order to obtain the slope α. Differently from the power spectrum method, E_j contains here the information about the power carried at each time scale j. The estimator was found to be robust even if the LRD is not equivocal [57] but performs poorly for short samples. Similarly to the power law β, α is linked to H differently for a fBm and a fGn, with H = (α − 1)/2 and H = (α + 1)/2, respectively. Wavelet-based estimators are implemented in Matlab under wfbmesti. The values are based on the estimation of the Hurst exponent for a fBm and have, thus, to be applied to the cumulative sum of a fGn.
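The scale-energy idea can be sketched with the Haar wavelet, the simplest choice (the filters used by wfbmesti differ; the level range and fit are illustrative):

```python
import numpy as np

def wavelet_hurst_fgn(x, max_level=7):
    """Estimate H of a (presumed) fGn from Haar detail-coefficient energies.

    Uses the scaling E_j ~ 2^(j*alpha) with H = (alpha + 1)/2 for a fGn.
    """
    x = np.asarray(x, dtype=float)
    levels, energies = [], []
    a = x.copy()
    for j in range(1, max_level + 1):
        if a.size < 4:
            break
        a = a[: a.size - a.size % 2]
        d = (a[0::2] - a[1::2]) / np.sqrt(2.0)  # Haar detail coefficients at level j
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)  # Haar approximation, carried to j + 1
        levels.append(j)
        energies.append(np.mean(d ** 2))        # E_j: mean energy per scale
    alpha = np.polyfit(levels, np.log2(energies), 1)[0]
    return (alpha + 1) / 2

rng = np.random.default_rng(3)
h = wavelet_hurst_fgn(rng.standard_normal(2 ** 14))  # white noise: expect H near 1/2
```

For white noise the energies are flat across scales (α ≈ 0), recovering H ≈ 1/2; for a persistent fGn the energies grow with the scale index, in line with the fGn relation H = (α + 1)/2 above.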

Additional Remarks
Periodicity and noise in the time series strongly bias the identification of LRD; the estimators become misleading and can detect LRD erroneously or, on the contrary, find a Gaussian noise with H = 0.5 [20]. Frequency- or wavelet-based estimators depend strongly on short-memory components and necessitate strategies to alleviate these effects. The estimators have to be enabled to focus on the long-range correlations in case of additional Gaussian noise of unknown variance. One possible way to face this challenging situation is proposed in this contribution: the filtering of the noise with a low-pass Butterworth filter. Detailed simulations in Section 4 will explain the reasons for this choice.

Butterworth Filter
Butterworth filters can be designed as bandpass, lowpass, or highpass filters. They are called maximally flat filters, as for a given order they have the sharpest roll-off possible without inducing peaking in the Bode plot. The Bode plot is a log-log graph where the gain in decibels is plotted against the logarithm of the angular frequency. An example is shown in Figure 2 for different orders of the Butterworth filter. We note that the Butterworth filter changes from passband to stopband while achieving passband flatness. This is done at the expense of wide transition bands. This property, sometimes considered the main disadvantage of the Butterworth filter, turned out to be the main reason for using such a filter in our application. Great flexibility is given in locating the cutoff frequency, i.e., the element values of the Butterworth filter are more practical and less critical than those of many other filter types. Interested readers should refer to [58] or [59].
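A low-pass filtering step of this kind can be sketched with scipy (the order, cutoff, and toy signal below are illustrative, not the values used in the study):

```python
import numpy as np
from scipy.signal import butter, filtfilt

# Toy series: a slow oscillation plus a fast disturbance
t = np.arange(4096)
slow = np.sin(2 * np.pi * 0.005 * t)
fast = 0.5 * np.sin(2 * np.pi * 0.3 * t)
x = slow + fast

# 4th-order low-pass Butterworth; Wn is normalized to the Nyquist frequency,
# so Wn = 0.1 corresponds to a cutoff of 0.05 cycles per sample
b, a = butter(4, 0.1, btype="low")
y = filtfilt(b, a, x)  # zero-phase filtering (forward and backward passes)
```

The zero-phase variant (filtfilt) avoids introducing a phase shift into the residual series, which matters when the filtered residuals are subsequently fed to a Hurst estimator.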

Simulations and Results
In this section, we will combine all the mathematical developments presented in the previous sections: surface fitting and Hurst exponent estimation. We recall that our aim is to estimate the Hurst exponent of the range measurements from the residuals of a B-spline approximation. In order to work in a controlled framework, we use in a first step simulated TLS observations. A short real data analysis highlights the potential of the proposed methodology, which will be pursued in further dedicated contributions.

Simulation of TLS Observations
The first step towards analyzing the correlation structure of the range residuals as described in Section 2 is to simulate TLS observations. In this contribution, we choose two different surfaces with increasing complexity: a plane and a Gaussian surface.
The plane has the equation z = −3x + 15y + 7. The distance between the origin of the coordinates and the centre of the plane is 7 m. The coefficients of the plane were chosen without any search for optimal scanning conditions, in order to test our methodology in complex cases. The representation of the plane is shown in Figure 3 (bottom left). The TLS is placed at the origin of the axes, see Figure 3 (right). The Gaussian surface has the equation

z = 1/(2π · 5²) e^(−(x² + y²)/(2 · 5²))

and is shown in Figure 3 (top left). For each surface, the PC were generated by varying x ∈ [−1, 1], y ∈ [−1, 1]. Two samplings were chosen: case (i) 400 observations and case (ii) 1000 observations per scanning line, resulting in PC of size 400 × 400 and 1000 × 1000, respectively. These cases are chosen to study the impact of the density of the PC on the estimation of the Hurst parameter.
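The sampled grids and a backward transformation to polar coordinates can be sketched as follows (the angle conventions are illustrative and may differ from the simulation software used in the study):

```python
import numpy as np

# Sample the two simulated surfaces on a regular grid (case (i): 400 x 400)
n_line = 400
x = np.linspace(-1.0, 1.0, n_line)
y = np.linspace(-1.0, 1.0, n_line)
X, Y = np.meshgrid(x, y)

z_plane = -3 * X + 15 * Y + 7
s = 5.0  # standard deviation of the Gaussian surface
z_gauss = 1 / (2 * np.pi * s**2) * np.exp(-0.5 * (X**2 + Y**2) / s**2)

# Polar coordinates of each plane point as seen from a scanner at the origin
rho = np.sqrt(X**2 + Y**2 + z_plane**2)
ha = np.arctan2(Y, X)          # horizontal angle
va = np.arcsin(z_plane / rho)  # vertical angle
```

The Gaussian surface is nearly flat over the sampled square (its peak is 1/(50π) ≈ 0.0064), so the two test objects mainly differ in curvature rather than in extent.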

Noise Simulation
The simulated Cartesian coordinates were transformed back into polar coordinates [VA, HA, r] and noised component-wise:
• to the vertical and horizontal angles, a Gaussian noise with a standard deviation of 0.0001°, generated with the Matlab function randn, is added;
• to the range r, an fGn noise with a standard deviation of 0.005 m is added. We generated noise vectors with three different Hurst exponents: 0.6 (nearly Gaussian white noise), 0.7, and 0.9 (strong LRD).
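The paper generates the fGn in Matlab; a self-contained Python sketch using an exact Cholesky factorization of the fGn covariance could look as follows (function names are ours; for very long series, a circulant-embedding method would be faster than the O(n³) factorization used here):

```python
import numpy as np

def fgn_autocov(n, H, sigma=1.0):
    """fGn autocovariance at lags 0..n-1:
    gamma(k) = sigma^2/2 * (|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H})."""
    k = np.arange(n, dtype=float)
    return 0.5 * sigma**2 * (np.abs(k + 1)**(2 * H)
                             - 2 * np.abs(k)**(2 * H)
                             + np.abs(k - 1)**(2 * H))

def simulate_fgn(n, H, sigma=1.0, seed=None):
    """Draw one fGn vector via an exact Cholesky factorization of the
    Toeplitz covariance matrix; adequate for line-wise noise vectors."""
    rng = np.random.default_rng(seed)
    g = fgn_autocov(n, H, sigma)
    i, j = np.indices((n, n))
    L = np.linalg.cholesky(g[np.abs(i - j)])   # exact, O(n^3)
    return L @ rng.standard_normal(n)
```

For H = 0.5 the autocovariance vanishes at all non-zero lags and the generator reduces to white noise, which is the boundary case mentioned above.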

Line Wise Noise
We did not generate one noise vector for the whole set of observations. Instead, we added an independent noise vector to each scanning line; see Figure 4 for an illustration of the chosen strategy. We generated as many noise vectors as scanning lines, whose size depends on the chosen sampling (case (i) or (ii)). Figure 4. Explanation of the concept of line-wise noise. One noise vector is added to each line (constant x) independently. In this example, one noise vector is added from t_i = 1 to t_i = 10; a new one is added starting from t_i = 11. The same procedure is repeated for as many lines as the point cloud contains.
Thus, the noise is not added to the observations as a whole. We justify this line-wise strategy by the fact that TLS observations are recorded in such a way that the elapsed time between the last observation of one line and the first of the following line is much longer than the time between two observations within one line. Using this model, we are able to place ourselves in the context of a more general and potentially time-varying correlation structure during the scanning process, answering challenge (iii) mentioned in the introduction. Such an effect can be caused, e.g., by changing object properties or atmospheric conditions.
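The line-wise strategy can be sketched as follows (a hypothetical helper; `make_noise` stands for any per-line generator, e.g., an fGn simulator):

```python
import numpy as np

def add_linewise_noise(ranges, make_noise, seed=None):
    """Add one independent noise vector per scanning line.

    ranges     : (n_lines, n_per_line) array of simulated range values
    make_noise : callable (n, rng) -> length-n noise vector (e.g., fGn)
    """
    rng = np.random.default_rng(seed)
    noisy = ranges.copy()
    for line in range(ranges.shape[0]):
        # A fresh vector per line: correlations exist within a line but
        # not across the line break, where more time elapses.
        noisy[line] += make_noise(ranges.shape[1], rng)
    return noisy
```

Because every line receives its own draw, the Hurst parameter could in principle differ from line to line, which is exactly the time-varying structure the paper wants to accommodate.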

Estimation of the Hurst Exponent from the Residuals
In the following, we will compare the estimates of H with the three previously described estimators. An application to a real case scenario is presented in Section 4.5, as well as a generalization of the results in Section 4.6.
We start with the approximation of the plane (Section 4.1, Figure 3, left bottom). We approximate the PC with a cubic B-spline surface and fix the number of CP to estimate to 4, which is justified by the simple geometry of the simulated object [60]. Intentionally, we do not search for an optimal functional model, which could be based on an iterative method using information criteria [37]. We take a reference value of H_ref = 0.6 for the simulated noise. This reference value is close to H = 0.5 (white noise) and is challenging to estimate accurately.
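A least-squares B-spline surface approximation of the noisy plane can be sketched with SciPy as follows (our configuration uses one interior knot per direction, i.e., 5 × 5 coefficients, and white range noise only, which differs slightly from the paper's 4-CP, fGn setup):

```python
import numpy as np
from scipy.interpolate import LSQBivariateSpline

rng = np.random.default_rng(3)
n = 100
x = np.linspace(-1.0, 1.0, n)
X, Y = np.meshgrid(x, x)
# Noisy plane z = -3x + 15y + 7 (white noise stand-in, std 0.005 m).
Z = -3.0 * X + 15.0 * Y + 7.0 + 0.005 * rng.standard_normal((n, n))

# Weighted least-squares cubic B-spline surface; tx/ty list the interior
# knots only (one per direction here, a hypothetical choice).
spl = LSQBivariateSpline(X.ravel(), Y.ravel(), Z.ravel(),
                         tx=[0.0], ty=[0.0], kx=3, ky=3)
resid = Z.ravel() - spl.ev(X.ravel(), Y.ravel())
```

Since a plane is exactly representable by a cubic spline surface, the residuals essentially reproduce the injected noise; this is the property the paper exploits to read the noise's correlation structure off the residuals.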
Interestingly, this scenario is difficult to fit with B-splines due to the unfavorable orientation of the plane in space; it leads to a strong "border effect" in the B-spline surface approximation, which is not solved entirely by a higher knot multiplicity. This effect can be seen visually in the plot of the residuals as a strong increase of the variance at the beginning of each line (Figure 5, left). The correlation structure of the residuals does not depend on the stationarity (or not) of the residuals' variance. Consequently, we allowed ourselves to disregard the corresponding inaccurately approximated first epochs of each line; exactly the same results as those presented in the next section were found. We interpret this effect as being due to the self-similarity property of the noise and, thus, did not attempt to suppress it.

Impact of Model Misspecification: No Angle Noise
In order to highlight the impact of both the model misspecification and the angle noise on the estimation, we first noised the simulated range only. Contrary to real data analysis, the simulation framework allows for this flexibility.
The obtained range residuals of the adjustment are shown in Figure 5. Although the whole residual series may visually appear as white noise with a slight variance increase between t = 4×10⁵ and t = 7×10⁵ (units of time) due to the scanning configuration (Figure 5, left top), we identify a repetitive pattern for each approximated scanning line (Figure 5, left middle); such patterns are likely to negatively influence the estimators of the Hurst parameter that act in the frequency domain, such as the WhiE.
The PSD for the three cases (1–3) corresponding to Figure 5 (left) are plotted in Figure 5 (right). We note that, for the whole residual series, it has strong similarities with the PSD of the simulated noise. The expected power-law decay is kept nearly intact, which is beneficial to the wavelet and WhiE estimators. Additional frequencies for −2.5 < log10(f) < 1 are visible, which we link with the aforementioned repetitive pattern due to model misspecification. For decreasing sample size (Figure 5, right bottom), the low-frequency domain of the analyzed residuals from the first scanning line does not follow exactly that of the original noise. It is possible to compensate for this effect with the GHE by decreasing τ_max; this corresponds to down-weighting the impact of the low frequencies in the estimation of H.
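The PSD inspection described above amounts to a log-log slope fit: for an fGn, the low-frequency spectrum decays as f^(1−2H), so H = 0.5 gives a flat spectrum. A minimal Welch-based sketch with a white-noise stand-in for the residuals (our data, not the paper's):

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(42)
resid = rng.standard_normal(8192)     # stand-in for one line of range residuals

# Welch PSD; fitting a line in log-log coordinates estimates the power-law
# exponent, which for fGn is 1 - 2H.
f, pxx = signal.welch(resid, fs=1.0, nperseg=1024)
mask = f > 0                          # drop f = 0 before taking logs
slope = np.polyfit(np.log10(f[mask]), np.log10(pxx[mask]), 1)[0]
```

Deviations from a straight line in this plot are precisely the signatures used in the paper: a high-frequency flattening reveals additional white noise, while extra power at low frequencies points to model misspecification.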
For the sake of convenience and without loss of generality, we will carry out our explanation with the residuals of the first line.
We computed the Hurst exponent for cases 1 and 3 with the three chosen methods. Case 2 is of minor interest and was only presented to show the pattern of the residuals. For case 1, the whole residual series is considered in the estimation of H, resulting in a longer time series, whereas for case 3 we take the mean of the H estimated over smaller samples. For case 3, the standard deviation as well as the min/max values of H can be given (one Hurst exponent is estimated for each line). The results are presented in Table 1, to which we added the estimation of the Hurst exponent from the original generated noise for comparison purposes. Table 1 shows that, of the three estimators, the WhiE performs worst. This holds true particularly for case 3, for which the Hurst exponent is overestimated for both the simulated noise and the residuals; this effect was expected due to the small samples under consideration (1000 observations) and is reported in the literature as the main drawback of this estimator, under the assumption that the noise is fGn. For case 1 (whole residuals), the WhiE performs better than for case 3 due to the frequency averaging, but remains a poorer estimator compared with the GHE and the WE; both of the latter provide the true H. The GHE is less affected by the sample size than the wavelet estimator, i.e., in case 3 the standard deviation of H for the WE is higher than for the GHE, for both the noise and the residuals.
From this simulation, and without additional noise on the angles, the preference goes to the GHE when the Hurst exponent has to be evaluated for each line, i.e., for small samples. This is a welcome result when temporal variations of the parameter are expected [61], since they will be detected with higher trustworthiness than with the other estimators.
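A simplified q = 2 version of the GHE can be sketched as follows (our reduction of the estimator; passing the cumulative sum of fGn-like residuals, so that the increments carry the scaling, is an assumption of this sketch):

```python
import numpy as np

def ghe(path, q=2, tau_max=20):
    """Generalized Hurst exponent from the scaling of the q-th order
    structure function K_q(tau) = <|X(t+tau) - X(t)|^q> ~ tau^{qH}.

    `path` is treated as an fBm-like path; for fGn-like residuals,
    pass np.cumsum(residuals)."""
    taus = np.arange(1, tau_max + 1)
    kq = np.array([np.mean(np.abs(path[t:] - path[:-t])**q) for t in taus])
    # Slope of log K_q vs log tau equals q * H.
    slope = np.polyfit(np.log(taus), np.log(kq), 1)[0]
    return slope / q
```

The `tau_max` argument plays the role discussed in the text: a smaller value down-weights the low-frequency (large-lag) behavior, a larger value emphasizes it.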
Similar results are obtained for H_ref = 0.7 and H_ref = 0.9; they are not presented for the sake of brevity and readability.

Impact of Model Misspecification and Angle Noise
In a second step, we added a noise with a standard deviation of 1×10⁻⁴ ° to the angle components. In order to visually identify the difference between the slope of the PSD of the noise and that of the residuals, which are affected by additional white noise, we consider the case H_ref = 0.9 (see Figure 6). This is a challenging exponent to estimate, since the corresponding process is close to a flicker noise; results for other H are similar when the same methodology is applied. From Figure 6, the impact of the additional noise coming from the angles and propagating into the range residuals can be clearly identified in the high-frequency domain, i.e., from log10(f) > −0.6 for case 1 and log10(f) > −0.4 for case 3. This corresponds to a noise level of approximately −40 dB/Hz for case 1, −45 dB/Hz for case 2, and −57 dB/Hz for case 3.
The corresponding Hurst exponents were estimated, and the results are presented in Table 2. As previously, we also give the results obtained for the simulated noise. The first observation to draw from Table 2 is that estimating the Hurst parameter from the true fGn is more difficult for small samples (case 3) than for longer samples (case 1). The mean values are slightly below the true value of H_ref = 0.9 for the GHE and WE estimators, with a higher standard deviation than in the previous case with H_ref = 0.6. Clearly, the WhiE performs poorly and systematically underestimates the true parameter.
The second observation is the impossibility of extracting the correct Hurst exponent, or a value close to it, independently of the case under consideration. The noise of the angles, as well as the noise induced by the fitting, leads to a strong underestimation of H, close to 0.7. The decrease towards H = 1/2 (white noise) is due to the increase of white noise in the signal. As previously, the WhiE estimates H poorly (0.53 for case 1); it is thus shown to be strongly affected by additional white noise on the residuals.
The analysis of the PSD (Figure 7) for cases 1 and 3 highlights the impact of additional white noise. From Equations (9) and (13), we notice that the GHE and WE both need low and high frequencies to approximate the Hurst exponent trustworthily; it therefore seems advantageous to filter the high-frequency noise out of the residuals. In this contribution, we propose to apply a first-order lowpass Butterworth filter to the residuals, with the cutoff frequency set where the PSD kicks towards white noise. This choice is not justified by empirical findings alone, and we detail in the following the reasons why we opted for the Butterworth filter.

Why a Butterworth Filter?
A sharp lowpass filtering would lead to an abrupt decrease of the PSD from the cutoff frequency of the filter; this effect is unwanted here, as the estimation of H requires the whole range of frequencies, which would no longer be available. We prefer, thus, the "smooth and gentle" first-order Butterworth filter; it allows a continuous decrease of the high frequencies from the cutoff frequency. This leads to the filtering shown in Figure 7 (green line), where the PSD of the filtered signal has the same decrease as the reference noise from log10(f) = −2 for case 1 and log10(f) = 0 for case 3. The filtering leads to an estimate of the Hurst exponent of 0.87 (std 0.01) and 0.89 (std 0.03) with the GHE for cases 1 and 3, respectively (see Table 2). τ_max was fixed to 20, as proposed in the literature [51]. Increasing τ_max to 40 leads to a slight decrease of H of 0.02 with an increase of the std to 0.03, whereas τ_max = 5 is linked with an increase of H of 0.03 and a decrease of the std to 0.008. Thus, a balance has to be found to fix τ_max optimally. A careful analysis of the PSD, i.e., of the amount of power at low frequencies, is necessary; whereas τ_max can be used to filter unwanted low frequencies due to model misspecification, the cutoff frequency acts on the high-frequency domain of the PSD.
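The filtering step can be sketched as follows (the helper name and the zero-phase `filtfilt` choice are ours; the paper only specifies a first-order lowpass Butterworth):

```python
import numpy as np
from scipy import signal

def gentle_lowpass(resid, fc, fs=1.0):
    """First-order lowpass Butterworth (gentle -20 dB/decade roll-off),
    applied forward-backward so the residuals are not phase-shifted."""
    b, a = signal.butter(1, fc, btype='low', fs=fs)
    return signal.filtfilt(b, a, resid)
```

A higher-order (sharper) design would suppress the frequencies above `fc` too abruptly, creating exactly the kink in the PSD that biases the slope-based estimation of H; the first-order roll-off keeps the decay continuous.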
With the chosen cutoff frequency, the WE overestimates H. Using a cutoff frequency of log10(fc) = −0.35 instead of log10(fc) = −0.4 for case 3 yields H = 0.88. Similar results are obtained for case 1 with the WE by increasing log10(fc) to −0.55. Unfortunately, the GHE decreases to 0.83 in both cases. However, considering that (1) no prior knowledge of the Hurst exponent was available, (2) additional white noise affects the residuals, and (3) model misspecifications are present, this remains a good approximation of the true H of 0.9. Indeed, this value of the Hurst parameter is known to be challenging to estimate, since it is close to the limit between fGn and fBm.

Why First Order?
The answer is strongly related to the previous one: as the Bode plot in Figure 2 shows, the flatness of the filter is of main importance to ensure a smooth transition in the PSD of the filtered signal.

Sensitivity Analysis: Impact of the Cutoff Frequency
In this section, we analyze the sensitivity of the estimated Hurst exponent to the chosen cutoff frequency. Figure 8 summarizes the results for cases 1 (top) and 3 (bottom), obtained by varying log10(f) from −0.7 to −0.25. Clearly, the GHE is much less sensitive to the cutoff frequency than the WE. A linear dependency can be found, with a variation of H from 0.88 to 0.78 over the chosen range of cutoff frequencies for case 1 and from 0.87 to 0.80 for case 3. H estimated with the WE has a much wider range of values: from 1 to 0.68 and from 1.15 to 0.8 for cases 1 and 3, respectively. From these results, and considering that we intentionally placed ourselves in a challenging estimation scenario with a strong Hurst exponent, we recommend using the GHE instead of the WE when the residuals are filtered and small samples are considered.

Small Samples
In this section, we place ourselves in case (i) as described in Section 4.1 and generate smaller samples of 400 observations per line. We chose three values for H (0.6, 0.7, and 0.9) and applied our methodology to filter the residuals from additional white noise and/or effects of model misspecification. We identify the cutoff frequency fc visually by plotting (1) the PSD of the whole residual series for case 1 and (2) the PSD of 10 randomly chosen lines for case 3, averaging the identified cutoff frequencies. The Hurst exponent is estimated with the GHE; due to their asymptotic properties, the WE and the WhiE are known to perform poorly for small samples [62].
For H_ref = 0.6, the PSD is nearly similar to that of a white noise (see Figure 1). This makes the PSD kick harder to identify. Nevertheless, we were able to identify the correct cutoff frequency with high confidence, and a value of H = 0.60 could be estimated for case 1 with log10(fc) = −0.5. We link this result to the use of a first-order Butterworth filter and to the low sensitivity of the GHE to a misspecification of the cutoff frequency. The same cutoff was used for the two other simulated H_ref. This result strongly confirms the feasibility of extracting the Hurst exponent from residuals of regression B-splines, in the presence of both functional misspecifications and additional unknown noise. The cutoff frequency depends on the angle noise variance, as illustrated in Table 3: increasing σ_HA = σ_VA to 0.001° instead of 0.0001° yields a different cutoff frequency. We intentionally do not present further results of this kind, in order not to overload the reader with simulations that lead to similar conclusions.

Result for Gaussian Surface
The second example corresponds to a simulated Gaussian surface (case (ii), Section 4.1). Ten CP in each of the two directions were estimated with B-splines of order three. The stationary reference noise was simulated with H_ref = 0.7 and 1000 observations per line. As in the previous simulations, we do not aim to fit the surface optimally, so that the impact of potential misspecification can be considered. In Figure 9 (left), the PSD of the residuals is shown together with the PSD of the simulated noise; Figure 9 (right) represents the residuals for cases 1 and 3, respectively, following the previous section. The latter figure highlights the lack of repetitive patterns in the residuals plotted per line (case 3). Only a steady increase of the variance towards the middle of the surface can be seen, which is coherent with the Gaussian form of the surface (Figure 3, left top). This behavior does not affect the estimation of the Hurst parameter, which was 0.72 (std 1×10⁻³) for case 1 and 0.71 (std 0.01) for case 3, with or without filtering. From the PSD, a low additional white noise from log10(fc) = 0 could be identified, which did not affect the determination of H. We interpret this lack of additional white noise in the residuals as coming from the goodness of the surface approximation, i.e., the B-splines themselves act as a lowpass filter, so that no additional noise coming from the angles could drift into the residuals in this case. However, we were able to decrease the estimated Hurst parameter for case 1 to 0.69 (std 5×10⁻⁴) by decreasing τ_max to 10, i.e., by decreasing the impact of the low frequencies. In this case, the B-spline LS system filters the low-frequency domain strongly, which could be accounted for by acting on τ_max. This correction highlights the potential of our methodology to identify and filter model misspecification from the LS residuals.
The methodology is and remains based on a visual analysis of the PSD and on an understanding of the residuals prior to the estimation of the Hurst parameter. It is not recommended to use a bandpass Butterworth filter (or any other filter, such as a notch filter) to remove specific frequencies; this was shown to strongly affect the determination of the Hurst exponent by creating an artificial decrease of frequency amplitudes in the middle of the frequency range, where a regular decay is of main importance for the determination of H.

Application to Real Data
We now apply the proposed methodology to a real case scenario. Unfortunately, the true correlation structure is unknown; the development of a model based on a physical explanation of the TLS correlations is beyond the scope of this paper and is left to further studies.
A white plane of size 1 m × 1 m was scanned at a distance of 10 m with a Z+F 2016F using the scanning mode "extremely high", with which 1 million points per second can be recorded. The scanning configuration is presented in Figure 10 (left); it is optimal and corresponds to the simulated data, with no tilt and the TLS pointing in the direction of the z-axis. The obtained point cloud was pre-processed to avoid edge effects and outliers, and cut using a free software tool. We finally approximated the data with a cubic B-spline, following the methodology presented in Section 2. The residuals for one scanning line are plotted in Figure 10 (right, top), together with the corresponding PSD (Figure 10, right, bottom, blue line). We visually identified a cutoff frequency of log10(fc) = 0, which we used to filter the residuals with a Butterworth filter (Figure 10, bottom, yellow line). As for the simulations, the results obtained with the three Hurst estimators proposed in this contribution differ. Without filtering, we found values of 0.85 for the GHE with τ_max = 20 (chosen due to the lack of additional low frequencies from an inaccurate functional model), 1.01 (i.e., flicker noise) for the WhiE, and 0.61 for the WE. This last result highlights that the WE is affected by white noise (the value found is close to 0.5) and by the small number of observations used (900 per line). The tendency of the WhiE to overestimate the Hurst parameter with respect to the GHE (Table 1) is additionally confirmed. Using the visually identified cutoff frequency, the Hurst estimate increased to 0.88 for the GHE but stayed constant for the WE; this estimator is definitely not a relevant choice for the case study under consideration. A high value of 1.61 was found for the WhiE, which is thus unusable with the filtered residuals, i.e., the WhiE, being a spectral estimator, is affected by the strong decrease of the PSD at high frequencies.
Figure 10 (right bottom) confirms the plausibility of considering the noise of the TLS range measurements as fGn; the blue line corresponds to a reference fGn with H = 0.9 and is nearly parallel to the yellow one from the filtered residuals.
This short case study validates the proposed methodology for a real case scenario: it is feasible to estimate the Hurst parameter from the range residuals of a plane scanned with a TLS. Further studies will be carried out in the near future to investigate the correlation structure more deeply; it is expected to depend on, e.g., the scanning rate, the distance from the plane to the TLS, and atmospheric conditions.

Summary: A Methodology to Extract the Hurst Parameter from the B-Spline Residuals
In this section, we summarize our methodology to extract the Hurst parameter of the noise underlying TLS range measurements from the LS residuals (Figure 11). We recall that the noise is simulated line-wise as an fGn with a given Hurst parameter varying from 0.5 to 1 (persistent correlations). When working with real observations, this assumption has to be tested by analyzing the stationarity of the time series as well as the power law of its PSD.
We start with the raw polar observations, which are transformed into Cartesian coordinates. After parametrizing the point cloud, a B-spline surface approximation is performed. The choice of the order of the B-splines (e.g., cubic) is left to the user, as are the method for fixing the knot vector optimally and/or the number of CP to estimate. The residuals of the approximation are transformed back into polar coordinates; only the range residuals are analyzed further. They are plotted as a whole and line-wise to visually identify the potential impact of model misspecification (low frequencies, repetitive patterns), which could act, as a snowball effect, on the determination of the Hurst parameter. As an important tool for understanding the structure of the residuals, the PSD is plotted against the frequencies (a log-log plot should be used for better visualization). Additional white noise or model misspecification can slide into the frequency domain; they are identified and filtered with a first-order lowpass Butterworth filter. We recommend the use of the generalized Hurst estimator, which was shown to be robust to slight uncertainties in the determination of the cutoff frequency, as well as less sensitive to small-sample effects than the wavelet estimator. Thus, temporal variations of the Hurst exponent can be analyzed by performing a line-wise analysis of the Hurst parameters. Figure 11. Summary of the methodology to extract the Hurst parameter from the range residuals of a B-spline approximation of a TLS point cloud.

Conclusions
In this contribution, we have developed and validated an innovative yet simple strategy to extract the correlation structure of the underlying observation noise from the residuals of a B-spline surface approximation. This determination is based neither on least-squares estimation nor on collocation, is non-parametric, and has the main advantages of being easy to use and computationally efficient.
Our case study dealt with TLS raw observations, having in mind to analyze the correlations of the range observations to perform more rigorous and trustworthy statistical tests for deformation of scanned objects. This is a highly relevant application for avoiding and/or quantifying the potential risk related to the deformation of structures such as dams or bridges. Moreover, knowledge of the correlation structure could serve to predict future deformations.
The range measurements of TLS observations are known to be temporally correlated. We conjecture from physical considerations that the power spectral density of the noise can be represented by a power law; the framework of LRD allows such noise to be described accurately. In this study, we chose to model the correlation structure of the TLS range measurements by a stationary, persistent fGn. The fGn is widely used to describe noise in various domains and is fully described by its Hurst exponent (related to the fractal dimension) and its variance. There exist various estimators for the Hurst exponent; in this contribution, we compared the performance of one from each family: the generalized Hurst estimator, the Whittle likelihood estimator, and the wavelet estimator. We simulated small and longer samples, as well as observation noise with different Hurst exponents. Our goal was to determine the reference parameter as accurately as possible from the B-spline range residuals. Regression B-spline surface fitting can be applied to nearly every noisy and scattered point cloud, without limitation to specific surfaces such as circles or planes. Even if they are structurally correlated, the residuals of the approximation still contain information about the correlation and noise structure of the raw observations. Unfortunately, as in every approximation model, misspecifications are likely to arise. They introduce additional frequencies into the residuals, which affect the determination of the Hurst parameter. Simulating a plane, we identified unwanted white noise as strongly affecting the estimation. A lowpass Butterworth filter of the first order applied to the residuals was able to correct the bias induced by the unwanted additional white noise. The generalized Hurst estimator was shown to be robust against slight over- or underestimation of the cutoff frequency of the filter.
The Whittle likelihood estimator performs badly in estimating H, which we linked to the potential non-stationarity of the residuals, i.e., the assumption that the residuals are fGn is mandatory for this estimator. The wavelet estimator performs ideally in the absence of white noise but was shown to be sensitive to the choice of the cutoff frequency. We interpret this behavior as being linked to its lack of averaging, compared with the GHE.
Simulating a Gaussian surface, the impact of model misspecification in the low-frequency domain was highlighted; it was filtered adequately with a highpass Butterworth filter to improve the determination of the Hurst exponent.
For both simulated cases, similar conclusions were drawn: the Hurst exponent can be well determined with the GHE, provided that a prefiltering of the residuals with the smooth first-order Butterworth filter is performed. The cutoff frequency could be visually identified from the PSD of the residuals (line-wise or as a whole). The feasibility of the proposed methodology was confirmed using real data from a plane scanned with a TLS at the "extremely high" resolution.
This powerful way of identifying the noise structure from the residuals paves the way for a deeper study of the correlation dependency of TLS range measurements, independent of specific calibration procedures. Due to the high accuracy and precision of the determination of the fractal parameter, potential atmospheric parameters, as well as sensor characteristics, could be deduced from the B-spline residuals. This analysis will be the topic of a later study with real data. The estimation of the range variance remains to be solved; a proposal could be based on the calibration of the LS system with white noise.