Stitching Locally Fitted T-Splines for Fast Fitting of Large-Scale Freeform Point Clouds

Parametric splines are popular tools for precision optical metrology of complex freeform surfaces. However, as a promising topologically unconstrained solution, existing T-spline fitting techniques, such as improved global fitting, local fitting, and split-connect algorithms, still suffer the problems of low computational efficiency, especially in the case of large data scales and high accuracy requirements. This paper proposes a speed-improved algorithm for fast, large-scale freeform point cloud fitting by stitching locally fitted T-splines through three steps of localized operations. Experiments show that the proposed algorithm produces a three-to-eightfold efficiency improvement from the global and local fitting algorithms, and a two-to-fourfold improvement from the latest split-connect algorithm, in high-accuracy and large-scale fitting scenarios. A classical Lena image study showed that the algorithm is at least twice as fast as the split-connect algorithm using fewer than 80% control points of the latter.


Introduction
In precision optical metrology, point cloud data processing enables high-precision shape measurement and surface reconstruction, which is important for applications requiring very high measurement accuracy, such as engineering quality control, product inspection, and reverse engineering [1][2][3].In the realm of smart sensing, point cloud data processing finds critical utility in environmental sensing and obstacle detection, addressing applications of paramount importance [4][5][6].Consequently, point cloud data processing plays a crucial role in both smart sensing and precision optical metrology, offering essential tools and a data foundation for resolving intricate challenges.Point cloud data consist of a large number of three-dimensional (3D) coordinate points, which are usually obtained by optical measurement devices, such as light detection and ranging (LIDAR) and 3D scanning [7].At present, advanced optical scanning has enabled the acquisition of over ten million 3D point clouds within a second [8,9].For precision metrology of surfaces with complex freeform geometry, dense point cloud acquisition has become an ordinary measurement routine, using which a denoised point cloud, polygon mesh, or CAD model is finally fitted [10].The spline technique has clearly become an indispensable part of surface fitting in point cloud data processing due to its many advantages such as smoothness, mathematical feasibility, and controllability [11].
Splines are widely used in interpolation, smoothing, and modeling in computer-aided geometric design, metrology, graphics, and digital signal processing [12].With the development of powerful node insertion and order elevation algorithms, B-splines and non-uniform rational B-splines (NURBS) have become the mainstream mathematical methods of shape of three computational steps with three localized operations.Firstly, LAST divides a largescale input point cloud into several local patches.Then each data patch is independently fitted to a T-spline through a locally accelerated fitting scheme.Lastly, the individually obtained T-splines are stitched using a local optimization scheme to a global output to ensure their mutual continuities.LAST results in concise T-mesh structures with a dramatic CP-saving performance by preserving the local support characteristics of CPs.We have compared the performance of LAST against the latest techniques, including a global fitting algorithm, a local fitting algorithm, and a split-connect algorithm.The results show that the computational speed is about two-to-eightfold faster than the compared algorithms.In the classical Lena RGB image case study, LAST is shown to be at least twice as fast as the split-connect algorithm, using fewer than 80% CPs of the latter.The proposed method can also be applied with advanced T-spline models, such as AST-splines, and hierarchical T-splines [18][19][20][21][22][23], providing a corresponding mesh refinement rule.

State-of-the-Art Algorithm Analyses 2.1. T-Spline Fundamentals
A 3D T-spline q(s, t) ∈ R 3 defined on a control mesh of n-CPs allowing T-junctions, known for the T-mesh as presented in Figure 1a, is similar to a NURBS in the form of where (s, t) ∈ R 2 is a parametric location mapped, e.g., using conformal mapping [36] (or simply normalized x and y coordinates), from a Cartesian measured point (x, y, z) to a parametric space; P is an n-row CP matrix, each row of which represents a CP p i ∈ R 3 ; b is a T-spline blending [17] vector with the i-th blending function entry b i (s, t) in the form of: b i (s, t) = N i,k (s) where N[u i ](s) and N[ν i ](t) are B-spline basis functions related to knot vectors u i and ν i , respectively.T-spline knot vectors are uniquely determined by the topology of a T-mesh obtained through Sederberg's T-spline Rule 1 [16], for example in Figure 1b.
In this paper, we propose a locally accelerated stitching T-spline (LAST) algorithm for large-scale freeform point cloud fitting.LAST breaks through the efficiency bottleneck of data scales by dividing a sizable fitting problem into several minor local fitting problems.LAST does not follow a top-down nor a bottom-up strategy but adopts both.LAST consists of three computational steps with three localized operations.Firstly, LAST divides a large-scale input point cloud into several local patches.Then each data patch is independently fitted to a T-spline through a locally accelerated fitting scheme.Lastly, the individually obtained T-splines are stitched using a local optimization scheme to a global output to ensure their mutual continuities.LAST results in concise T-mesh structures with a dramatic CP-saving performance by preserving the local support characteristics of CPs.We have compared the performance of LAST against the latest techniques, including a global fitting algorithm, a local fitting algorithm, and a split-connect algorithm.The results show that the computational speed is about two-to-eightfold faster than the compared algorithms.In the classical Lena RGB image case study, LAST is shown to be at least twice as fast as the split-connect algorithm, using fewer than 80% CPs of the latter.The proposed method can also be applied with advanced T-spline models, such as AST-splines, and hierarchical T-splines [18][19][20][21][22][23], providing a corresponding mesh refinement rule.

T-Spline Fundamentals
A 3D T-spline  (, ) ∈ ℝ defined on a control mesh of n-CPs allowing T-junctions, known for the T-mesh as presented in Figure 1a, is similar to a NURBS in the form of where (, ) ∈ ℝ is a parametric location mapped, e.g., using conformal mapping [36] (or simply normalized  and  coordinates), from a Cartesian measured point (, , ) to a parametric space;  is an n-row CP matrix, each row of which represents a CP  ∈ ℝ ;  is a T-spline blending [17] vector with the i-th blending function entry  (, ) in the form of: where   () and   () are B-spline basis functions related to knot vectors  and  , respectively.T-spline knot vectors are uniquely determined by the topology of a Tmesh obtained through Sederberg s T-spline Rule 1 [16], for example in Figure 1b.As presented in Figure 1a, a T-spline blending function is a locally supported Bspline basis function that can be recursively calculated using de Boor-Cox's recursion formulae [13].Given the knot vectors of a natural C 2 bicubic spline with the multiplicity of one, the recursively obtained blending functions have the following forms, with which repetitive recursion calculations can be avoided in practical calculations: for other s. ( T-splines break the topological limitation of B-splines and realize local refinement.This indicates that CPs can be locally allocated around a feature without inserting CPs in a whole row or column.T-splines are a generalization of NURBS, both of which can convert mutually.Research has shown that the CP amounts reduce to at least one quarter of the original in many computer-aided design cases when a NURBS simplifies to a T-spline, on the premise of a small accuracy difference [17].

Global T-Spline Fitting
A top-down global T-spline fitting normally approximates from a regular tensorproduct B-spline grid, such as in Figure 2. In most cases, a B-grid is extended around its boundaries using additional CPs for improved boundary geometrical control, including both surface heights and their partial derivatives [34].The regular grid is then locally refined by inserting CPs and splitting edges at the violating grids or elements where fitting errors exceed a preset threshold in a repetitive way.
[ ]( ) T-splines break the topological limitation of B-splines and realize local refinement.This indicates that CPs can be locally allocated around a feature without inserting CPs in a whole row or column.T-splines are a generalization of NURBS, both of which can convert mutually.Research has shown that the CP amounts reduce to at least one quarter of the original in many computer-aided design cases when a NURBS simplifies to a T-spline, on the premise of a small accuracy difference [17].

Global T-Spline Fitting
A top-down global T-spline fitting normally approximates from a regular tensorproduct B-spline grid, such as in Figure 2. In most cases, a B-grid is extended around its boundaries using additional CPs for improved boundary geometrical control, including both surface heights and their partial derivatives [34].The regular grid is then locally refined by inserting CPs and splitting edges at the violating grids or elements where fitting errors exceed a preset threshold in a repetitive way.During each iterative refinement, the CP matrix  updates by solving the following least-squares problem ( ) ( ) where  is an -by-3 measured point cloud matrix, each row representing an measured point  = ( ,  ,  );  = ( ,  ), … , ( ,  ) is an m-by-n T-spline blending matrix consisting of  ( ≫ ) blending vectors corresponding to each measured point in the direction of columns;  denotes the corresponding normalized version of B, known as the T-spline design matrix; B and  are generally sparse, whose sparsity patterns for matrix inversion are presented in Figure 3a, due to the local support of blending functions.During each iterative refinement, the CP matrix P updates by solving the following least-squares problem where Q is an m-by-3 measured point cloud matrix, each row representing an measured point q i = (x i , y i , z i ); B = [b(s 1 , t 1 ), . . . ,b(s m , t m )] T is an m-by-n T-spline blending matrix consisting of m (m n) blending vectors corresponding to each measured point in the direction of columns; B denotes the corresponding normalized version of B, known as the T-spline design matrix; B and B are generally sparse, whose sparsity patterns for matrix inversion are presented in Figure 3a, due to the local support of blending functions.In global strategies, if a violating grid element is refined, the newly inserted CPs and the original CPs are updated from all the input sample data through Equation ( 4).This global strategy results in tedious calculations, including repetitive design matrix updates and their pseudo-inverse.Figure 3c,d shows an experimental fitting analysis of a stainless steel surface in Figure 3b.It indicates that the spline model complexity quickly increased in the initial iteration cycles.During the last several cycles until the computation converged, the CP number grew slowly and finally approached zero.Simultaneously, the time cost for CP-solving continued to increase, and the design matrix s sparsity gradually converged to a small percentage.Figure 3c,d also shows that most computing resources focus on the pseudo-inverse-based CP-solving process.The sparsity patterns of the design matrices result in an efficiency-considerable computational complexity in ( .) for matrix inversion regarding the number of CPs.In large-scale point cloud and high-accuracy fitting cases, a design matrix may reach over a million rows and a thousand columns, resulting in it being nearly impossible to solve the least-squares problem.

Local T-Spline Fitting
Locally accelerated T-spline fitting, or simply local T-spline fitting, as an accelerated version of global algorithms, has been proposed in our previous work [27,34].The local algorithms follow the same top-down strategy as the global algorithms, as shown in Figure 4, including iterative mesh refinement and CP updates.The core difference is that the local algorithms only update the support-altered local CPs around error-violating meshes.In this way, the matrix multiplication   in Equation ( 1) disassembles into a varying part    with the local support-altered CPs to update, plus an unchanged part  ( )  , where M denotes the index set of the measured points participating in the fitting;  and   denote, respectively, the index sets of the support-altered and support-unchanged CPs.In AST-spline cases [27] where the partition of unity of blending functions always holds, the matrix partition can be simply written as Therefore, by substituting Equation (5) in Equation ( 4), the local CPs to update may be significantly accelerated by solving the pseudo-inverse of the size-reduced CP matrix  only, focusing on the limited areas with complex local geometries.It can also be noted that in AST-spline cases, the unchanged design matrix part  ( ) always stays the same as that of the last iteration, so repetitive design matrix updates are also avoided.With the local acceleration, a large-scale least-squares problem becomes minor and solvable.In global strategies, if a violating grid element is refined, the newly inserted CPs and the original CPs are updated from all the input sample data through Equation ( 4).This global strategy results in tedious calculations, including repetitive design matrix updates and their pseudo-inverse.Figure 3c,d shows an experimental fitting analysis of a stainless steel surface in Figure 3b.It indicates that the spline model complexity quickly increased in the initial iteration cycles.During the last several cycles until the computation converged, the CP number grew slowly and finally approached zero.Simultaneously, the time cost for CP-solving continued to increase, and the design matrix's sparsity gradually converged to a small percentage.Figure 3c,d also shows that most computing resources focus on the pseudo-inverse-based CP-solving process.The sparsity patterns of the design matrices result in an efficiency-considerable computational complexity in O n 2.2 for matrix inversion regarding the number of CPs.In large-scale point cloud and high-accuracy fitting cases, a design matrix may reach over a million rows and a thousand columns, resulting in it being nearly impossible to solve the least-squares problem.

Local T-Spline Fitting
Locally accelerated T-spline fitting, or simply local T-spline fitting, as an accelerated version of global algorithms, has been proposed in our previous work [27,34].The local algorithms follow the same top-down strategy as the global algorithms, as shown in Figure 4, including iterative mesh refinement and CP updates.The core difference is that the local algorithms only update the support-altered local CPs around error-violating meshes.In this way, the matrix multiplication BP in Equation ( 1) disassembles into a varying part BML P L with the local support-altered CPs to update, plus an unchanged part BM(N−L) P N−L , where M denotes the index set of the measured points participating in the fitting; L and N − L denote, respectively, the index sets of the support-altered and support-unchanged CPs.In AST-spline cases [27] where the partition of unity of blending functions always holds, the matrix partition can be simply written as Therefore, by substituting Equation (5) in Equation ( 4), the local CPs to update may be significantly accelerated by solving the pseudo-inverse of the size-reduced CP matrix P 1  L only, focusing on the limited areas with complex local geometries.It can also be noted that in AST-spline cases, the unchanged design matrix part BM(N−L) always stays the same as that of the last iteration, so repetitive design matrix updates are also avoided.With the local acceleration, a large-scale least-squares problem becomes minor and solvable.The local computation task may be further accelerated by only using a subset of the input m-point cloud, e.g., by only using the neighbor point sets within the support  .Figure 5a shows that the local fitting method reduces fitting time costs to half that of the global method in relatively high-accuracy cases, coinciding with previous studies.In lowaccuracy cases, an initial B-spline fit plus limited refinement iterations may converge, with the result that the efficiency superiority could be not noticeable.In addition, the efficiency improvement may also vanish due to the local computation characteristic if a fitted surface does not possess a sparse property [34], e.g., a surface has a uniform random spatial distribution of features, like hills and valleys.In the latter extreme situation or if the required fitting accuracy is very high, the local fitting algorithm may degenerate into a general global one.

Split-Connect Fitting
Split-connect algorithms [33] use a bottom-up approach to fit T-splines, thus avoiding conventional large-scale iteration problems.As presented in Figure 6, a split-connect algorithm includes the following three processes: The local computation task may be further accelerated by only using a subset of the input m-point cloud, e.g., by only using the neighbor point sets within the support P L . Figure 5a shows that the local fitting method reduces fitting time costs to half that of the global method in relatively high-accuracy cases, coinciding with previous studies.In low-accuracy cases, an initial B-spline fit plus limited refinement iterations may converge, with the result that the efficiency superiority could be not noticeable.In addition, the efficiency improvement may also vanish due to the local computation characteristic if a fitted surface does not possess a sparse property [34], e.g., a surface has a uniform random spatial distribution of features, like hills and valleys.In the latter extreme situation or if the required fitting accuracy is very high, the local fitting algorithm may degenerate into a general global one.The local computation task may be further accelerated by only using a subset of the input m-point cloud, e.g., by only using the neighbor point sets within the support  .Figure 5a shows that the local fitting method reduces fitting time costs to half that of the global method in relatively high-accuracy cases, coinciding with previous studies.In lowaccuracy cases, an initial B-spline fit plus limited refinement iterations may converge, with the result that the efficiency superiority could be not noticeable.In addition, the efficiency improvement may also vanish due to the local computation characteristic if a fitted surface does not possess a sparse property [34], e.g., a surface has a uniform random spatial distribution of features, like hills and valleys.In the latter extreme situation or if the required fitting accuracy is very high, the local fitting algorithm may degenerate into a general global one.

Split-Connect Fitting
Split-connect algorithms [33] use a bottom-up approach to fit T-splines, thus avoiding conventional large-scale iteration problems.As presented in Figure 6, a split-connect algorithm includes the following three processes:

Split-Connect Fitting
Split-connect algorithms [33] use a bottom-up approach to fit T-splines, thus avoiding conventional large-scale iteration problems.As presented in Figure 6, a split-connect algorithm includes the following three processes: 1.
Independent patch split: Iteratively fit an input point cloud to simple, small, and independent Bézier-or B-spline patches until a pre-set error threshold is achieved.Each spline patch corresponds to a rectangular grid of different sizes in a parameter domain.This step is similar to the refinement iteration of a global or local fitting scheme.The core difference is that the obtained patches here are topologically independent, i.e., the local support of a CP regarding its neighbor patches is neglected, and thus, the fitting of all the separate patches can be very fast.T-spline fitting: Calculate the T-spline CPs.Since CPs have been obtained in the first step, though they are inaccurate, optimization of the CPs using a preconditioned conjugate gradient (PCG) algorithm [31] has proven to be remarkably faster than general least squares.

Split-Connect Fitting
Split-connect algorithms [33] use a bottom-up approach to fit T-splines, thus avoiding conventional large-scale iteration problems.As presented in Figure 6, a split-connect algorithm includes the following three processes:  Split-connect has proved to be an order of magnitude faster than classical global algorithms, attributed to the local support ignorance during its initial patch split.However, the final T-spline optimization process reconsiders the local support characteristics of CPs and geometrical continuities between split patches, with the result that initially obtained fitting accuracy cannot hold to the end.An accuracy degeneration phenomenon shown in Figure 5b indicates that the final resulting T-spline's root-mean-square error (RMSE) is usually twice the value before a final patch connection, and it could be fourfold or even larger than a pre-set maximum fitting error threshold [10].Therefore, specifying a smaller error threshold is usually required in the initial split fitting to satisfy a final accuracy requirement.However, the support ignorance and the smaller error threshold setting in the initial independent split may easily result in superfluous CPs. Figure 5a indicates that the split-connect fitting may produce over twice the CP amounts of the top-down algorithms in high-accuracy cases.

LAST Fitting
This section introduces the proposed locally accelerated stitching T-spline fitting algorithm for fast, large-scale freeform point cloud fitting with a high capability of accuracy control.The algorithm combines bottom-up and top-down strategies into a common framework; hence, the advantages of both strategies can be benefited.The algorithm consists of three computational steps of localized acceleration operations, as presented in Figure 7a.Firstly, LAST divides a large-scale input point cloud into several local patches.Then each data patch is independently fitted to a T-spline through a locally accelerated fitting scheme.Lastly, the individually obtained T-splines are stitched using a local optimization scheme to a global output to ensure their mutual continuities.The distinction of LAST from the latest split-connect algorithm is that we segment a point cloud into minor patches for local T-spline fitting, rather than splitting it into totally independent B-spline patches.Also, in the final stitching process, only the local CPs around each patch boundary are locally updated using PCG optimization instead of updating all CPs.The local spline fit-andstitching behavior results in a balance of computational speed improvement and accuracy maintenance.The algorithm details are explained in the following.

Local Patch Segmentation
Providing an input 3D Cartesian point set to properly map to a parametric space [36], e.g.,  = 0,1 ⊗ 0,1 , this step divides the cloud into K minor data patches in the parametric space simply according to sample sizes for subsequent fitting within each patch.Each local patch then comprises an independent T-spline with local-support CPs.Hence, a large-scale fitting problem becomes multiple minor ones.During each local patch fitting, the design matrix dimension in rows reduces to 1/K of the original, significantly improving CP-solving speed according to the matrix inversion time curve in Figure 3d.
For a depth map with regular sample points, a uniform segmentation with the same patch size, i.e., the sample size of a patch, is usually expected.Figure 7b shows a four-level hierarchical dimidiate segmentation of a regularly sampled surface depth map.For an irregular point cloud, the segmentation should be adaptive to the distribution of its local point densities.To guarantee that the patches can be finally stitched in a T-spline space, the adaptively segmented patches must be topologically square in a parametric domain.In contrast, the patch sizes can vary, and T-junctions are allowed at their boundaries.
To achieve the highest efficiency, the patch size should be moderate.A too-large patch size results in the LAST fitting deteriorating to a brute-force global fitting; a toosmall one may result in the algorithm deteriorating to a bottom-up split-connect fitting with a uniform initial patch split.A numerical analysis regarding the influence of patch segmentation amount on computational time cost and fitting RMSE was carried out, as shown in Figure 7e.It shows that in the early period of patch number increasing, e.g., <512

Local Patch Segmentation
Providing an input 3D Cartesian point set to properly map to a parametric space [36], e.g., D = [0, 1] ⊗ [0, 1], this step divides the cloud into K minor data patches in the para- metric space simply according to sample sizes for subsequent fitting within each patch.Each local patch then comprises an independent T-spline with local-support CPs.Hence, a large-scale fitting problem becomes multiple minor ones.During each local patch fitting, the design matrix dimension in rows reduces to 1/K of the original, significantly improving CP-solving speed according to the matrix inversion time curve in Figure 3d.
For a depth map with regular sample points, a uniform segmentation with the same patch size, i.e., the sample size of a patch, is usually expected.Figure 7b shows a four-level hierarchical dimidiate segmentation of a regularly sampled surface depth map.For an irregular point cloud, the segmentation should be adaptive to the distribution of its local point densities.To guarantee that the patches can be finally stitched in a T-spline space, the adaptively segmented patches must be topologically square in a parametric domain.In contrast, the patch sizes can vary, and T-junctions are allowed at their boundaries.
To achieve the highest efficiency, the patch size should be moderate.A too-large patch size results in the LAST fitting deteriorating to a brute-force global fitting; a toosmall one may result in the algorithm deteriorating to a bottom-up split-connect fitting with a uniform initial patch split.A numerical analysis regarding the influence of patch segmentation amount on computational time cost and fitting RMSE was carried out, as shown in Figure 7e.It shows that in the early period of patch number increasing, e.g., <512 in this example, the computational time cost reduces until an optimal value is gradually achieved.Simultaneously, the final fitted T-spline RMSE increases gradually.In the larger patch number cases, the time cost grows rapidly, and fitting RMSE significantly decreases simultaneously.This is because the CP number in the latter cases approached the point cloud size, and a quasi-interpolation effect was obtained using a T-spline model.In this study, we used a patch size of 2000 points per patch, e.g., 32 patches in total, as shown in Figure 7a, to conduct the patch segmentation.

Locally Accelerated T-Spline Fitting
Each local point-set patch within a sub-parametric space D k ∈ D is then individually fitted to a T-spline through a locally accelerated fitting computation.The algorithm is similar to that proposed in previous studies [34,35] but with a few modifications, including mesh initialization and fitting equations.
This process indicates that a T-mesh is iteratively refined in localized areas during a local patch-fitting process, and CP calculation is accelerated by updating only the corresponding local CPs.These local behaviors bring the following benefits: (1) The local refinement behavior of a T-spline ensures control meshes are concise and adaptive to local feature complexities of measured geometry.Thus, limited unknown CPs are introduced during each iteration.(2) The local CP update behavior provides a further dimensional reduction of T-spline design matrix in rows and, thus, a further computational speed boost, according to the partial computation principle as in Equation ( 5). ( 3) By involving only the local points for CP update calculation, a further acceleration can be achieved due to a dimensional reduction of the design matrix in rows again.
The local T-spline fitting algorithm consists of the following four minor steps following the flow chart in Figure 4.The last three steps are iteratively applied until all measured points meet a pre-set tolerance.

T-spline initialization
As shown in Figure 7c, a simple square grid consisting of four CPs overlaid on F is used as the initial T-mesh of fitting.Boundary extensions [34] are neglected here to ensure patch boundaries are able to be stitched later without topological hindrances.Other tensor-product B-spline grids can also be used in this step for speed optimization.The initial spline can be easily fitted regarding its CPs using Equation (4), by providing the local measured sample points.

Tolerance check
To improve fitting accuracy, a T-mesh is iteratively refined to fit an input point cloud with a desired tolerance σ.For this purpose, a metric for fitting error evaluation, e.g., max, average, and root-mean-square Euclidean distances between the fitted points Q from the measured points Q, are calculated.Then the violating points with a fitting error larger than σ and their located mesh patches, e.g., the yellow element in Figure 7c, are identified for further local refinement.If no violating points are found, the iteration stops.In this study, the maximum Euclidean distance from a fitted spline to measured points is used as the threshold metric for the tolerance check.

Local refinement
Violating T-mesh patches are split in a dimidiate way for local T-mesh refinement in this step.As presented in Figure 7c, if a rectangle element is split, two associated new CPs are inserted simultaneously.The splitting direction is selected along the longer edge, so a T-mesh retains isotropy during refinement.The dimidiate mechanism and initial rectangle B-spline grids guarantee that all refined patches are always rectangles.To avoid rank-deficiency in later fitting calculations, refinement is only implemented at the patches with more than a specified number (e.g., eight in the study) of data points.

Local fitting
A refined T-spline on D k is then locally fitted regarding the newly inserted and supportaltered CPs.Along with the insertion of a splitting edge with two new CPs, the blending functions of their surrounding CPs (see Figure 7c, for example) are then altered.This indicates that the blending matrix B, after a local refinement, extends by several new columns, and a limited number of its original columns change.Following the convention in Section 2.3 by indexing the support-alter and newly inserted CPs using L and the remaining CPs using N-L, Equation ( 5) can be adapted to the general T-splines of Equation (1) in the following form: where M − denotes an attending subset of measured points, e.g., in the affected elements of Figure 7c, in the accelerated local fitting.The unchanged blending matrix part B M − (N−L) with its associated CPs P N−L can be simply inherited from the last iteration.Thus, if the blending matrix B M − L for the L-indexed CPs and the corresponding summations ∑ i b i (•) are calculated, P L can thus be efficiently estimated from the following small-scale least-squares problem:

Locally Optimized T-Spline Stitching
A global T-spline is obtained finally in the step by stitching all the individual patch T-splines.The global T-spline stitching includes a T-mesh stitching process and a subsequent global T-spline fitting.The T-mesh stitching is carried out by sharing the knots at the overlapping boundaries of two neighbor T-meshes.For example, as presented in Figure 7d, we assume that s 2 is a parametric position where a left T-mesh and a right one will stitch.The knot vector of the left mesh at s 2 is [• Nevertheless, it results in an accuracy loss of finally stitched T-splines due to a reduced total CP number [33].Extending a boundary CP to the inner edges of a neighboring T-mesh may mitigate the problem.
The set union operation also alters the knot vectors and the supports of boundary CPs.Re-fitting the CPs at the stitching boundaries and their nearby areas is thus needed to guarantee the resulted T-spline accuracy.For this problem, the locally accelerated fitting scheme described in Figure 7c and Equation ( 7) is used again for a speed concern, instead of updating all spline CPs crassly.However, it should be noted that since the boundary CPs have already been estimated, the local CP-solving problem here is further accelerated using PCG optimization [31] instead of using least squares, as in Section 3.2.Since the stitching process involves CP optimization around patch boundaries, a too-large initial patch segmentation should not be expected as it produces excessive CPs around patch boundaries.Figure 7e shows that the computational time cost in the extreme cases significantly increases because nearly all the CPs are close to patch boundaries and need to be re-fitted.

Experiments
The LAST fitting algorithm was tested regarding accuracy requirements and data scales, in comparison with a global, a local, and a split-connect fitting algorithm.The experiments were carried out on a Lenovo computer (Intel i7, 2.8 GHz CPU, and 32 GB memory) under a MATLAB 2020a environment without parallel acceleration.Typical surfaces, including a random stainless steel surface, periodically laser-patterned hard-disk surface [32], and classical RGB portrait of Lena, were analyzed.The former two are used to uncover the performance of LAST thoroughly.The Lena image is used as a benchmark for performance comparison against the state-of-the-art fast split-connect algorithm.to uncover the performance of LAST thoroughly.The Lena image is used as a benchmark for performance comparison against the state-of-the-art fast split-connect algorithm.

High-Accuracy Fitting
A fitting accuracy analysis of the stainless steel surface was carried out.Typical fitted T-meshes and corresponding fitting error maps of the analyzed algorithms with an equivalent fitting error are presented in Figure 8.It shows that the LAST fitting algorithm produced a T-spline and an error map similar to the global and local algorithms without generating excessive CPs around the patch boundaries.The complex T-spline model obtained from the split-connect algorithm has minor fitting errors in the central area of the parametric space but significant errors at the space boundaries.Conversely, LAST exhibits larger errors in regions with high gradients.All the fitted T-meshes have dense CP distributions around the geometrical features.x y q x y q x y q x y where the scalers (, ),  (, ) ∈ ℝ describing the heights of a z-map surface are onedimensional versions of  and  in Equations ( 1)-(4).Figure 9a,b show the typical fitting results of the laser-patterned surface and stainless steel surface.The adaptively fitted Tsplines present relatively uniform Euclidian distance point-to-point errors across the surfaces.Figure 9c,d indicate that for both the periodically patterned and randomly patterned surfaces, LAST fitting achieves excellent efficiency performance against the state-of-theart fitting algorithms.Specifically, in the low-accuracy fitting scenarios, LAST performs faster than the global and local algorithms but slightly worse than the split-connect algorithm.When the fitting accuracy requirement becomes higher, the split-connect algorithm gradually becomes inefficient because the final PCG optimization is time-consuming in large-scale cases.On the contrary, the LAST fitting gradually becomes the fastest algorithm in high-accuracy cases, thanks to the local support reservation in the fitting calculations.The efficiency performance in terms of fitting accuracy indexed by the signal-noise ratio of the test algorithms was then analyzed where the scalers q(x, y), q(x, y) ∈ R describing the heights of a z-map surface are onedimensional versions of q and q in Equations ( 1)-(4).Figure 9a,b show the typical fitting results of the laser-patterned surface and stainless steel surface.The adaptively fitted T-splines present relatively uniform Euclidian distance point-to-point errors across the surfaces.Figure 9c,d indicate that for both the periodically patterned and randomly patterned surfaces, LAST fitting achieves excellent efficiency performance against the state-of-the-art fitting algorithms.Specifically, in the low-accuracy fitting scenarios, LAST performs faster than the global and local algorithms but slightly worse than the splitconnect algorithm.When the fitting accuracy requirement becomes higher, the splitconnect algorithm gradually becomes inefficient because the final PCG optimization is time-consuming in large-scale cases.On the contrary, the LAST fitting gradually becomes

Lena Image Comparison
We follow the convention of the state-of-the-art research [33] to fit the 8-bit RGB Lena image using the proposed algorithm.Figure 11 (top) presents typical fitted T-meshes of the two algorithms with an equivalent RMSE of 0.03.The results show that the LAST fitting produces a significantly simpler T-mesh than the split-connect fitting using only 80% control points of the latter.An efficiency analysis against fitting accuracy and sample set sizes was conducted.The results in Figure 11 (bottom) show that LAST is about twice to even over fourfold faster than the split-connect algorithm in high-accuracy scenarios.

Lena Image Comparison
We follow the convention of the state-of-the-art research [33] to fit the 8-bit RGB Lena image using the proposed algorithm.Figure 11 (top) presents typical fitted T-meshes of the two algorithms with an equivalent RMSE of 0.03.The results show that the LAST fitting produces a significantly simpler T-mesh than the split-connect fitting using only 80% control points of the latter.An efficiency analysis against fitting accuracy and sample set sizes was conducted.The results in Figure 11 (bottom) show that LAST is about twice to even over fourfold faster than the split-connect algorithm in high-accuracy scenarios.

Conclusions and Future Work
In this paper, a locally accelerated stitching T-spline, known as LAST, fitting algorithm is proposed for fast, large-scale freeform point cloud fitting.The method first segments a large-scale input point cloud into small local patches.Then it fits each patch of data individually to a T-spline through a locally accelerated fitting scheme.Finally, the individually obtained T-splines are stitched to a global output using a local optimization scheme around patch boundaries.The local fit-and-stitch strategy breaks the efficiency bottleneck of the existing global and local fitting algorithms about data scales.LAST also results in concise T-mesh structures with a considerable balance between fitting efficiency and accuracy control by preserving the local support characteristics of CPs.
The efficiency performance of LAST has been validated through experiments compared to a global, a local, and a split-connect T-spline fitting algorithm.The results show that the computational speed is about two0to-eightfold faster than the three selected latest algorithms.In the classical Lena RGB image study, LAST is shown to be at least two times faster than the split-connect algorithm, using fewer than 80% CPs of the latter.The proposed method can also be applied with advanced T-spline models, such as AST-splines, and hierarchical T-splines, providing a corresponding mesh refinement rule.
The solution to the issue of larger errors in high-gradient regions for LAST will be explored and discussed in depth in our future work.Simultaneously, considering the

Conclusions and Future Work
In this paper, a locally accelerated stitching T-spline, known as LAST, fitting algorithm is proposed for fast, large-scale freeform point cloud fitting.The method first segments a large-scale input point cloud into small local patches.Then it fits each patch of data individually to a T-spline through a locally accelerated fitting scheme.Finally, the individually obtained T-splines are stitched to a global output using a local optimization scheme around patch boundaries.The local fit-and-stitch strategy breaks the efficiency bottleneck of the existing global and local fitting algorithms about data scales.LAST also results in concise T-mesh structures with a considerable balance between fitting efficiency and accuracy control by preserving the local support characteristics of CPs.
The efficiency performance of LAST has been validated through experiments compared to a global, a local, and a split-connect T-spline fitting algorithm.The results show that the computational speed is about two0to-eightfold faster than the three selected latest algorithms.In the classical Lena RGB image study, LAST is shown to be at least two times faster than the split-connect algorithm, using fewer than 80% CPs of the latter.The proposed method can also be applied with advanced T-spline models, such as AST-splines, and hierarchical T-splines, providing a corresponding mesh refinement rule.

Figure 1 .
Figure 1.The principle of T-splines.(a) T-spline blending functions; (b) the T-mesh and the knots (red squares) of the blending function for control point  corresponding to (a).

Figure 1 .
Figure 1.The principle of T-splines.(a) T-spline blending functions; (b) the T-mesh and the knots (red squares) of the blending function for control point P 1 corresponding to (a).

Figure 2 .
Figure 2. Flowcharts of the global T-spline fitting algorithms.

Figure 2 .
Figure 2. Flowcharts of the global T-spline fitting algorithms.

Figure 3 .
Figure 3. Performance analysis of typical T-spline fitting algorithms for a stainless steel complex surface.(a) Sparsity view (top) and corresponding reverse Cuthill-McKee ordering (bottom) of the transposed product of design matrices   during each iteration cycle in a global fitting scheme; (b) the stainless steel complex surface; (c,d) time consumption statistics and matrix inversion time costs for a global T-spline fitting calculation.

Figure 3 .
Figure 3. Performance analysis of typical T-spline fitting algorithms for a stainless steel complex surface.(a) Sparsity view (top) and corresponding reverse Cuthill-McKee ordering (bottom) of the transposed product of design matrices BT B during each iteration cycle in a global fitting scheme; (b) the stainless steel complex surface; (c,d) time consumption statistics and matrix inversion time costs for a global T-spline fitting calculation.

Figure 4 .
Figure 4. Flowcharts of the local T-spline fitting algorithms.

Figure 5 .
Figure 5. Performance analysis of typical T-spline fitting algorithms for a stainless steel complex surface from Figure 3b.(a) Resulting model CP statistics of different algorithms; (b) the fitting accuracy degeneration phenomenon in split-connect calculations.Dashed trend lines are fitted using moving averages.

Figure 4 .
Figure 4. Flowcharts of the local T-spline fitting algorithms.

Figure 4 .
Figure 4. Flowcharts of the local T-spline fitting algorithms.

Figure 5 .
Figure 5. Performance analysis of typical T-spline fitting algorithms for a stainless steel complex surface from Figure 3b.(a) Resulting model CP statistics of different algorithms; (b) the fitting accuracy degeneration phenomenon in split-connect calculations.Dashed trend lines are fitted using moving averages.

Figure 5 .
Figure 5. Performance analysis of typical T-spline fitting algorithms for a stainless steel complex surface from Figure 3b.(a) Resulting model CP statistics of different algorithms; (b) the fitting accuracy degeneration phenomenon in split-connect calculations.Dashed trend lines are fitted using moving averages.
connection: Connect independent spline patches to a T-mesh with a prescribed continuity according to actual data continuity (to avoid missing data problems) and infer the knot vectors of all CPs. 3.

Figure 5 .
Figure 5. Performance analysis of typical T-spline fitting algorithms for a stainless steel complex surface from Figure 3b.(a) Resulting model CP statistics of different algorithms; (b) the fitting accuracy degeneration phenomenon in split-connect calculations.Dashed trend lines are fitted using moving averages.

Figure 6 .
Figure 6.Flowcharts of the split-connect T-spline fitting algorithms.

Figure 6 .
Figure 6.Flowcharts of the split-connect T-spline fitting algorithms.

Figure 7 .
Figure 7. Diagram of LAST fitting algorithm.(a) Algorithm flow chart; (b) independent patch segmentation of a stainless steel surface topography image for local fit; (c) local dimidiate T-spline refinement from a four-CP initial square grid (bold boundaries) and fitting with localized CP topology analysis; (d) stitching at the boundaries of two neighbor T-meshes for locally accelerated final Tspline optimization; (e) computational time cost and fitting RMSE analysis regarding initially segmented patch numbers for the stainless steel surface.In (c), a T-mesh element in yellow is split, and two new CPs in red are inserted.Due to the local support characteristics, only the neighbor (blue) CP-associated blending functions alter.Fitting the local CPs using the local data in the affected grids in grey reduces the complexity of computation significantly; more information is available in the previous work [34].

Figure 7 .
Figure 7. Diagram of LAST fitting algorithm.(a) Algorithm flow chart; (b) independent patch segmentation of a stainless steel surface topography image for local fit; (c) local dimidiate T-spline refinement from a four-CP initial square grid (bold boundaries) and fitting with localized CP topology analysis; (d) stitching at the boundaries of two neighbor T-meshes for locally accelerated final T-spline optimization; (e) computational time cost and fitting RMSE analysis regarding initially segmented patch numbers for the stainless steel surface.In (c), a T-mesh element in yellow is split, and two new CPs in red are inserted.Due to the local support characteristics, only the neighbor (blue) CPassociated blending functions alter.Fitting the local CPs using the local data in the affected grids in grey reduces the complexity of computation significantly; more information is available in the previous work [34].

4. 1 .
High-Accuracy Fitting A fitting accuracy analysis of the stainless steel surface was carried out.Typical fitted Tmeshes and corresponding fitting error maps of the analyzed algorithms with an equivalent fitting error are presented in Figure 8.It shows that the LAST fitting algorithm produced a T-spline and an error map similar to the global and local algorithms without generating excessive CPs around the patch boundaries.The complex T-spline model obtained from the split-connect algorithm has minor fitting errors in the central area of the parametric space but significant errors at the space boundaries.Conversely, LAST exhibits larger errors in regions with high gradients.All the fitted T-meshes have dense CP distributions around the geometrical features.Sensors 2023, 23, x FOR PEER REVIEW 11 of 15

Figure 8 .
Figure 8. Fitted T-meshes (top) with an RMSE of ~0.02 µm and corresponding fitting error maps (bottom) for a stainless steel surface.Each column from left to right denotes the global, local, splitconnect, LAST fitting result, and the 256 × 256-pixel input map, respectively.The efficiency performance in terms of fitting accuracy indexed by the signal-noise ratio of the test algorithms was then analyzed ( ) ( ) ( ) 2 1 1 10 2 1 1 , SNR 10 log , , x

Figure 8 .
Figure 8. Fitted T-meshes (top) with an RMSE of ~0.02 µm and corresponding fitting error maps (bottom) for a stainless steel surface.Each column from left to right denotes the global, local, splitconnect, LAST fitting result, and the 256 × 256-pixel input map, respectively.

Figure 11 .
Figure 11.Lena image fitting analysis.Top: Each column from left to right represents the input image, the fitted image, and corresponding 32,140-CP T-mesh of the split-connect fitting, the fitted image, and corresponding 24,908-CP T-mesh of the LAST fitting.Bottom: fitting efficiency analysis against accuracies with a data scale of 512  512 pixels (a), and analysis against image resolutions with a fitting RMSE of 0.022 (b).Dashed trend lines are fitted using moving averages.

Figure 11 .
Figure 11.Lena image fitting analysis.Top: Each column from left to right represents the input image, the fitted image, and corresponding 32,140-CP T-mesh of the split-connect fitting, the fitted image, and corresponding 24,908-CP T-mesh of the LAST fitting.Bottom: fitting efficiency analysis against accuracies with a data scale of 512 × 512 pixels (a), and analysis against image resolutions with a fitting RMSE of 0.022 (b).Dashed trend lines are fitted using moving averages.