Multi-Fidelity Information Fusion to Model the Position-Dependent Modal Properties of Milling Robots

Robotic machining is a promising technology for post-processing large additively manufactured parts. However, the applicability and efficiency of robot-based machining processes are restricted by dynamic instabilities (e.g., due to external excitation or regenerative chatter). To prevent such instabilities, the pose-dependent structural dynamics of the robot must be accurately modeled. To do so, a novel data-driven information fusion approach is proposed: the spatial behavior of the robot’s modal parameters is modeled in a horizontal plane using probabilistic machine learning techniques. A probabilistic formulation allows an estimation of the model uncertainties as well, which increases the model reliability and robustness. To increase the predictive performance, an information fusion scheme is leveraged: information from a rigid body model of the fundamental behavior of the robot’s structural dynamics is fused with a limited number of estimated modal properties from experimental modal analysis. The results indicate that such an approach enables a user-friendly and efficient modeling method and provides reliable predictions of the directional robot dynamics within a large modeling domain.


Introduction
To achieve the climate targets set by the European Commission, formulated in the The European Green Deal in 2019 [1], modern production processes must continuously become more energy and resource-efficient. In the aerospace industry especially, where the production of large structural components is resource-intensive, additive manufacturing processes promise high resource efficiency [2]. Nevertheless, post-processing of the workpieces, for example by milling, is still required to achieve high geometrical accuracy [3].
In such scenarios, machining robots are the ideal platforms to follow the additive manufacturing processes in the process chain, because they offer a large workspace and at the same time low investment costs compared to conventional machine tools [4,5]. However, their low dynamic stiffness continues to limit their industrial application [6][7][8]. On the one hand, forced vibrations (e.g., when a tooth passing frequency coincides with a robot's natural frequency) can cause significant damage to the tool and spindle components. On the other hand, self-excited regenerative chatter can limit the depth of cut and therefore reduce the robot's efficiency because of a limited material removal rate. Additionally, unstable machining processes result in poor surface quality of the workpiece.
Hence, modeling the (pose-dependent) vibrational properties of machining robots is of high interest. In this work, a novel data-driven approach for modeling the pose-dependent dynamics of milling robots is presented.

Related Work
In recent years, various methods for modeling the pose-dependent vibrational properties of machining robots have been proposed. Roughly, the proposed approaches can be divided into classical physics-based methods and data-driven algorithms:

Physics-based methods
Physics-based models are based on classical mechanistic modeling techniques, such as • conventional rigid-body models [9], • rigid-body models with additional virtual joints [10] or • finite element models [11].
Mostly, these models can be formulated in the following form: The model's generalized coordinates q represent the rotational degrees of freedom of the robot joints. The pose-dependent mass matrix M consists of the robot bodies' mass and inertia properties, while the system's damping and stiffness matrices C and K are usually considered to be pose-independent. Lastly, τ are the resulting torques around the model's degrees of freedom.
As presented by Huynh et al. [9], the modeling accuracy of the vibrational behavior can be significantly improved, if q not only includes the actuated degrees of freedom around the actuated axes, but also tilting rotations (resulting in three rotational degrees of freedom for each physical robot joint). However, the parameter identification process for the model parameters in M(q), K and C remains an open issue, because conventional parameter identification procedures often result in physically infeasible parameters and, thus, should only be treated as fitting parameters [9]. Based on the popular approach of Dumas et al. [12], Busch et al. [13] presented a methodology to infer physically feasible compliance parameters using a Bayesian regression approach. Additionally, Huynh et al. [9] proposed a methodology to estimate the model's mass and inertia properties.
However, all physics-based approaches rely on complex experimental designs, costintensive measurement technology (such as dynamometers and laser trackers), and mathematically advanced identification schemes to estimate the pose-dependent dynamics accurately, which prevent industrial applications.

Data-driven algorithms
In contrast to physics-based methods, recent works aimed at deriving the posedependent dynamics of machining robots solely from data. The use of modern machine learning algorithms promises a simpler and more efficient system modeling approach [14]. Furthermore, when probabilistic algorithms such as Gaussian process regression techniques are used, the resulting data-driven models can take reliable uncertainty measures into account. For example, Nguyen et al. [14] proposed a data-driven approach using Gaussian process regression to model the robot's position-dependent modal properties (natural frequency, damping ratio and dynamic stiffness). Based on such a model, an active vibration suppression controller was developed [15].
However, conventional data-driven modeling techniques usually require expert knowledge during the algorithm setup. Typically for conventional Gaussian process models, the underlying covariance function and the noise model must be carefully chosen and designed based on domain knowledge. If the algorithm's hyperparameters (such as the covariance function) are not well designed, the data-driven models cannot predict the robot's dynamics accurately.

Motivation
Summarizing, a precise and yet efficient modeling technique for the pose-dependent vibrational dynamics remains an open issue, even though multiple approaches provide methods to model the dynamics with different levels of fidelity and model efficiency.
As shown in previous works, the problem at hand motivates the application of socalled multi-fidelity information fusion schemes [16]. Such algorithms are especially appealing, when multiple information sources with different levels of information quality are available [17,18]. Typically for general dynamical systems, there are multiple approaches for modeling a system's dynamic behavior, each with a different level of fidelity. Information fusion schemes can take advantage of the linear or even nonlinear relationship between the two or more information sources. Hence, these algorithms can reduce the burden of evaluating the high fidelity model for all system states of interest, by exploiting the fact that the low-fidelity model already contains a rough estimate.
Regarding the given issue, analytical structural models of an industrial machining robot can provide essential information on the underlying system dynamics (even though their accuracy in respect of the positional vibration properties might be limited), while a limited number of experimental data samples will provide precise information on the actual system behavior. An information fusion scheme allows to model the spatial system dynamics by fusing both information sources [16]. However, the previously proposed method is based on frequency domain data. As Gaussian process regression algorithms scale poorly with the number of data points, the proposed methodology was limited by the number of data samples.
The idea of fusing this kind of information was recently adopted by Liu and Altintas [19], who proposed a transfer learning scheme using artificial neural networks to model the pose-dependent dynamics of conventional machine tools.

Scope and Approach
Based on the works of Nguyen et al. [14] and Busch et al. [16], a novel methodology for modeling the position-dependent modal properties of a machining robot is proposed in the following sections: based on a multi-fidelity scheme, a precise model of the robot's vibrational properties is developed. An existing, but imprecise analytical robot model provides low-fidelity information, whereas experimental modal analysis (EMA) experiments provide accurate high-fidelity data.
As the spatial (i.e., position-dependent) model of the robot's natural frequencies is the key component for predicting whether a machining process will be stable or to suppress vibrations using active vibration control, the natural frequencies are the key focus of this paper and are therefore considered as a primary feature.
To evaluate the vibrational behavior at any location in the workspace, the damping ratios and mode shape vectors (if physically scaled, these are also called residues) are also of interest, for example to reconstruct the position-based frequency response function. Hence, those two features are considered to be secondary features.
The paper at hand is structured as follows: the methodology is described in detail in Section 2. The results are presented in Section 3. Following that, the results and their implications are discussed in Section 4, while conclusions of this study are presented in Section 5.

Methods
The presented methodology aims at generating probabilistic models of the vibrational properties of machining robots. For this purpose, the spatial vibrational behavior of such a machining robot is represented by the following three modal properties of the robot's structure, depending on the workspace position in a horizontal plane (x := [x, y, z = const.] T ): • the natural frequencies ω i (x) for each vibration mode i = 1, . . . , M of the robot structure, • the damping ratios ξ i (x) for each vibration mode i = 1, . . . , M of the robot structure and • the residues R i,d (x) for each vibration mode i = 1, . . . , M and in each mode direction d ∈ {xx, xy, xz, yx, yy, yz, zx, zy, zz}.
To model the spatial behavior of the robot's natural frequencies accurately, two probabilistic information fusion schemes are presented and benchmarked against a conventional approach.
Therefore, the outline of the methodology is as follows: 1. Data generation: first, the modal parameters are derived from the analytical model and gathered experimentally at the robot (see Section 2.1).

2.
Data preparation: the data sets are divided into training and testing data sets (see Section 2.2).

3.
Model setup and training: the spatial behavior of the vibrational features is modeled as follows (see Section 2.3): (a) The primary features (natural frequencies) are modeled using multi-fidelity information fusion approaches (see Section 2.3.1).
The secondary features (damping ratios and residues) are modeled using conventional Gaussian process regression techniques (see Sections 2.3.2 and 2.3.3).
The methodology is illustrated in Figure 1.
Step Step 2: Data preparation Step 3: Model training

Data Generation: Spatial Modal Parameter Identification
All of the following investigations and experiments were conducted using a KUKA KR240 R2500 PRIME robot. The robot is equipped with a high-speed milling spindle from Helmuth Diebold GmbH & Co. (type HSG-E 198.18-38 AK1). All the following coordinates are referenced to the robot's base coordinate system.
To obtain experimental training and testing data, impact hammer tests were conducted for 35 robot configurations in the x-y-plane at a constant height z of 1.24 m with a constant TCP orientation, followed by the estimation of the modal properties using the LEAST-SQUARES COMPLEX FREQUENCY DOMAIN (LSCF) algorithm [20]. The measurement locations are illustrated in Figure 2. The coordinates are given in the Appendix A (see Table A1).  A KISTLER impact hammer (type 9728A20000, force range up to 20 kN) and a KISTLER 3D-accelerometer (type 8762A50) were used. The accelerometer was placed on the spindle. The excitations created by the impact hammer were made in each spatial direction (x-, y-and z-direction) and performed 10 times each to reduce the measurement noise and nonlinear effects. As described by Puzik [21] and Huynh et al. [9], the effect of the active motor controller is negligible. Therefore, the measurements were performed with brakes engaged.  shown. Illustrations of the cross residuals R 1,xy and R 1,yx are given in the appendix (see 183 Figure A1). To visualize the spatial behavior of the robot's vibrational properties, the modal parameters of the first vibration mode are illustrated in Figure 4. For simplicity, only the spatial behavior of the direct residues in the x-and y-directions, R xx and R yy , are shown. Illustrations of the cross residuals R 1,xy and R 1,yx are given in the Appendix B (see Figure A1).
As illustrated, the natural frequency ω 1 changes continuously depending on the tool center point's position: ω 1 decreases, as the tool center point (TCP) moves away from the robot's base.
Roughly, the damping ratio ξ 1 increases with increasing distance of the TCP to the robot's base, even though clear spatial behavior is not apparent in the data.
The imaginary part of R 1,xx decreases approximately linearly in the y-direction and is approximately constant in the x-direction, while the imaginary part of R 1,yy decreases with increasing x-direction but increases with increasing y-direction. The real parts of R 1,xx and R 1,yy are significantly smaller than their respective imaginary parts (indicating a rather lightly damped structure), but also significantly noisier. Roughly, the damping ratio ξ 1 increases with increasing distance of the TCP to the 188 robot's base, even though clear spatial behavior is not apparent in the data.

189
The imaginary part of R 1,xx decreases approximately linearly in the y-direction and To train and test the probabilistic models, the split into training and testing data 196 must be carefully chosen: the training data must include as much information on the 197 spatial behavior as possible and enough testing data points must be included to ensure 198 the validity of the predicted model behavior.

200
The sampling methodology applied in this study involves two steps:

201
In a first step, the general validity of the approach is demonstrated using 15 training 202 data points N train (and 20 testing data points N test , respectively). In the following section, 203 this training and testing split is referred to as sampling strategy A.

Data Preparation: Sampling Methodology
To train and test the probabilistic models, the split into training and testing data must be carefully chosen: the training data must include as much information on the spatial behavior as possible and enough testing data points must be included to ensure the validity of the predicted model behavior.
The sampling methodology applied in this study involves two steps: In a first step, the general validity of the approach is demonstrated using 15 training data points N train (and 20 testing data points N test , respectively). In the following section, this training and testing split is referred to as sampling strategy A.
To do so, conventional sampling techniques, such as Latin-Hypercube-Sampling (LHS) or Sobol-Sampling, can be used. In the following, the application of an LHS approach is described: first, the division into D train and D test is performed using an adapted LHS technique in the robot's workspace (i.e., an LHS approach and maximizing the minimum distance between all points).
In a second step, the sensitivity of the model accuracy to the number of training data points is investigated. Hence, this will investigate how an increasing number of training data points improves the modeling accuracy. The following sampling approach is referred to as sampling strategy B N train .
Depending on the desired number of data points in the actual training data set, random samples are taken iteratively from D train . Hence, the actually used training data set is sampled in each iteration into D train, used and D train, unused , conditional upon D train, used ∪ D train, unused = D train . ( To ensure the comparability between the results, the following principles are taken into account for sampling the training data set in each iteration: • The testing data set D test remains the same for all investigations. • The actually used training data points D train, used are subsampled from the original training data set D train . Figure 5 illustrates the training and testing locations for the sampling strategy A and for 3, 5, 7, and 10 training data points in the sampling strategy B. It is worth noting, that for 15 training data points, the sampling strategy B 15 equals the sampling strategy A. adapted LHS technique in the robot's workspace (i.e., an LHS approach and maximizing 208 the minimum distance between all points).

210
In a second step, the sensitivity of the model accuracy to the number of training 211 data points is investigated. Hence, this will investigate how an increasing number of 212 training data points improves the modeling accuracy. The following sampling approach 213 is referred to as sampling strategy B N train .

214
Depending on the desired number of data points in the actual training data set, random samples are taken iteratively from D train . Hence, the actually used training data set is sampled in each iteration into D train, used and D train, unused , conditional upon D train, used ∪ D train, unused = D train . (2) To ensure the comparability between the results, the following principles are taken 215 into account for sampling the training data set in each iteration: The testing data set D test remains the same for all investigations.   As presented in section 1, the concept of multi-fidelity information fusion shall be transferred to modeling of the spatial behavior of the robot's natural frequencies: the low fidelity information is provided by the rigid body model of the robot. Based on the model's mass and stiffness matrices M(q) and K, the position-dependent natural frequenciesω i are the eigenvalues of the system's eigenvalue problem [22]: (3)

Model Setup and Training: Data Driven Modeling of the Modal Properties
The following three sections present new approaches to model the spatial behavior of the robot's • natural frequencies (Section 2.3.1), • damping ratios (Section 2.3.2) and • residues (Section 2.3.3).

Primary Feature: Natural Frequencies
As presented in Section 1, the concept of multi-fidelity information fusion shall be transferred to modeling of the spatial behavior of the robot's natural frequencies: the lowfidelity information is provided by the rigid body model of the robot. Based on the model's mass and stiffness matrices M(q) and K, the position-dependent natural frequenciesω i are the eigenvalues of the system's eigenvalue problem [22]: Instead of a computer model with higher fidelity, the high-fidelity information is represented by the experimentally estimated natural frequencies of the robot's structure at different workspace positions (see Section 2.1).
In this respect, Figure 6 illustrates the spatial behavior of the measured natural frequencies in comparison to the estimated natural frequencies by a rigid body simulation.
For simplicity, the rigid body model is equivalent to the model published in [16]. The definition of the rigid body model and its parameters are given in the Appendix C (see Figure A2). The mass and inertia properties of the bodies were estimated using the CAD model of robot, similar to the approach of Reinl et al. [23]. The joint stiffness values were taken from the previous works by Rösch [7]. Additionally, illustrations of the first four mode shapes are given in the Appendix D as well (see Figure A3).  Figure 6. Spatial behavior of the first natural frequency as measured (column a)) and as predicted by the simulation (column b)). There is a linear dependency between the simulation results and the measurements (column c)).
Furthermore, the data is normalized before the training procedure. The hyperpa-262 rameters of each model are optimized during training using five optimization runs. 263 264 Figure 6. Spatial behavior of first natural frequency as measured (column a)) and predicted by simulation (column b)). There is a linear dependency between simulation results and measurements (column c)).
As shown, a rather linear relationship exists between the rigid body simulation (ω 1 ) and the experimentally estimated natural frequency (ω 1 ). However, depending on the accuracy of the rigid body model, this relationship is not necessarily linear.
In the following, two approaches to inferring the relationship between the low fidelity simulation and the high fidelity measurement data are used and compared: the LINEAR AUTO-REGRESSIVE SCHEME (AR1), which was proposed by Kennedy and O'Hagan [24] can estimate a linear relationship, whereas the NONLINEAR AUTO-REGRESSIVE MULTI-FIDELITY GP (NARGP) by Perdikaris et al. [25] is even able to model nonlinear relationships. Both approaches rely on Gaussian process regression techniques, which incorporate prior beliefs about the underlying system dynamics by choosing appropriate kernel functions. Furthermore, since Gaussian processes are Bayesian machine learning techniques, they provide reliable uncertainty estimates of their predictions.
To illustrate the strengths of the proposed approach, the two multi-fidelity algorithms are benchmarked against regular Gaussian process techniques using only the high fidelity measurement data (abbreviated below as HFGP). Following the kernel definitions of Rasmussen and Williams [26], different kernel design choices for a regular Gaussian process setup are evaluated: • a linear kernel: a linear kernel makes it possible to incorporate a linear spatial model using the variance σ: • a quadratic kernel: a quadratic kernel allows more flexibility than a simple linear kernel and is represented by • a cubic kernel: similar to the generation of a quadratic kernel, the idea of a polynomial kernel can be extended to a cubic kernel, given by • a radial basis function (RBF) kernel: the conventional RBF kernel is a very popular kernel, as no assumption on the data's structure is incorporated. However, an RBF kernel may be prone to overfitting. The kernel is defined as where σ is the variance and l is the lengthscale of the RBF kernel.
Furthermore, the data are normalized before the training procedure. The hyperparameters of each model are optimized during training using five optimization runs.
The three approaches (HFGP, AR1 and NARGP) are validated and benchmarked against each other in Section 3.1.

Secondary Feature: Damping Ratios
Numerous research publications relating to modeling of the spatial dynamic behavior of machining systems indicate that modeling the damping effects of robots and machine tools is inherently difficult: as shown by Nguyen et al., the damping ratios exhibit the least "correlation with position compared to natural frequencies and stiffness" [14]. Furthermore, a clear spatial behavior is not evident from the measurements as illustrated in Section 2.1. Hence, in the context of Gaussian process regression techniques, assuming no spatial behavior, the data distribution can simply be modeled by a constant bias kernel. Such a kernel is given by In the event that a spatial behavior of the damping ratios is to be modeled, the previously mentioned kernels K linear , K quadratic , K cubic and K RBF can be chosen. Therefore, the performance of a constant bias kernel and of the spatial modelling kernels K linear , K quadratic , K cubic and K RBF will be evaluated and discussed in Section 3.2.
The hyperparameters are derived during training. Analogously to the natural frequencies, the data are normalized before training and the training consists of five optimization runs.

Secondary Feature: Residues
To predict the dynamic response of the robot structure accurately, information on the response amplitudes, represented by the structure's mode shapes or residues, is also of interest. Both mode shapes and residues represent the same physical property of a dynamic vibration system. Their relationship can best be explained by considering the reconstruction of a frequency response function H d (jω) in direction d using the system's natural frequencies ω i and their corresponding damping decay rates s i and mode shapes Ψ i,d : with λ i are the complex poles of the system. The scaling of the mode shapes using the scaling factor Q i,d yields the residues R i,d : Hence, the residues represent the scaled vibration modes and can be derived directly from the LSCF algorithm during the modal parameter identification process (see Section 2.1).
Similar to the damping ratios, the spatial behavior of the robot's modal residues can also be modeled using conventional Gaussian process regression. However, since residues are generally complex-valued, each residue is modeled using two independent Gaussian processes. Again, the performance of the presented kernels K bias , K linear , K quadratic , K cubic and K RBF is compared in Section 3.3. Analogously to the natural frequencies and damping ratios, the hyperparameters are derived during training. The data are normalized before training and the training consists of five optimization runs.

Results
In the following three Sections 3.1-3.3, the modeling approaches for three modal parameters are individually validated. Then in Section 3.4, the performance of the presented spatial modeling approach is highlighted using the reconstruction capabilities of positionbased frequency response functions. Implementation details are given in Section 3.5 for future reproducibility of the methodology.

Primary Feature: Natural Frequencies
As presented in Section 2.2, the approach to validating the spatial modeling of the natural frequencies proceeds in two steps: first, the different algorithms are benchmarked against each other in Section 3.1.1. Second, the accuracies of the three best-performing algorithms are assessed in Section 3.1.2, based on a varying number of training data points.

Validity of the Approach
To benchmark the three presented algorithms against each other, the 35 measurements are split into 15 training data points and 20 testing data points using the optimized LHS algorithm (see Section 2.2).
The performance of the four kernels for the Gaussian process regression and the two multi-fidelity algorithms (see Section 2.3.1) are rated according to the achieved coefficient of determination (R 2 ) and the root-mean-square error (RMSE), based on each model's prediction of the testing data set. The results are illustrated separately in Figure 7 for each modeled natural frequency.

Validity of the approach 307
To benchmark the three presented algorithms against each other, the 35 measure-308 ments are split into 15 training data points and 20 testing data points using the optimized 309 LHS algorithm (see section 2.2). It is apparent that the spatial behavior of the natural frequencies cannot be modeled adequately using only the high-fidelity data in combination with a linear, quadratic, or cubic kernel: only low coefficients of determination are obtained, and the root-mean-square errors are large. In contrast, the conventional RBF kernel and the two multi-fidelity schemes yield comparably good results: the achieved coefficient of determination is mostly higher than 0.9, and their root-mean-square errors are significantly lower in comparison to that of the linear, quadratic, or cubic kernels.
Hence, all three modeling approaches (HFGP-RBF, AR1 and NARGP) can adequately capture the spatial behavior of the natural frequencies, provided enough training data are available throughout the workspace of interest.

Accuracy Using an Increasing Number of Training Data Points
The performance of the conventional Gaussian process with an RBF kernel, the linear AR1 model and the nonlinear NARGP model can be further benchmarked regarding the prediction accuracy with an iteratively increasing number of measurements. As described in Section 2.2, an LHS-based sampling strategy was used for 3 to 15 measurements. Again, the model performance is evaluated using the achieved coefficient of determination R 2 and the resulting root-mean-square error RMSE. The results are illustrated in Figure 8. For a better comparison, a dashed horizontal line indicates the threshold R 2 = 0.9.
It appears that both multi-fidelity information fusion algorithms perform notably better than the conventional Gaussian process with the RBF kernel, when only a small number of measurements is available.
However, the performance of the NARGP model is fluctuating, whereas the AR1 approaches R 2 = 1 and RMSE = 0 Hz more quickly than the NARGP model with an increasing number of training points N train . This finding is supported by the assumption of a rather linear relationship between the rigid body model and the experimental data as indicated in Section 2.3.1. Hence, the linear multi-fidelity scheme AR1 enables a more efficient estimation of the spatial behavior of the robot's eigenfrequencies, even when only a small number of training data points is available.
ing the prediction accuracy with an iteratively increasing number of measurements. As Lastly, since all models are based on Gaussian process regression techniques, they provide methods to evaluate the prediction uncertainty. To highlight the strength of the proposed multi-fidelity modeling scheme AR1, Figure 9 illustrates the model prediction and the 95 % confidence interval for the second natural frequency ω 2 for only N train = 3 measurements used for training (i.e., the sampling strategy B 3 ). The AR1 algorithm models the spatial behavior of ω 2 accurately and provides reliable uncertainty boundaries, whereas the predictions of the conventional Gaussian process using the RBF kernel are inaccurate and do not even provide reliable prediction intervals (the model predictions are overconfident).
of the proposed multi-fidelity modelling scheme AR1, Figure 9 illustrates the model 350 prediction and the 95 % confidence interval for the second natural frequency ω 2 for only 351 N train = 3 measurements used for training (i.e., the sampling strategy B 3 ). The AR1 352 algorithm models the spatial behavior of ω 2 accurately and provides reliable uncertainty  It is apparent that the spatial modeling approach of the damping ratios performs 366 significantly worse than the spatial modeling of the natural frequencies: the coefficient 367 Figure 9. Comparison of prediction accuracy on testing data of ω 2 using sampling strategy B 3 , including model's predictive uncertainty in form of the 95% prediction interval.

Secondary Feature: Damping Ratios
As explained in Section 2.3.2, regular Gaussian process models were set up to model the damping ratios with different kernel choices, namely K bias , K linear , K quadratic , K cubic and K RBF .
Analogously to the analysis described in Section 3.1.1, the accuracy of the model prediction at the 20 testing data points using the 15 training data points is illustrated in Figure 10 using the coefficient of determination R 2 and the root-mean-square error RMSE. N train = 3 measurements used for training (i.e., the sampling strategy B 3 ). The AR1 352 algorithm models the spatial behavior of ω 2 accurately and provides reliable uncertainty  It is apparent that the spatial modeling approach of the damping ratios performs 366 significantly worse than the spatial modeling of the natural frequencies: the coefficient 367 Figure 10. Benchmark between different kernel designs for modeling damping ratios.
It is apparent that the spatial modeling approach of the damping ratios performs significantly worse than the spatial modeling of the natural frequencies: the coefficient of determination indicates that none of the kernels can model the spatial behavior accurately, even though the model using the RBF kernel seems to capture the underlying spatial behavior best.
However, it is crucial to remember that the measured damping ratios themselves do not indicate a clear spatial behavior due to the inherent nonlinear damping dynamics, so every modeling approach will perform poorly with such noisy training data. Due to this fact, use of the RBF kernel is not advised, because the modeling performance can depend strongly on the measurement positions and the modeled modes. For example, the RBF kernel seems to model the spatial behavior of ξ 1 and ξ 2 significantly better than the other kernels, whereas the RMSE for ξ 3 and ξ 4 is similar for all five kernels. Therefore, use of the cubic kernel is recommended for modeling of the spatial behavior of the damping ratios to prevent overfitting and is better suited for uncertainty quantification than for precise spatial modeling.

Secondary Feature: Residues
Analogously to the kernel selection process for the damping ratios, the five kernels (K bias , K linear , K quadratic , K cubic and K RBF ) are benchmarked against each other to evaluate their performance in modeling the spatial behavior of the residues R i,d . As described in Section 2.3.3, each complex residue R i,d is modeled using two independent Gaussian processes, one for the real and one for the imaginary part. The resulting RMSEs for the real and imaginary parts of R i,xx and R i,yy are depicted in Figure 11. The resulting R 2 values for R i,xx and R i,yy are given in the Appendix E (see Figure A4).
The spatial behavior of the residues is best modeled using the RBF kernel, especially for the imaginary parts of the residues. It is assumed that the RBF kernel outperforms the other kernels, since its properties allow a more flexible approximation of the underlying system behavior and do not restrict the model behavior (unlike a linear kernel, for example). Hence, it is recommended to use an RBF kernel to model the spatial properties of the mode's residues. the cubic kernel is recommended for modeling of the spatial behavior of the damping 378 ratios to prevent overfitting and is better suited for uncertainty quantification than for 379 precise spatial modeling. Analogously to the kernel selection process for the damping ratios, the five kernels 383 (K bias , K linear , K quadratic , K cubic and K RBF ) are benchmarked against each other to evaluate 384 their performance in modeling the spatial behavior of the residues R i,d . As described in 385 section 2.3.3, each complex residue R i,d is modeled using two independent Gaussian 386 processes, one for the real and one for the imaginary part. The resulting RMSEs for the 387 real and imaginary parts of R i,xx and R i,yy are depicted in Figure 11. The resulting R 2 388 values for R i,xx and R i,yy are given in the appendix (see Figure A4).

389
It can be seen that the spatial behavior of the residues is best modeled using the 390 RBF kernel, especially for the imaginary parts of the residues. It is assumed that the 391 RBF kernel outperforms the other kernels, since its properties allow a more flexible 392 approximation of the underlying system behavior and do not restrict the model behavior 393 (unlike a linear kernel, for example). Hence, it is recommended to use an RBF kernel to 394 model the spatial properties of the mode's residues.  Figure 11. Benchmarks of different kernel designs for modeling the spatial behavior of R i,xx and R i,yy . Figure 11. Benchmarks of different kernel designs for modeling spatial behavior of R i,xx and R i,yy .

Reconstruction of Frequency Response Functions
As presented in Section 2 and validated in Sections 3.1-3.3, the spatial properties of the robot dynamics, namely its natural frequencies, its damping ratios and its mode shapes, are modeled using probabilistic machine learning techniques. The strengths of the presented methodology can be illustrated using the reconstruction of the probabilistic frequency response function from the probabilistically modeled modal parameters (see Figure 12).
To do so, a conventional quasi-Monte Carlo (QMC) algorithm was utilized to estimate the resulting uncertainty in the reconstructed frequency response functions based on the uncertainty in the model predictions for the modal parameters at x T = [1.175 m, 1.0 m, 1.24 m] T . This point was not part of the training data. Detailed information on Monte Carlo and quasi-Monte Carlo techniques can be found in [27]. The random sampling during the QMC computation follow the Saltelli scheme (see [28] for detailed information).
The sampling strategy B 5 was chosen, so only five measurements in the workspace were taken into account during training (see Figure 5). During the Monte Carlo simulation, the directional frequency response functions were calculated 10, 000 times, based on random samples from the predicted probability distributions of ω i , ξ i and R i,d . The modeling approach is able to take asymmetric frequency response behavior into account (e.g. H xy = H yx ), as the directional residuals are trained independently of each other.

Reconstruction of frequency response functions 396
As presented in section 2 and validated in sections 3.1, 3.2 and 3.3, the spatial 397 properties of the robot dynamics, namely its natural frequencies, its damping ratios and 398 its mode shapes, are modeled using probabilistic machine learning techniques. The 399 strengths of the presented methodology can be illustrated using the reconstruction of 400 the probabilistic frequency response function from the probabilistically modeled modal 401 parameters (see Figure 12). information on Monte Carlo and quasi-Monte Carlo techniques can be found in [27].

407
The random sampling during the QMC computation follow the Saltelli scheme (see [28] 408 for detailed information).

409
The sampling strategy B 5 was chosen, so only five measurements in the workspace 410 were taken into account during training (see Figure 5). During the Monte Carlo simula-411 tion, the directional frequency response functions were calculated 10000 times, based on 412 random samples from the predicted probability distributions of ω i , ξ i and R i,d .

413
It is worth pointing out that the modeling approach is able to take asymmetric

Implementation Details
All computations were performed in Matlab, Python, and C++. Table 1 summarizes the key software packages used. The model training computations and the QMC simulations were performed on a cloud server running with 10 cores (Intel(R) Xeon(R) Gold 6148 CPU @ 2.40 GHz) and 45 GB RAM under Ubuntu 18.04.

Discussion
The validation results show that the proposed modeling approach makes it possible to model the spatial behavior of the robot dynamics in the form of its modal properties. Since probabilistic machine learning techniques are used, the model prediction does not only provide the estimated prediction mean, but also a reliable prediction uncertainty. The reconstructed frequency response functions are in good agreement with the measurements.
The performance evaluation indicates that the information fusion approach to modeling the spatial properties of the robot's natural frequencies can significantly improve the model performance, in comparison with that of conventional Gaussian process regression techniques when only few measurements in the workspace are available. It is our intention to investigate optimal sampling techniques in the future: according to the human-in-the-loop-principle (also called Active Learning), an iterative algorithm should propose the optimal next measurement point within the workspace to reach satisfactory model performance with as few measurements as possible.
Furthermore, the presented methodology provides the fundamental methods needed to predict pose-dependant stability limits, for example by predicting pose-dependent stability lobes based on the pose-dependent robot dynamics. Such an application is particularly appealing if the model uncertainty is propagated to the stability prediction as well. Existing methods for propagating the uncertainty that arises during the cutting force coefficient identification can be extended by the presented modeling approach [33].
Hence, such a combined model can incorporate both the uncertainty of the robot's structural dynamics as well as the uncertainty of the cutting force models.
The model is based on the estimated modal properties and not directly on the measured frequency response functions. However, state-of-the-art EMA methods still require domain expert knowledge (e.g., selecting an appropriate model order or selecting suitable poles) [34]. Hence, inaccuracies during the manual modal parameter extraction will result in inaccuracies in the trained models. It is advisable that further research is carried out in the field of automated and robust modal parameter estimation to model the robot's dynamics as accurately as possible. This is especially important if the natural frequencies overlap. In this case, the distinction between the modes must be carried out carefully under consideration of the mode shapes. Such mode tracking algorithms can make use of the MAC criterion to quantify similarities of the mode shapes to track the spatial behavior within the workspace of the robot.
Additionally, the methodology was validated for measurement points in the horizontal x-y-plane with a constant TCP orientation, which allows the application for 2.5D machining operations in the horizontal plane. For more complex machining operations or applications in different heights, the workspace must be extended using measurement locations in the three-dimensional space or including data points which incorporate the robot's kinematic redundancy (i.e., a rotation of the TCP around the tool's rotation axis).

Conclusions
This paper presented a methodology for modeling the position-dependent dynamics of industrial robots. The methodology is based on data-driven methods for modeling the spatial behavior of the robot's modal parameters.
The findings of the presented study are as follows: • First, an information fusion approach to model the robot's position-dependent natural frequencies improves the prediction accuracy significantly in comparison to that of conventional Gaussian process regression techniques, especially in scenarios with only a very small number of training data points. In those cases, the prediction uncertainty of conventional Gaussian processes is unreliable, whereas the uncertainty estimation of the linear information fusion scheme is reliable. • Second, a detailed study was conducted to evaluate different kernel design choices for modeling the robot's damping ratios and mode residues using conventional Gaussian process regression methods. The data analysis of the position-dependent damping ratios motivates the use of cubic kernels, whereas an RBF kernel is best suited for modeling the residues.
• Third, the position-dependent models can be used to estimate the position-dependent directional dynamics of the robot accurately and quantify the combined model uncertainty using a Monte Carlo algorithm. Funding: We express our gratitude to the Bavarian Ministry of Economic Affairs, Regional Development, and Energy for the funding of our research. The methodology will be extended in the research project "ProKIRo" (grant number DIK-2104-0057 DIK0367/01).

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Conflicts of Interest:
The authors declare that they have no conflict of interest. Table A1. Measurement locations (given in the robot's base coordinate system).

Appendix A. Measurement Locations
x in m y in m z in m