Gaussian Process Modeling of Specular Multipath Components

The consideration of ultra-wideband (UWB) and mm-wave signals allows for a channel description decomposed into specular multipath components (SMCs) and dense/diffuse multipath. In this paper, the amplitude and phase of SMCs are studied. Gaussian Process regression (GPR) is used as a tool to analyze and predict the SMC amplitudes and phases based on a measured training data set. In this regard, the dependency of the amplitude (and phase) on the angle-of-arrival/angle-of-departure of a multipath component is analyzed, which accounts for the incident angle and incident position of the signal at a reflecting surface (and thus for the reflection characteristics of the building material) and for the antenna gain patterns. The GPR model describes the similarities between different data points. Based on its model parameters and the training data, the amplitudes of SMCs are predicted at receiver positions that have not been measured in the experiment. The method can be used to predict a UWB channel impulse response at an arbitrary position in the environment.


Introduction
Future 5G wireless communication technologies and the Internet of Things (IoT) paradigm will be characterized by supporting a variety of services with high quality requirements, addressing performance metrics such as reliability, latency, data throughput, and resource-efficient use of the infrastructure [1][2][3]. Spatial location information is expected to become an indispensable feature of these emerging wireless networks, considering that much more accurate position information will become available with those new generations of wireless networks [4][5][6]. These references elaborate on how such "location-awareness" can be exploited to improve the robustness and performance of wireless networks. Location-awareness is founded on the observation that many parameters of the propagation channel are directly linked with the geometry of the scenario and environment, ranging from the distance-dependent path loss and the spatial correlation of shadowing to the arrival angle of individual multipath components. The availability of such geometric models thus allows for efficient and robust location-based routing and medium access schemes, or geometry-based beam steering, to name just a few popular examples.
The prerequisites for implementing location awareness are accurate position estimation and the mapping of the propagation conditions, both of which must go hand in hand. The well-known simultaneous localization and mapping (SLAM) algorithms can be used to build a geometric model from past received signals, while locating the mobile user [7][8][9]. Note that these references are based on the use of wireless signals, while SLAM was originally developed by the robotics community for efficiently fusing the data from optical and odometry sensors. By exploiting the position-related information acquired from the multipath channels, localization and mapping information is obtained. The obtained localization and mapping errors were continuously reduced from meters to centimeters, while moving from conventional wideband transceivers, through ultra-wideband (UWB), to mm-wave systems [7][8][9][10][11]. The robustness of the acquired location information strongly depends on the surrounding environment and hence on multipath propagation [12][13][14].
Strongly related to those works are papers on radio positioning systems that exploit map information. For instance, low-level descriptors of the channel, such as the received signal strength (RSS), the time-of-flight (ToF), and the angle-of-arrival (AoA), are mapped to a location using fixed geometry models and triangulation/trilateration methods [15,16]. In Reference [17], delay information of specular multipath components (SMCs) is used to enhance the robustness and reduce the infrastructure demand of a UWB localization system, see also Reference [12]. The concept extends straightforwardly towards mm-wave systems, which will also make use of angle information [18].
These works indicate a strong link between map and position information. Specifically, the arrival time, arrival angles, and departure angles of so-called "specular multipath components" are all directly related to simple geometric models. It is thus common to separate these SMCs and use geometry-based models to describe their parameters. A popular approach is to model clusters of scatter points, characterized by a delay spread and an angle spread [19][20][21], which makes it possible to quantify the second-order statistics of geometry-related multipath components, that is, SMCs. Conversely, SMCs allow for a prediction of these channel parameters, given that the geometric models are available.
In contrast to this delay and angle modeling by means of SMCs, the modeling of the amplitudes of the SMCs is much harder to achieve. This is because many factors need to be considered, ranging from free-space path loss, directional beampatterns of antennas, and reflection characteristics of building materials to the polarization state of the waves and the mutual coupling of the antennas with nearby objects. Ray tracing simulations attempt to solve this challenge [22][23][24]. Ultimately, however, this approach needs exact models of the building materials, which makes it infeasible for online use in channel prediction.
In this paper, we use a data-driven approach to solve this issue. A mirror-source model is used to describe the arrival times and arrival/departure angles of the SMCs, which can be obtained by using a SLAM algorithm as in References [8,9]. We then apply Gaussian process regression (GPR) as a tool to model the amplitudes of the SMCs. That is, GPR is used to learn the joint effect of the antenna patterns and the material properties of the environment. Our primary aim is to validate this concept based on experimental data. In order to simplify the involved mathematics as much as possible, we select a scenario with a constant polarization state and no antenna coupling, and we restrict the modeling to horizontal characteristics, that is, we simplify the approach to a 2D geometric model. This reduces the amplitude model to a model that depends only on the angle of arrival (or departure) of the SMCs in the horizontal plane. An accurate amplitude model makes it possible to predict the SMC amplitudes at unmeasured positions. A potential use is performance prediction, where an accurate amplitude model of SMCs may be related to the channel capacity at a certain receiver position [25]. In References [11,26], it was shown that SLAM using position-related multipath parameters (delay and AoA) can be significantly improved in robustness if the amplitude information of the multipath components is also used. The contributions of this paper are the following:
• We derive a Gaussian process regression model for describing the spatial non-stationarity of SMC amplitudes in indoor environments.
• We validate the prediction capability of the GP based on real measurements acquired in an indoor environment.
• We show that the GP is capable of modeling the angle-dependencies of the SMC amplitudes.
The remainder of the paper is structured as follows: In Section 2 we introduce the system and signal models. Section 3 presents the GP regression and the methods for quality evaluation. Section 4 reports numerical results using real measurements. Section 5 concludes the paper.

System and Signal Models
We consider indoor environments where fixed anchors communicate with a mobile agent by means of radio signals. Figure 1 illustrates the considered scenario, where two physical anchor positions are shown as blue crosses at positions $\mathbf{a}_1^{(1)}$ and $\mathbf{a}_1^{(2)}$ and the agent positions $\mathbf{p}$ are selected along a segmented trajectory. Furthermore, some exemplary virtual anchors (VAs) are shown. The VA positions are mirror images of the physical anchor positions that are induced by reflections at flat surfaces (typically walls) and thus depend on the surrounding environment (the floor plan) [27].
The position of the $k$th VA of the $j$th physical anchor at position $\mathbf{a}_1^{(j)}$ is denoted as $\mathbf{a}_k^{(j)}$. Note that in this work we consider horizontal propagation only. For brevity, we omit the anchor index $j$ from now on. Also, $\mathcal{L}$ denotes the set of measurement points.

Signal Model
A baseband signal $s(t)$ is transmitted at carrier frequency $f_c$. The complex envelope of the received signal then reads

$$r(t) = \sum_{k=1}^{K} \alpha_k(\mathbf{p})\, s(t - \tau_k(\mathbf{p})) + \int s(\lambda)\, \nu(t - \lambda; \mathbf{p})\, \mathrm{d}\lambda + w(t), \qquad (1)$$

where the first term describes the superposition of $K$ specular multipath components (SMCs), the second term describes the convolution of $s(t)$ and the diffuse/dense multipath component (DMC) $\nu(\tau; \mathbf{p})$, and the third term is the measurement noise $w(t)$, modeled as AWGN with a two-sided power spectral density of $N_0$. In order to minimize inter-symbol interference, root raised cosine (RRC) pulses are considered for the transmitted signal, with pulse width $T_s$. The energy of $s(t)$ is normalized to one, that is, $\int |s(t)|^2\, \mathrm{d}t = 1$.
The SMCs: The $k$th SMC is characterized by delay $\tau_k(\mathbf{p})$ and complex amplitude $\alpha_k(\mathbf{p})$. The delay is modeled deterministically as a function of the (varying) agent position $\mathbf{p}$, that is,

$$\tau_k(\mathbf{p}) = \frac{1}{c}\, \|\mathbf{p} - \mathbf{a}_k\|,$$

where $c$ is the propagation speed. The complex amplitude $\alpha_k(\mathbf{p}) \in \mathbb{C}$ is given as

$$\alpha_k(\mathbf{p}) = A(\phi_k(\mathbf{p}))\, \Gamma_k(\phi_k(\mathbf{p}))\, \frac{d_0}{\|\mathbf{p} - \mathbf{a}_k\|}\, e^{-j 2\pi f_c \tau_k(\mathbf{p})}, \qquad (2)$$

where $A(\phi_k(\mathbf{p}))$ comprises the antenna gains of the TX and RX antennas and the path loss at a reference distance $d_0 = 1$ m. The antenna gains are parameterized by the direction angle $\phi_k(\mathbf{p}) = \angle(\mathbf{p} - \mathbf{a}_k)$, which is defined by the line connecting the agent position $\mathbf{p}$ and the (virtual) anchor position $\mathbf{a}_k$. We assume that the antenna gains are nearly constant over the observed frequency range and that the antennas are non-dispersive. $\Gamma_k(\phi_k(\mathbf{p}))$ describes the reflection coefficient of the (flat) surface corresponding to VA $\mathbf{a}_k$. For the line-of-sight (LOS) component, which involves no reflection, we set the coefficient $\Gamma_1(\phi_1(\mathbf{p})) = 1$.

Equation (2) can be decomposed into a distance-dependent factor and an angle-dependent factor. The former consists of the path loss of a (reflected) SMC in free space and a distance-dependent phase shift. The latter describes the angle-dependence of the agent and physical anchor antenna patterns (contained in $A(\phi_k(\mathbf{p}))$) and the dependence of the reflection coefficient $\Gamma_k(\phi_k(\mathbf{p}))$ on the incident angle. Since we assume that the agent and physical anchor are at the same height, and the orientations of the agent and physical anchor antennas are fixed throughout the measurement, the incident angle (at the reflecting surface) and the angles at the agent and anchor (used to compute antenna gains) can be represented by one angle $\phi_k(\mathbf{p}) = \angle(\mathbf{p} - \mathbf{a}_k)$, as shown in Figure 1. Thus, (2) can be rewritten as

$$\alpha_k(\mathbf{p}) = \gamma_k(\phi_k(\mathbf{p}))\, \frac{d_0}{\|\mathbf{p} - \mathbf{a}_k\|}\, e^{-j 2\pi f_c \tau_k(\mathbf{p})}, \qquad (3)$$

where $\gamma_k(\phi_k(\mathbf{p})) = A(\phi_k(\mathbf{p}))\, \Gamma_k(\phi_k(\mathbf{p}))$ collects the angle-dependent factors.
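The geometric quantities of the mirror-source model can be illustrated with a short Python sketch. It is not taken from the paper: the function names, the wall parameterization (a point on the wall plus a normal vector), and the example coordinates are assumptions chosen for illustration only.

```python
import numpy as np

C = 299_792_458.0  # propagation speed in m/s (free-space speed of light)

def mirror_anchor(anchor, wall_point, wall_normal):
    """Mirror a physical anchor across a flat wall to obtain a first-order VA."""
    n = np.asarray(wall_normal, dtype=float)
    n = n / np.linalg.norm(n)
    a = np.asarray(anchor, dtype=float)
    # Signed distance of the anchor from the wall along the normal
    dist = np.dot(a - np.asarray(wall_point, dtype=float), n)
    return a - 2.0 * dist * n

def smc_delay(p, a_k):
    """Delay tau_k(p) = ||p - a_k|| / c of the SMC associated with (virtual) anchor a_k."""
    return np.linalg.norm(np.asarray(p, dtype=float) - np.asarray(a_k, dtype=float)) / C

def smc_angle(p, a_k):
    """Direction angle phi_k(p) = angle(p - a_k) in the horizontal plane, in radians."""
    d = np.asarray(p, dtype=float) - np.asarray(a_k, dtype=float)
    return np.arctan2(d[1], d[0])

# Hypothetical example: anchor at (1, 2), wall along x = 5, agent at (3, 2)
va = mirror_anchor([1.0, 2.0], wall_point=[5.0, 0.0], wall_normal=[1.0, 0.0])
tau = smc_delay([3.0, 2.0], va)
phi = smc_angle([3.0, 2.0], va)
```

In this example the VA lies at (9, 2), so the reflected path seen from the agent has a length of 6 m and arrives from the direction of the mirror image rather than from the physical anchor.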
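The DMC: For the diffuse component, we use a complex Gaussian process where we assume uncorrelated scattering [28] and hence describe $\nu(\tau; \mathbf{p})$ by the auto-correlation function

$$\mathbb{E}\{\nu(\tau; \mathbf{p})\, \nu^*(u; \mathbf{p})\} = S_\nu(\tau; \mathbf{p})\, \delta(\tau - u), \qquad (4)$$

where $\mathbb{E}\{\cdot\}$ is the expectation operator and $S_\nu(\tau; \mathbf{p})$ is a power delay profile (PDP), which effectively models the diffuse multipath.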

Amplitude Estimation
For UWB transmission, it is reasonable to assume that the SMC signals $s_k(t) = s(t - \tau_k(\mathbf{p}))$ are well separated in the delay domain and thus approximately orthogonal. The SMC amplitude can be estimated by projecting the received signal onto the corresponding shifted pulse [29], that is,

$$\hat{\alpha}_k(\mathbf{p}) = \int r(t)\, s^*(t - \tau_k(\mathbf{p}))\, \mathrm{d}t. \qquad (5)$$

The variance of the estimated amplitude, as derived in Appendix A, consists of two terms: the first corresponds to the DMC, which is modeled as a Gaussian process, and the second to the measurement noise. The estimated amplitude is thus complex Gaussian distributed with mean $\alpha_k(\mathbf{p})$, see Equation (8). This indicates that the mean SMC amplitudes $\alpha_k(\mathbf{p})$ depend deterministically on the agent position $\mathbf{p}$. The same holds for the variance of the SMC amplitudes, which results from the interfering diffuse multipath characterized by the PDP $S_\nu(\tau_k; \mathbf{p})$, cf. Equation (4).
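The projection estimator described above can be sketched numerically as follows. This is only an illustration under stated assumptions: a unit-energy Gaussian pulse replaces the RRC pulse for brevity, the signal is noise-free with no DMC, and all names and values are hypothetical.

```python
import numpy as np

def unit_energy_gaussian(width):
    """Return a real pulse s(t) with unit energy (stand-in for the RRC pulse)."""
    def pulse(t):
        return (2.0 / (np.pi * width**2)) ** 0.25 * np.exp(-(t / width) ** 2)
    return pulse

def estimate_amplitude(r, t, pulse, tau):
    """Project r(t) onto the pulse shifted by tau: sum r(t) conj(s(t - tau)) dt."""
    dt = t[1] - t[0]
    return np.sum(r * np.conj(pulse(t - tau))) * dt

# Two well-separated SMCs with hypothetical delays and complex amplitudes
t = np.arange(0.0, 60e-9, 0.05e-9)
s = unit_energy_gaussian(0.5e-9)
taus = [20e-9, 35e-9]
alphas = [1.0 * np.exp(1j * 0.3), 0.4 * np.exp(-1j * 1.2)]
r = sum(a * s(t - tau) for a, tau in zip(alphas, taus))

alpha_hat = [estimate_amplitude(r, t, s, tau) for tau in taus]
```

Because the two pulses barely overlap, each projection returns the amplitude of its own SMC almost exactly. With DMC and noise added to `r`, the estimates would scatter around these values, which is the variance discussed in the text.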

GP Modeling of the SMC Amplitudes
From (3) it is seen that $\alpha_k(\mathbf{p})$ has a well-defined distance dependence, while the angle-dependence is governed by many factors that are hard to express in analytic form, for example, antenna radiation patterns and reflection coefficients of building materials. We hence propose to use a GP regression model to describe the angle-dependent part of $\alpha_k(\mathbf{p})$, that is, $\gamma_k(\phi_k(\mathbf{p}))$. A Gaussian Process (GP) model is a regression tool to analyze and predict data assuming that the data have a normal distribution. This is a loose requirement because for a large enough database, the data can often be assumed to be Gaussian distributed. Furthermore, it is evident from (8) that our data obeys a (complex) Gaussian distribution itself.
However, we will not apply the GPR model to the complex amplitudes directly, because phase coherence of SMCs may be hard to achieve, in particular when the training data has been recorded at uncertain positions. We therefore separate the complex amplitudes into real-valued absolute values and phase values. Note from the model (8) that the absolute values will be Rician distributed [29]. This distribution is characterized by the Rician K-factor. For $K \gg 1$, the Rician distribution can be approximated by a Gaussian distribution [30], and the same holds for the phase of a complex Gaussian variable. The GPR model will hence be used to model the functions in Equations (9) and (10), where $n^{\mathrm{abs}}_{\nu,k}$ with variance $(\sigma^{\mathrm{abs}}_{\nu,k})^2$ and $n^{\mathrm{ph}}_{\nu,k}$ with variance $(\sigma^{\mathrm{ph}}_{\nu,k})^2$ represent the DMC contribution to the amplitude and to the phase, respectively.

Gaussian Process Regression
GP regression usually consists of two steps: analyzing data obtained from measurements and predicting data at other positions that have not been measured.

GP Model
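Our goal is to model the angle-dependent SMC amplitude $\psi(\phi_k(\mathbf{p}))$ and phase $\zeta(\phi_k(\mathbf{p}))$ using a Gaussian process model. For the amplitude we have

$$\psi(\phi_k(\mathbf{p})) \sim \mathcal{GP}\big(\mu_{\mathrm{GP}}(\phi_k(\mathbf{p})),\; c_{\mathrm{GP}}(\phi_k(\mathbf{p}), \phi_k(\mathbf{p}'))\big), \qquad (11)$$

where $\mu_{\mathrm{GP}}(\phi_k(\mathbf{p})): \mathbb{R}^2 \to \mathbb{R}$ denotes the mean function and $c_{\mathrm{GP}}(\phi_k(\mathbf{p}), \phi_k(\mathbf{p}')): \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ the covariance function (also called kernel). Equation (11) is used equivalently for the phase $\zeta(\phi_k(\mathbf{p}))$.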
Covariance functions are a key component of GPR, since they provide the information of similarities between observations at different data points. Covariance functions are usually parameterized and these parameters, the mean function, and the noise are usually lumped together into a set called 'hyperparameters', which we will describe later in this section.
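In this work, the mean value is modeled by a constant, that is,

$$\mu_{\mathrm{GP}}(\phi_k(\mathbf{p})) = \beta^{\mathrm{abs}}_k \qquad (12)$$

(and $\beta^{\mathrm{ph}}_k$ for the phase). A squared exponential kernel is used for the covariance,

$$c_{\mathrm{GP}}(\phi_k(\mathbf{p}), \phi_k(\mathbf{p}')) = (\sigma^{\mathrm{abs}}_k)^2 \exp\!\left(-\frac{(\phi_k(\mathbf{p}) - \phi_k(\mathbf{p}'))^2}{2\,(a^{\mathrm{abs}}_k)^2}\right) + (\sigma^{\mathrm{abs}}_{\nu,k})^2\, \delta(\mathbf{p} - \mathbf{p}'), \qquad (13)$$

where $\sigma^{\mathrm{abs}}_k$ is the standard deviation of the correlation kernel, $a^{\mathrm{abs}}_k$ is the characteristic correlation angle, and $\sigma^{\mathrm{abs}}_{\nu,k}$ is the standard deviation accounting for the DMC as shown in (9) (and $\sigma^{\mathrm{ph}}_k$, $a^{\mathrm{ph}}_k$, and $\sigma^{\mathrm{ph}}_{\nu,k}$ for the phase). Note the abuse of notation in the latter term, which is due to the mapping of $\mathbf{p} \in \mathbb{R}^2$ to $\phi_k(\mathbf{p}) \in \mathbb{R}$. The Dirac function $\delta(\mathbf{p} - \mathbf{p}')$ indicates that the DMC are assumed to be spatially uncorrelated. Here, a pair of agent positions $(\mathbf{p}, \mathbf{p}')$ is mapped to one covariance value, that is, $\mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$. These models will be the basis for the GP modeling of the SMC amplitudes. Together with the data, the GP model reveals the angle-dependent functions $\psi(\phi_k(\mathbf{p}))$ and $\zeta(\phi_k(\mathbf{p}))$ as well as a variance that relates to the DMC $S_\nu(\tau_k; \mathbf{p})$. The full hyperparameter vectors of the GP model, which have to be learned from the amplitude and phase data of SMC $k$, are given as

$$\boldsymbol{\theta}^{\mathrm{abs}}_k = [\beta^{\mathrm{abs}}_k,\; \sigma^{\mathrm{abs}}_k,\; a^{\mathrm{abs}}_k,\; \sigma^{\mathrm{abs}}_{\nu,k}]^{\mathrm{T}} \qquad (14)$$

and

$$\boldsymbol{\theta}^{\mathrm{ph}}_k = [\beta^{\mathrm{ph}}_k,\; \sigma^{\mathrm{ph}}_k,\; a^{\mathrm{ph}}_k,\; \sigma^{\mathrm{ph}}_{\nu,k}]^{\mathrm{T}}, \qquad (15)$$

respectively.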

Prediction
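We next review the generic principle of GP regression (GPR). For each SMC $k$, we have a training database $\mathcal{D}_k = \{\boldsymbol{\phi}_k, \mathbf{y}_k\}$ with $i = 1, \dots, N$ measurements $y_{k,i} = \psi(\phi_k(\mathbf{p}_i)) + \epsilon$ (or $y_{k,i} = \zeta(\phi_k(\mathbf{p}_i)) + \epsilon$), recorded at training positions $\mathcal{P}_k = \{\mathbf{p}_1, \mathbf{p}_2, \dots, \mathbf{p}_N\}$ that are mapped to angles $\boldsymbol{\phi}_k = [\phi_k(\mathbf{p}_1), \dots, \phi_k(\mathbf{p}_N)]$. Here, $\epsilon \sim \mathcal{N}(0, \sigma^2)$ represents the measurement noise with variance $\sigma^2$ (which is connected to the power spectral density of the AWGN as described in Appendix B). The measurements are stacked into the training data vector $\mathbf{y}_k = [y_{k,1}, \dots, y_{k,N}]^{\mathrm{T}}$. After the hyperparameters are learned, the mean and variance can be predicted at a test position $\mathbf{p}^*$ with angle $\phi_k(\mathbf{p}^*)$. The mean and the variance of the angle-dependent amplitude $\psi(\phi_k(\mathbf{p}^*))$ (or phase $\zeta(\phi_k(\mathbf{p}^*))$) conditioned on the data $\mathcal{D}_k$ and $\boldsymbol{\theta}^{\mathrm{abs}}_k$ (or $\boldsymbol{\theta}^{\mathrm{ph}}_k$) can be expressed in closed form [31]

$$\mathbb{E}[\psi(\phi_k(\mathbf{p}^*)) \,|\, \mathcal{D}_k, \boldsymbol{\theta}^{\mathrm{abs}}_k] = \mu_{\mathrm{GP}}(\phi_k(\mathbf{p}^*)) + \mathbf{k}_*^{\mathrm{T}} \mathbf{K}^{-1} (\mathbf{y}_k - \boldsymbol{\mu}), \qquad (16)$$

$$\mathbb{V}[\psi(\phi_k(\mathbf{p}^*)) \,|\, \mathcal{D}_k, \boldsymbol{\theta}^{\mathrm{abs}}_k] = c_{\mathrm{GP}}(\phi_k(\mathbf{p}^*), \phi_k(\mathbf{p}^*)) - \mathbf{k}_*^{\mathrm{T}} \mathbf{K}^{-1} \mathbf{k}_*, \qquad (17)$$

where $\mathbb{V}[\cdot]$ is the variance operator, $\boldsymbol{\mu} = [\mu_{\mathrm{GP}}(\phi_k(\mathbf{p}_1)), \dots, \mu_{\mathrm{GP}}(\phi_k(\mathbf{p}_N))]^{\mathrm{T}}$, and with

$$[\mathbf{K}]_{i,j} = c_{\mathrm{GP}}(\phi_k(\mathbf{p}_i), \phi_k(\mathbf{p}_j)) \qquad (18)$$

and

$$\mathbf{k}_* = [c_{\mathrm{GP}}(\phi_k(\mathbf{p}_1), \phi_k(\mathbf{p}^*)), \dots, c_{\mathrm{GP}}(\phi_k(\mathbf{p}_N), \phi_k(\mathbf{p}^*))]^{\mathrm{T}}. \qquad (19)$$

In Appendix B, the relation between the predicted variance and the PDP of the DMC and the power spectral density of the AWGN is analyzed in detail.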
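As a rough illustration of the closed-form prediction, the following NumPy sketch computes a predictive mean and variance for a constant-mean GP with a squared exponential kernel plus a white term on the training diagonal. It is a sketch under assumptions, not the authors' implementation (the paper uses MATLAB): the factor 1/2 in the exponent, the inclusion of the white (DMC) variance in the predicted variance, and all names and values are choices made here.

```python
import numpy as np

def se_kernel(phi_a, phi_b, sigma, a):
    """Squared exponential kernel between two sets of angles (assumed form)."""
    d = phi_a[:, None] - phi_b[None, :]
    return sigma**2 * np.exp(-0.5 * (d / a) ** 2)

def gp_predict(phi_train, y_train, phi_test, beta, sigma, a, sigma_nu):
    """Predictive mean and variance of a noisy observation at the test angles."""
    n = len(phi_train)
    # Training covariance: kernel plus spatially white term on the diagonal
    K = se_kernel(phi_train, phi_train, sigma, a) + sigma_nu**2 * np.eye(n)
    k_star = se_kernel(phi_train, phi_test, sigma, a)  # shape (N, M)
    L = np.linalg.cholesky(K)
    # Solve K w = (y - beta) via two triangular systems
    w = np.linalg.solve(L.T, np.linalg.solve(L, y_train - beta))
    mean = beta + k_star.T @ w
    v = np.linalg.solve(L, k_star)
    var = sigma**2 + sigma_nu**2 - np.sum(v**2, axis=0)
    return mean, var

# Synthetic, hypothetical data: a smooth angle dependence around a constant mean
phi_train = np.linspace(-1.0, 1.0, 20)
y_train = 0.5 + 0.2 * np.sin(3.0 * phi_train)
phi_test = np.array([0.0, 100.0])  # one angle inside, one far outside the training range
mean, var = gp_predict(phi_train, y_train, phi_test,
                       beta=0.5, sigma=0.2, a=0.3, sigma_nu=0.01)
```

Far away from the training angles the prediction falls back to the constant mean and the prior variance, whereas inside the covered angle range the variance shrinks towards the white (DMC-related) floor.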

Learning
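The GP prediction in Section 3.2 assumes that the hyperparameter vector $\boldsymbol{\theta}^{\mathrm{abs}}_k$ given in (14) (or $\boldsymbol{\theta}^{\mathrm{ph}}_k$ given in (15)) is known. If the hyperparameter vector is not known, it needs to be estimated from the training database $\mathcal{D}_k = \{\boldsymbol{\phi}_k, \mathbf{y}_k\}$ for each SMC. The joint probability density function of the observed measurements $\mathbf{y}_k$ conditioned on the training angles $\boldsymbol{\phi}_k$ is given by a Gaussian distribution with mean and covariance both dependent on the hyperparameter vector $\boldsymbol{\theta}^{\mathrm{abs}}_k$ (see (12) and (13)), which we make explicit by writing $\boldsymbol{\mu}_{\boldsymbol{\theta}^{\mathrm{abs}}_k}$ and $\mathbf{K}_{\boldsymbol{\theta}^{\mathrm{abs}}_k}$, respectively. The parameter vector $\boldsymbol{\theta}^{\mathrm{abs}}_k$ (or $\boldsymbol{\theta}^{\mathrm{ph}}_k$) is estimated by maximizing the log-likelihood function, that is,

$$\hat{\boldsymbol{\theta}}^{\mathrm{abs}}_k = \arg\max_{\boldsymbol{\theta}} \left[ -\frac{1}{2} (\mathbf{y}_k - \boldsymbol{\mu}_{\boldsymbol{\theta}})^{\mathrm{T}} \mathbf{K}_{\boldsymbol{\theta}}^{-1} (\mathbf{y}_k - \boldsymbol{\mu}_{\boldsymbol{\theta}}) - \frac{1}{2} \log |\mathbf{K}_{\boldsymbol{\theta}}| - \frac{N}{2} \log 2\pi \right]. \qquad (20)$$

In general, this maximum likelihood estimation cannot be done in closed form. Therefore, a numerical approximation needs to be used. Since the function may be highly non-convex, $\hat{\boldsymbol{\theta}}^{\mathrm{abs}}_k$ (or $\hat{\boldsymbol{\theta}}^{\mathrm{ph}}_k$) has to be found by a global search over the domains of $a^{\mathrm{abs}}_k$, $\sigma^{\mathrm{abs}}_k$, and $\sigma^{\mathrm{abs}}_{\nu,k}$ (or $a^{\mathrm{ph}}_k$, $\sigma^{\mathrm{ph}}_k$, and $\sigma^{\mathrm{ph}}_{\nu,k}$).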
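The global search over the kernel parameters can be sketched with a coarse grid search in Python. This is a minimal illustration, not the optimizer used in the paper: the grid ranges, the synthetic data, the kernel normalization, and the choice to estimate the constant mean in closed form (generalized least squares) for each grid point are all assumptions.

```python
import itertools
import numpy as np

def neg_log_likelihood(phi, y, sigma, a, sigma_nu):
    """Negative Gaussian log-likelihood and profiled constant mean for one parameter set."""
    n = len(phi)
    d = phi[:, None] - phi[None, :]
    K = sigma**2 * np.exp(-0.5 * (d / a) ** 2) + sigma_nu**2 * np.eye(n)
    L = np.linalg.cholesky(K)

    def solve(b):
        # Solve K x = b using the Cholesky factor
        return np.linalg.solve(L.T, np.linalg.solve(L, b))

    ones = np.ones(n)
    beta = (ones @ solve(y)) / (ones @ solve(ones))  # closed-form mean estimate
    r = y - beta
    # 0.5 * log|K| equals the sum of the log-diagonal of the Cholesky factor
    nll = 0.5 * r @ solve(r) + np.sum(np.log(np.diag(L))) + 0.5 * n * np.log(2.0 * np.pi)
    return nll, beta

def grid_search(phi, y, sigmas, lengths, noises):
    """Exhaustive search; returns (nll, beta, sigma, a, sigma_nu) of the best grid point."""
    best = None
    for sigma, a, sigma_nu in itertools.product(sigmas, lengths, noises):
        nll, beta = neg_log_likelihood(phi, y, sigma, a, sigma_nu)
        if best is None or nll < best[0]:
            best = (nll, beta, sigma, a, sigma_nu)
    return best

# Synthetic, hypothetical training data
rng = np.random.default_rng(0)
phi = np.linspace(-1.0, 1.0, 30)
y = 0.5 + 0.2 * np.sin(3.0 * phi) + 0.02 * rng.standard_normal(30)
grid = np.logspace(-2, 0, 5)
best = grid_search(phi, y, sigmas=grid, lengths=np.logspace(-1, 1, 5), noises=grid)
```

In practice the best grid point would typically serve as the starting point of a local optimizer, since a grid alone only resolves the parameters to the grid spacing.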

Evaluate the Quality of Prediction
Given a prediction method, we can evaluate the quality of prediction in several ways. Perhaps the simplest is the squared error loss. However, this quantity is sensitive to the overall scale of the target values, so it makes sense to normalize by the variance of the targets of the test cases to obtain the standardized mean squared error (SMSE) [31]. Additionally, for the GPR on the absolute values of SMC amplitudes, since we produce a predictive distribution at each test input, we can evaluate the negative log probability of the target under the model. This loss should be standardized and averaged to obtain the mean standardized log loss (MSLL) [31].
In order to evaluate the quality, the data set is divided into two sets, one for training and one for testing. Besides the training data $\mathcal{D}_k$, the set of test data is given by $\mathcal{D}^*_k = \{\boldsymbol{\phi}^*_k, \mathbf{y}^*_k\}$ with $N^*$ measurements $y^*_{k,i}$ at test positions $\mathbf{p}^*_i$. The SMSE of the predicted mean (16) of the angle-dependent amplitude $\psi(\phi_k(\mathbf{p}^*_i))$ (or phase $\zeta(\phi_k(\mathbf{p}^*_i))$) based on the learned parameters $\boldsymbol{\theta}^{\mathrm{abs}}_k$ (or $\boldsymbol{\theta}^{\mathrm{ph}}_k$) is given by [31]

$$\mathrm{SMSE}^{\mathrm{abs}}_k = \frac{\sum_{i=1}^{N^*} \big(y^*_{k,i} - \mathbb{E}[\psi(\phi_k(\mathbf{p}^*_i)) \,|\, \mathcal{D}_k, \boldsymbol{\theta}^{\mathrm{abs}}_k]\big)^2}{\sum_{i=1}^{N^*} \big(y^*_{k,i} - \bar{y}^*_k\big)^2}, \qquad (21)$$

where $\bar{y}^*_k = \frac{1}{N^*} \sum_{i=1}^{N^*} y^*_{k,i}$ is the mean value of the test measurements (similarly, for the phase we have $\mathrm{SMSE}^{\mathrm{ph}}_k$). In (21), the numerator is the sum of the squared errors between the measured test data $\mathbf{y}^*_k$ and the predicted mean. The denominator is the sum of the squared errors between the measured test data $\mathbf{y}^*_k$ and their mean value $\bar{y}^*_k$. An SMSE of one thus corresponds to trivially predicting the mean of the test data, and an SMSE significantly smaller than one indicates a high prediction quality.
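Both quality measures can be sketched in a few lines of Python. The SMSE follows the verbal definition above; for the MSLL, the sketch assumes the standardization described in Reference [31], namely subtracting the log loss of a trivial Gaussian model fitted to the training targets. The function names and sample values are illustrative only.

```python
import numpy as np

def smse(y_test, mu_pred):
    """Squared prediction error normalized by the squared deviation from the test mean."""
    return np.sum((y_test - mu_pred) ** 2) / np.sum((y_test - np.mean(y_test)) ** 2)

def msll(y_test, mu_pred, var_pred, y_train):
    """Mean log loss of the model minus that of a trivial Gaussian (assumed definition)."""
    loss_model = 0.5 * np.log(2.0 * np.pi * var_pred) + (y_test - mu_pred) ** 2 / (2.0 * var_pred)
    m0, v0 = np.mean(y_train), np.var(y_train)
    loss_trivial = 0.5 * np.log(2.0 * np.pi * v0) + (y_test - m0) ** 2 / (2.0 * v0)
    return np.mean(loss_model - loss_trivial)

# Hypothetical values to illustrate the reference points of both measures
y_train = np.array([0.9, 1.1, 1.0, 1.2, 0.8])
y_test = np.array([1.0, 2.0, 3.0, 4.0])

smse_perfect = smse(y_test, y_test)                               # perfect prediction
smse_trivial = smse(y_test, np.full(4, np.mean(y_test)))          # predicting the test mean
msll_trivial = msll(y_test, np.full(4, np.mean(y_train)),
                    np.full(4, np.var(y_train)), y_train)          # trivial predictor
```

A perfect prediction gives an SMSE of zero, predicting the test mean gives an SMSE of one, and a predictor identical to the trivial Gaussian model gives an MSLL of zero, so negative MSLL values indicate an improvement over that baseline.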
Another measure of quality is the MSLL, which also considers the predicted variance, that is, $\mathbb{V}[\psi(\phi_k(\mathbf{p}^*_i)) \,|\, \mathcal{D}_k, \boldsymbol{\theta}^{\mathrm{abs}}_k]$ in (17). It is calculated by averaging the standardized log loss of the amplitude $\psi(\phi_k(\mathbf{p}^*_i))$ (or phase $\zeta(\phi_k(\mathbf{p}^*_i))$), based on the learned parameters $\boldsymbol{\theta}^{\mathrm{abs}}_k$ (or $\boldsymbol{\theta}^{\mathrm{ph}}_k$), over all pairs in $\mathcal{D}^*_k$, that is,

$$\mathrm{MSLL}^{\mathrm{abs}}_k = \frac{1}{N^*} \sum_{i=1}^{N^*} \mathrm{SLL}^{\mathrm{abs}}_{k,i}, \qquad (22)$$

where $\mathrm{SLL}^{\mathrm{abs}}_{k,i}$ is the standardized log loss of the $i$th test pair [31]. Large negative values of the MSLL indicate a high prediction quality.

Experimental Setup
Figure 1 shows the laboratory room at Graz University of Technology that was used for the experimental validation. The room consists of two plaster board walls and two reinforced concrete walls (shown as black outer lines), three glass windows at the north wall (shown as thick gray lines), and one white board and one metal door at the south wall (indicated by $A^*$ and $C^*$, respectively). We introduce the following labels to refer to the involved reflection surfaces:
• EPB: East plaster board.
• SW: South wall.
• WW: West wall.
• NGW: North glass wall.
To conduct the channel measurements, an Ilmsens ultra-wideband M-sequence device [32] was used, cf. Figure 2a. The measurement principle is correlative channel sounding [33], that is, a binary code sequence with suitable autocorrelation properties is transmitted over the channel. At the receiver, the channel impulse response is recovered using a correlation with the known code sequence. The channel sounder has one transmitter port and two receiver ports. A 12-bit M-sequence has been employed, corresponding to a sequence length of 4095 samples. This allows for an unambiguous delay window of 589.2 ns at a clock rate of 6.95 GHz. The M-sequence is modulated onto a 6.95 GHz carrier, yielding a probing signal that covers a frequency band between approx. 3.5 and 10.5 GHz. The measurement data are available online in the "MeasureMINT" database. Please refer to: https://www.spsc.tugraz.at/databases-and-tools/uwb-indoor-channel-experimental-data.html.
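The stated sequence length and delay window follow directly from the M-sequence order and the clock rate, as the following short Python check illustrates. The variable names are chosen here for illustration, and the conversion of the delay window into a path length is an added remark that assumes free-space propagation.

```python
# M-sequence parameters as stated in the text
order = 12                 # number of shift-register bits
clock_hz = 6.95e9          # clock rate in Hz

# A maximum-length sequence of order m has 2^m - 1 chips
length = 2**order - 1
# One sequence period defines the unambiguous delay window
window_ns = length / clock_hz * 1e9

# Corresponding unambiguous path length, assuming free-space propagation
window_m = length / clock_hz * 299_792_458.0

print(length, round(window_ns, 1))  # 4095 589.2
```

For an indoor room of the size shown in Figure 1, this window is far longer than the delays of the analyzed single-reflection SMCs, so delay ambiguity is not a concern for them.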
Each of the ports was connected to a dipole coin antenna as shown in Figure 2b. According to Reference [34], the coin antenna has a very wide bandwidth ranging from 3 to 9 GHz. It also has a nearly isotropic radiation pattern in the horizontal plane.
We used the two receiver ports as anchors and placed their antennas at fixed positions $\mathbf{a}_1^{(1)}$ and $\mathbf{a}_1^{(2)}$. The transmitter port was connected to another antenna that was moved along a trajectory with 595 points $\mathbf{p}$, as shown in Figure 1, to obtain the same number of channel measurements. All antennas were mounted on tripods at the same height; therefore, only the co-polarized azimuth radiation pattern of the antenna has an impact on the data, cf. Equation (2). The raw measurements at the receiver ports were filtered with an RRC pulse with center frequency 6.95 GHz, roll-off factor 0.5 and bandwidth $1/T_s = 2$ GHz to obtain the received signals corresponding to the model in (1).

Measurement Pre-Processing
The agent trajectory is divided into 7 segments shown with different colors. The spacing between trajectory points was 2 cm for segments 1 and 2, and 4 cm elsewhere. During the experiment, the exact positions of the trajectory points were unknown. Hence, in a pre-processing step, we used the SLAM algorithm [8,10] to estimate the positions of the agent and of the VAs, including the VAs of Anchor 2 that correspond to single reflections on the EPB, SW and WW.
Given the estimated positions of transmitters and receivers, the expected delays, that is, $\tau_k(\mathbf{p}) = \frac{1}{c} \|\mathbf{a}_k - \mathbf{p}\|$, are deduced. Then, the SMC amplitudes $\hat{\alpha}_k(\mathbf{p})$ are estimated from the received signals, using these delays in (5). The use of (5) for estimating the SMC amplitudes relies on the assumption that individual SMCs do not overlap, reducing the interfering multipath to the DMC $\nu_k(\mathbf{p})$. We remove overlapping SMCs from our data set if two delays satisfy $|\tau_k(\mathbf{p}) - \tau_l(\mathbf{p})| \le \frac{T_p}{2}$ for $k \ne l$, yielding the subsets $\mathcal{P}_k \subseteq \mathcal{L}$ of trajectory points. Finally, the distance dependence is removed from the SMC amplitudes to get the normalized amplitude data $\psi(\phi_k(\mathbf{p}))$ and phase data $\zeta(\phi_k(\mathbf{p}))$ according to Equations (9) and (10). Note that the trajectory positions and VA positions are mapped to the direction angles $\phi_k(\mathbf{p}) = \angle(\mathbf{p} - \mathbf{a}_k)$ for the purpose of GPR.
Several data sets are hence deduced, which are denoted as $\mathcal{D}_k = \{\phi_k(\mathbf{p}), y_k(\mathbf{p})\}_{\mathbf{p} \in \mathcal{P}_k}$, containing the direction angles $\phi_k(\mathbf{p})$ and the data points $y_k(\mathbf{p})$. The data points are either the absolute values $\psi(\phi_k(\mathbf{p}))$ or the phases $\zeta(\phi_k(\mathbf{p}))$. The index $k$ relates here to the different SMCs and the two anchors. Specifically, for Anchor 1, we analyze data sets corresponding to the LOS, the EPB, and the NGW; for Anchor 2, we analyze the EPB and the SW.
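The two pre-processing steps described above, discarding overlapping SMCs and removing the distance dependence, might be sketched in Python as follows. The exact normalization of Equations (9) and (10) is not reproduced here: the sketch assumes that the amplitude is scaled by the path length relative to $d_0$ and that the phase is compensated by the carrier phase shift of the delay. Function names, the separation threshold, and the sample values are hypothetical.

```python
import numpy as np

def non_overlapping_mask(delays, min_sep):
    """delays: array (N_positions, K). True where an SMC has no other SMC within min_sep."""
    diff = np.abs(delays[:, :, None] - delays[:, None, :])
    k = delays.shape[1]
    diff[:, np.arange(k), np.arange(k)] = np.inf  # ignore each SMC's distance to itself
    return np.all(diff > min_sep, axis=2)

def normalize_amplitude(alpha_hat, dist, tau, f_c, d0=1.0):
    """Remove path loss and carrier phase (assumed form of the normalization)."""
    psi = np.abs(alpha_hat) * dist / d0
    zeta = np.angle(alpha_hat * np.exp(2j * np.pi * f_c * tau))
    return psi, zeta

# Delays in ns at two positions for three SMCs; the threshold is a placeholder value
delays_ns = np.array([[10.0, 10.1, 20.0],
                      [10.0, 15.0, 20.0]])
mask = non_overlapping_mask(delays_ns, min_sep=0.25)

# One synthetic SMC: angle-dependent part 0.8 * exp(j 0.3), path length 4 m
c, f_c, dist = 299_792_458.0, 6.95e9, 4.0
tau = dist / c
alpha_hat = 0.8 * np.exp(1j * 0.3) / dist * np.exp(-2j * np.pi * f_c * tau)
psi, zeta = normalize_amplitude(alpha_hat, dist, tau, f_c)
```

At the first position the two SMCs with nearly equal delays are both dropped, while all three are kept at the second position. The phase compensation is very sensitive to position errors at a carrier of several GHz, which is one reason why the amplitude and phase are treated separately.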
On the basis of these data sets, the GP model is learned using the MATLAB function fitrgp (MATLAB version R2018a; Statistics and Machine Learning Toolbox). Prediction is done using the function predict. The regression result will be discussed to see if it is sensible to use GP regression to model the SMC amplitudes. We also evaluate the GPR performance numerically with the SMSE and the MSLL defined in Section 3.4.

Table 1 lists the learned hyperparameters $\boldsymbol{\theta}^{\mathrm{abs}}_k$ (or $\boldsymbol{\theta}^{\mathrm{ph}}_k$) obtained from the measurements. The LOS GP amplitude has the largest characteristic correlation angle, that is, the estimated LOS amplitude shows a small variation over a wide range of angles. This might come from two factors: (i) the influence of the DMC on the LOS amplitudes is rather small, as shown by the DMC standard deviation $\sigma^{\mathrm{abs}}_{\nu,k}$ in Table 1, and (ii) the estimated LOS amplitudes only depend on the antenna pattern $A(\phi_k(\mathbf{p}))$, while the reflection coefficient $\Gamma_1(\phi_1(\mathbf{p}))$ is defined to be one. The characteristic correlation angles of the SMC GP amplitudes are significantly smaller, that is, the estimated SMC amplitudes have a much larger variation over the range of angles. This is a strong indicator that the SMC amplitudes show an angle-dependent reflection coefficient $\Gamma_k(\phi_k(\mathbf{p}))$ and that the influence of the DMC is larger, which is in agreement with the increased DMC standard deviation $\sigma^{\mathrm{abs}}_{\nu,k}$ in Table 1. All these observations are supported by the results shown in the figures below. In the following, all estimated and modeled SMC amplitudes are normalized as defined by Equation (9).

Figure 3 shows the regression results for the LOS amplitude associated with Anchor 1 and some SMCs associated with Anchors 1 and 2. In particular, it shows the estimated SMC amplitudes (colored markers) for the individual segments, the predicted mean given in (16) (red solid line), and the predicted standard deviation given in (17) (black dashed and blue dash-dotted lines) as a function of the angle of departure $\phi_2(\mathbf{p})$. The deviation is shown as $\pm 2\sigma$, representing the upper and lower limits that contain approximately 95% of the data points.

GPR of SMC Amplitudes
In Figure 3a, the LOS amplitudes are shown. It can be observed that the predicted mean value of the LOS GP amplitude changes only slowly with the angle $\phi_1(\mathbf{p})$, according to the antenna pattern $A(\phi_1(\mathbf{p}))$. The predicted standard deviation is rather small since the learned DMC standard deviation is small and the characteristic correlation angle is large. Figure 3b-e show clearly that the predicted mean values of the SMC GP amplitudes vary significantly with the angle $\phi_k(\mathbf{p})$. Those variations originate mostly from the angle-dependency of the reflection coefficient $\Gamma_k(\phi_k(\mathbf{p}))$, because the antenna pattern $A(\phi_k(\mathbf{p}))$ varies only slowly over the entire observation angle. As expected, the predicted standard deviation is much larger than the one for the LOS amplitude. (A more detailed verification of the variance regression performance is found in Section 4.3.2.) For the SW (at $\mathbf{a}_3^{(2)}$) and the NGW (at $\mathbf{a}_5^{(1)}$), the effect of a varying reflection coefficient due to different building materials can be seen; cf. Figure 3d,e, respectively. In Figure 3d, sections $A^*$, $C^*$ and $B^*$ correspond to the whiteboard, the metal door and a small plasterboard section in between, cf. Figure 1. It can be seen that, for angles between −260 and −230 degrees, the fluctuations are small because the SMC relates to the strong reflection by the metallic whiteboard. For angles above −230 degrees, the amplitude fluctuates severely, because the point-of-reflection moves to sections $B^*$ and $C^*$ comprising other materials. Also note that the variance of the data points fluctuates. In general, data from positions which are quite far away from the VA $\mathbf{a}_3^{(2)}$ have larger variances. This is partly the result of a more significant impact of the DMC on those data, but it is also the effect of multiplying the DMC with the longer travel distance to obtain the normalized SMC amplitude. In Figure 3e, the amplitude fluctuations due to varying reflection materials can be seen again. The various materials are noted accordingly, where sections A, C and E are glass windows. Even for the same material, there are separate angle ranges in which the normalized amplitude, and thus the reflection coefficient, differs significantly; for example, an increased coefficient can be seen for lower angles.

Predictability
In order to test the ability to predict the SMC amplitudes using GPR, we divide the data set into two subsets, for training and testing. It is desirable for the training data to cover the largest possible range of angles. Therefore, we choose segments 4, 5, 6 for training, and 1, 2, 3, 7 for testing (cf. Figure 1). In the resulting prediction plots, the red solid bold line again depicts the predicted mean, and the black and blue solid lines show the upper and lower $\pm 2\sigma$ limits (comprising approximately 95% of the data points), plotted over the positions along the trajectory. The predicted mean of the SMC amplitudes coincides very well with the variations of the estimated SMC amplitudes. Moreover, the predicted upper and lower limits contain most of the SMC amplitudes obtained from measurements for the test data. Table 2 shows the prediction quality evaluated with the SMSE and MSLL as defined in Section 3.4. The prediction of the LOS amplitude is clearly very reliable, as indicated by the small SMSE value and the large negative MSLL value. The predictions of the SMC amplitudes for the EPB, NGW and SW are still reasonably good in comparison to the LOS amplitude, given that the SMSE values are all smaller than 1 and the MSLL values are all negative. However, the amplitude prediction for the EPB outperforms the predictions for the NGW and SW. A possible explanation might be that the DMC standard deviation is smaller (see Table 1).

Variance Verification
In this section, we validate the capability of the GPR to model the power of the DMC by means of the predicted variance, cf. Equation (17). Based on the estimated SMC of the VA at $\mathbf{a}_2^{(1)}$ associated with Anchor 1 (east plaster board wall), the empirical mean and the empirical standard deviation are calculated over a sliding window along the trajectory positions, see Equations (23) and (24), where $m$ is the index of the agent position and $N_{\mathrm{win}}$ is the sliding window size, which is chosen to be $N_{\mathrm{win}} = 10$. Figure 5a,b compare the empirical mean in (23) and the empirical standard deviation in (24) with the predicted mean (given by (16)) and the predicted standard deviation (square root of (17)) of the GP along the agent trajectory. In Figure 5a, the blue solid line shows the estimated SMC amplitudes, the red solid line shows the empirical mean $\mu_{\mathrm{meas},m}$ and the red dashed lines the empirical standard deviation ($\pm 2\sigma_{\mathrm{meas},m}$) of the estimated amplitudes. In Figure 5b, the blue solid line shows the estimated SMC amplitudes, the green solid line shows the predicted mean and the green dashed lines the predicted standard deviation ($\pm 2\sigma_{\mathrm{GP}}$) of the SMC GP amplitude. Both the mean and the standard deviation are in good agreement. However, the predicted standard deviation (variance) in (17) has a constant value for the DMC standard deviation (see Table 1), independent of the measurements used for prediction, and only changes deterministically with the angle via the first term of the correlation kernel in (13).
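The sliding-window statistics can be sketched in Python as below. The window alignment (each window starts at index m and covers the next N_win positions) and the unbiased normalization of the standard deviation are assumptions made here, since only the window size is stated in the text; the input sequence is a hypothetical placeholder for the normalized amplitudes along the trajectory.

```python
import numpy as np

def sliding_stats(x, n_win=10):
    """Empirical mean and standard deviation over windows of n_win consecutive samples."""
    # Requires NumPy >= 1.20 for sliding_window_view
    windows = np.lib.stride_tricks.sliding_window_view(x, n_win)
    mu = windows.mean(axis=1)
    sigma = windows.std(axis=1, ddof=1)  # unbiased estimate (assumed)
    return mu, sigma

# Hypothetical sequence of 12 amplitude values along the trajectory
x = np.arange(12, dtype=float)
mu_meas, sigma_meas = sliding_stats(x, n_win=10)

# Limits analogous to the dashed +/- 2 sigma lines in the comparison plots
upper = mu_meas + 2.0 * sigma_meas
lower = mu_meas - 2.0 * sigma_meas
```

A sequence of M samples yields M − N_win + 1 windows, so the empirical curves are slightly shorter than the trajectory itself.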

GPR of SMC Phases
To show the applicability of the GPR for predicting the SMC phases, we analyze, as an example, the phases of the estimated SMC amplitudes related to the EPB and the NGW. Figure 6 shows the results, which indicate the phase coherence of the SMCs. The respective hyperparameters obtained from the GPR are also shown in Table 1. The table shows that the phase of the NGW component has the largest angle dependence, given that its characteristic correlation angle is the smallest.

Predictability
Similar to Section 4.3.1, we separate the measurements into a training set containing data from Segments 4, 5, 6 and a test set containing data from Segments 1, 2, 3, and 7. Figure 7 shows the prediction results. It can be seen that the prediction of the mean and variance is reasonably good. The SMSE values for the EPB and the NGW are smaller than the one of a trivial method. Similarly, the MSLLs are all negative, also showing good prediction quality. This evaluation indicates that a consideration of the carrier phase may become possible for multipath-based tracking filters, if GPR is used for modeling the phase shifts of the SMCs.

Conclusions
We demonstrate the applicability of Gaussian process regression (GPR) for the large-scale amplitude and phase modeling of specular multipath components (SMCs). UWB or mm-wave radio systems provide sufficient temporal and/or angular resolution to resolve such SMCs at the receiver side. The GPR model describes the amplitude and phase fluctuations of SMCs due to changing reflection characteristics of building materials and due to the angle-dependent gain of the TX and RX antennas. Our experimental results, obtained with UWB channel measurements, show that the mean amplitude and phase information of SMCs can be predicted reliably, while some uncertainty results from the dense multipath component (DMC) that interferes with the SMCs. The variance predicted by the GPR model reflects the impact of this DMC.
Large-scale predictability of the radio channel is achieved through the combination of the GPR model (to describe the SMC amplitudes and phase shifts) and a geometric model of the SMCs to describe the arrival times and angles. That is, based on a set of training data and hyperparameters of the models learned from it, these core features of the channel impulse response can be predicted at any point in the surveyed environment. We obtain a new level of environment-awareness, useful to predict performance characteristics of wireless networks over large spatial scales.