Explainable Deep-Learning-Based Path Loss Prediction from Path Profiles in Urban Environments

Abstract: This paper applies a deep learning approach to model the mechanism of path loss based on the path profile in urban propagation environments for 5G cellular communication systems. The proposed method combines the log-distance path loss model for line-of-sight propagation scenarios and a deep-learning-based model for non-line-of-sight cases. Simulation results show that the proposed path loss model outperforms the conventional models when operating in the 3.5 GHz frequency band. The standard deviation of prediction error was reduced by 34% when compared to the conventional models. To explain the internal behavior of the proposed deep-learning-based model, which is a black box in nature, eight relevant features were selected to model the path loss based on a linear regression approach. Simulation results show that the accuracy of the explanatory model reached 72% when it was used to explain the proposed deep learning model. Furthermore, the proposed deep learning model was also evaluated in a non-standalone 5G New Radio network in the urban environment of Taipei City. The real-world measurements show that the standard deviation of prediction error can be reduced by 30–43% when compared to the conventional models. In addition, the transparency of the proposed deep learning model reached 63% in the realistic 5G network.


Introduction
The emerging fifth-generation (5G) mobile communication systems are expected to bring a complete revolution in applications and experiences [1]. Mobile operators worldwide are racing to roll out 5G services. Among the technical challenges, accurate channel models for predicting path loss are vital for the design of 5G cellular communication systems.
Many path loss models for wireless communication networks have been studied by prior researchers. Popoola et al. presented extensive drive-test measurement data of radio wave propagation in a campus environment and analyzed the correlation between the path loss and terrain information [2]. For 5G networks, standardization organizations, such as the International Telecommunication Union Radiocommunication Sector, have conducted channel measurements in urban and suburban environments for path loss modeling [3]. Shabbir et al. compared 5G path loss models based on simulation results and evaluated the suitability of the models in different scenarios [4]. Sun et al. compared the alpha-beta-gamma (ABG) and close-in (CI) path loss models using measured data and ray-tracing techniques, with carrier frequencies ranging from 2 GHz to 73 GHz [5]. The authors found that the physically based CI model yielded smaller prediction errors, with a standard deviation ranging from about 6 dB to 12 dB [5].
To improve the ABG model, a weighted ABG model was proposed in [6], which combined different available datasets to improve model accuracy. The weighted ABG model was reported to yield a standard deviation of prediction errors ranging from 1.2 dB to 12.5 dB [6]. To predict the millimeter-wave path loss, probabilistic models were presented in [7]. Based on real-world 28 GHz and 73 GHz measurements, the authors obtained the line-of-sight (LOS) probabilities from ray-tracing techniques and proposed a hybrid path loss model, which was a weighted sum of LOS and non-line-of-sight (NLOS) propagation loss [7].

The contributions of this paper are summarized as follows:
• A deep-learning-based path loss model utilizing path profiles is presented for 5G mobile communication systems in urban propagation environments;
• The internal behavior of the proposed deep learning model is explored using an explainable linear model, which considers some selected geometric features of the path profile;
• Simulation results as well as field measurements in a 5G New Radio (NR) network are shown.
The rest of this paper is organized as follows. Section 2 reviews some conventional path loss models. Section 3 presents a detailed description of the proposed method. Subsequently, Section 4 explores the prediction performance based on numerical simulations. Section 5 examines the explainability of the proposed model. Section 6 evaluates the prediction performance in a real-world 5G system in urban Taipei City. Finally, Section 7 presents some concluding remarks.

Walfisch-Ikegami Model
The Walfisch-Ikegami model computes the path loss by considering both LOS and NLOS conditions. The path loss in decibels is given by

$$L_{\mathrm{WI}} = L_{\mathrm{FS}} + L_{\mathrm{rts}} + L_{\mathrm{msd}} \tag{1}$$

where $L_{\mathrm{FS}}$ is the free space path loss, $L_{\mathrm{rts}}$ is the roof-top-to-street diffraction and scatter loss, and $L_{\mathrm{msd}}$ is the multiscreen diffraction loss [16,17]. These loss terms are all functions of the carrier frequency. In addition, they relate to some geometric parameters. For instance, $L_{\mathrm{FS}}$ relates to the range between the transmitter (Tx) and receiver (Rx) pair; $L_{\mathrm{rts}}$ relates to the average street width, the road orientation angle, and the average building height; and $L_{\mathrm{msd}}$ relates to the average building separation, the height from the average building level to the Tx antenna, and a factor associated with diffraction loss. Note that the operating frequency for the Walfisch-Ikegami model ranges from 800 MHz to 2 GHz, whereas most of the 5G frequency bands are outside this range.

ABG Model
The ABG model is a simple extension to the alpha-beta (AB) model used in the 3rd Generation Partnership Project (3GPP) by adding a frequency-dependent parameter [5]. It is also one of the standard 3GPP models and is currently widely used in 5G applications [6]. The ABG model can be expressed as [5]:

$$PL^{\mathrm{ABG}}(f_c, d) = 10\,\alpha \log_{10}(d) + \beta + 10\,\gamma \log_{10}(f_c) \tag{2}$$

where $d$ is the Tx-Rx separation distance in meters, $f_c$ is the carrier frequency in GHz, $\alpha$ is the path loss exponent, $\beta$ is an offset term, and $\gamma$ represents the dependence of path loss on log-frequency. The optimal model parameters are typically obtained by performing linear regression with the measurement data [18].
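As a minimal sketch, the ABG form used in [5] (distance in meters, carrier frequency in GHz) can be evaluated as follows. The function name and the parameter values are hypothetical and chosen only for illustration; they are not fitted values from this study.

```python
import math


def abg_path_loss_db(distance_m, freq_ghz, alpha, beta, gamma):
    """ABG path loss in dB: 10*alpha*log10(d) + beta + 10*gamma*log10(f)."""
    if distance_m <= 0 or freq_ghz <= 0:
        raise ValueError("distance and frequency must be positive")
    return (10.0 * alpha * math.log10(distance_m)
            + beta
            + 10.0 * gamma * math.log10(freq_ghz))


# Hypothetical parameters (illustration only, not regression results).
example_loss_db = abg_path_loss_db(distance_m=100.0, freq_ghz=3.5,
                                   alpha=3.0, beta=20.0, gamma=2.0)
```

At 1 m and 1 GHz both logarithms vanish, so the function returns the offset term alone, which is a quick sanity check on the parameterization.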

CI Model
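The CI free-space reference distance model with a 1 m reference distance is given by [5]:

$$PL^{\mathrm{CI}}(f_c, d) = L_{\mathrm{FS}}(f_c, d_0) + 10\, n \log_{10}\!\left(\frac{d}{d_0}\right) \tag{3}$$

where $n$ is the path loss exponent, $d_0$ is the close-in free-space reference distance, and $L_{\mathrm{FS}}(f_c, d_0)$ is the free-space path loss in decibels at the carrier frequency $f_c$ and at a Tx-Rx separation distance of $d_0$.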
Additionally, the optimal model parameters are typically estimated by performing linear regression with measurement data [18].
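The CI model can be sketched in the same way, assuming the standard Friis expression for the free-space term and the usual CI form with a single path loss exponent. The function names and the symbol `n` for the exponent are illustrative assumptions.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def free_space_path_loss_db(freq_hz, distance_m):
    """Friis free-space path loss in dB for isotropic antennas."""
    return 20.0 * math.log10(
        4.0 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT_M_S)


def ci_path_loss_db(freq_hz, distance_m, n, d0_m=1.0):
    """Close-in model: free-space loss at d0 plus 10*n*log10(d/d0)."""
    return (free_space_path_loss_db(freq_hz, d0_m)
            + 10.0 * n * math.log10(distance_m / d0_m))
```

With `n = 2` the CI model reduces to free-space propagation at every distance, which is why the model is described as physically based.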

Problem Formulation and the Proposed Path Loss Model
In urban environments, the direct connection line between the Tx and Rx is very likely to be blocked by buildings. Figure 2 shows the conceptual overview of the Tx-Rx pair and the path profile along the direct connection line, where h Tx is the Tx height and D is the horizontal distance. The obstruction of buildings results in excessive path loss, in addition to free space propagation. Therefore, this paper incorporates the path profile in path loss modeling.
The proposed model combines LOS and NLOS propagation scenarios. For LOS cases, where the radiation from the Tx is not blocked by buildings, the path loss is determined using the log-distance path loss model (2). For NLOS cases, this paper assumed that the excessive path loss in addition to the free space propagation is caused by the obstruction of buildings. Figure 3 shows the proposed deep-learning-based path loss model. The input data include the Tx height, h Tx , the distance between the Tx-Rx pair, D, and the path profile. Assuming the cell radius is Q, the path profile is represented as a vector of length Q, with a sample distance of 1 m. The values in the profile vector stand for the building height along the path. As a result, the length of the input layer for the neural network is Q + 2. Note that D ≤ Q, given that the Rx is within the coverage of the Tx. In the output layer, the proposed model requires only one neuron, which is used to predict the path loss. In between the input and output layers, three hidden layers, namely, L1, L2, and L3, are used to extract essential features that cause path loss.
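The input layout described above can be sketched as follows. The ordering of the entries (Tx height, distance, then profile) and the zero-padding of the profile beyond the Rx position are assumptions made for illustration; the text only fixes the total length Q + 2.

```python
def build_input_vector(tx_height_m, distance_m, profile_heights_m,
                       cell_radius_m=500):
    """Return a list of length Q + 2: [h_Tx, D, q_1, ..., q_Q].

    profile_heights_m holds building heights sampled every 1 m from the
    Tx towards the Rx. Samples beyond the Rx are zero-padded up to the
    cell radius Q (an assumption of this sketch).
    """
    if len(profile_heights_m) > cell_radius_m:
        raise ValueError("profile is longer than the cell radius")
    padding = [0.0] * (cell_radius_m - len(profile_heights_m))
    profile = [float(h) for h in profile_heights_m]
    return [float(tx_height_m), float(distance_m)] + profile + padding


# A 120 m link with one 20 m tall building between 40 m and 60 m.
example_profile = [0.0] * 40 + [20.0] * 20 + [0.0] * 60
example_input = build_input_vector(30.0, 120.0, example_profile)
```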
Given the kth Tx-Rx pair, the input to the proposed deep neural network is represented as:

$$\mathbf{x}_k = \left[h_{\mathrm{Tx},k},\; D_k,\; q_{1,k},\; q_{2,k},\; \ldots,\; q_{Q,k}\right]^{T} \tag{4}$$

where $q_{i,k}$, for $1 \le i \le Q$, is the building height of the kth path profile. The input should be normalized before being fed into the network because normalization has been shown to accelerate the training process and improve the performance of neural networks [19]. The normalized inputs are denoted as $\tilde{\mathbf{x}}_k$, and $(\theta_1, b_1)$, $(\theta_2, b_2)$, and $(\theta_3, b_3)$ denote the weights to be optimized at the hidden layers $L_1$, $L_2$, and $L_3$, respectively. The dimensions of $\theta_i$ and $b_i$, for $i = 1, 2, 3$, are hyper-parameters and can be denoted as $N_i \times N_{i-1}$ and $N_i \times 1$, respectively. Note that $N_0$ is the length of the input vector, i.e., $Q + 2$, and $N_1$, $N_2$, and $N_3$ are typically tuned manually by trial and error. The features extracted by the three hidden layers can be expressed in vector form as:

$$\mathbf{a}_k = \phi_3\!\left(\theta_3\, \phi_2\!\left(\theta_2\, \phi_1\!\left(\theta_1 \tilde{\mathbf{x}}_k + b_1\right) + b_2\right) + b_3\right) \tag{5}$$

where $\mathbf{a}_k$ is an $N_3 \times 1$ vector, and $\phi_1$, $\phi_2$, and $\phi_3$ are the activation functions for the hidden layers.
In this study, the rectified linear unit (ReLU) activation function was adopted. Subsequently, the extracted features were used to predict the path loss based on non-linear regression. The resulting path loss, $\hat{y}_k$, is given by:

$$\hat{y}_k = \varphi\!\left(\vartheta\, \mathbf{a}_k + b\right) \tag{6}$$

where $\vartheta$ is a row vector, $b$ is a scalar, and $\varphi$ is the sigmoid activation function. Both $\vartheta$ and $b$ are also network weights to be optimized. A training dataset of inputs is used to find the optimal weights in a supervised learning manner. The goal of optimization is to minimize the mean squared error (MSE) between the network output and the ground truth in the training dataset. The MSE function is given by:

$$J = \frac{1}{M} \sum_{k=1}^{M} \left(\hat{y}_k - y_k\right)^2 \tag{7}$$

where $y_k$ is the path loss in the training dataset and $M$ is the total number of records.
Minimizing the above cost function can be performed iteratively with the use of the error backpropagation algorithm [20]. In this study, the well-known Adam optimizer was adopted. In addition, mini-batch training was used to compute the gradient of the cost function with respect to the weights. To combat overfitting, dropout layers were also imposed. Table 1 summarizes the hyper-parameters of the proposed path loss model. The purpose of this study is to predict path loss in urban environments; therefore, the cell radius, Q, was assumed to be 500 m. By trial and error, the numbers of neurons in the hidden layers were selected as N 1 = 502, N 2 = 128 and N 3 = 8.
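The forward pass and cost function of the network described above can be sketched in NumPy as follows. This is an illustration only: the weights are randomly initialized (He-style initialization is an assumption), and the Adam optimizer, mini-batching, and dropout used for training are deliberately omitted.

```python
import numpy as np

# N0 = Q + 2 = 502 inputs, hidden sizes N1, N2, N3, and one output neuron.
LAYER_SIZES = [502, 502, 128, 8, 1]


def init_params(rng, sizes=LAYER_SIZES):
    """Random weights and zero biases for each layer (assumed init)."""
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weight = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_out, n_in))
        bias = np.zeros((n_out, 1))
        params.append((weight, bias))
    return params


def forward(params, x):
    """Map normalized inputs of shape (N0, batch) to outputs (1, batch)."""
    activation = x
    for weight, bias in params[:-1]:
        # Hidden layers L1, L2, L3 use the ReLU activation.
        activation = np.maximum(0.0, weight @ activation + bias)
    weight, bias = params[-1]
    # The output neuron uses a sigmoid, so predictions lie in (0, 1).
    return 1.0 / (1.0 + np.exp(-(weight @ activation + bias)))


def mse(y_pred, y_true):
    """Mean squared error between predictions and targets."""
    return float(np.mean((y_pred - y_true) ** 2))


rng = np.random.default_rng(0)
params = init_params(rng)
x_batch = rng.random((502, 4))  # four synthetic normalized inputs
y_hat = forward(params, x_batch)
```

Because the output passes through a sigmoid, the training targets must be normalized path losses in the unit interval, which matches the normalized output described for the model.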

Analysis of Simulation Results
A software package, SignalPro®, by EDX Wireless, Inc., was used to conduct the simulations. It provides a set of simulation tools for wireless communication systems. To facilitate simulations with site-specific geographic information, SignalPro® supports the import of digital maps. In this study, a 3D map of urban Taipei City was imported into the software package. Figure 4 shows the top view of the simulation environment, which covered an area of 900 × 900 m². The polygons in the figure represent buildings with different heights. The mean and standard deviation of the building heights were 15.7 m and 8.7 m, respectively. The maximum and minimum building heights were 46 m and 4 m, respectively. Seven base stations with omnidirectional antennas were deployed and are designated by encircled crosses (⊕) in Figure 4. The center of the simulation environment was latitude 25.057457° N and longitude 121.534808° E. Readers can use Google Maps or OpenStreetMap to see more details about the region.
Appl. Sci. 2021, 11, x FOR PEER REVIEW
In the simulations, the effective isotropic radiated power and the height of the base stations were set to 50 dBm and 30 m, respectively. Receivers with a height of 1.5 m were placed on the roads. Distances between the Tx and Rx pairs ranged from 10 m to 500 m. The ray-tracing model was applied for the path loss calculations at the 3.5 GHz frequency band. As a result, the received power levels at all the Rx locations were recorded.
At the cost of high computational complexity, the performance of the ray-tracing model was satisfactory when compared to the field measurements [21,22]. In this work, the simulation results using the ray-tracing model were taken as the ground truth for the path loss predictions. In the simulations, 22,198 records were collected, where each record contained the received power and the path profile of the specific Tx-Rx pair. All the records were partitioned into training and test datasets, with an 80/20 split.
Using the test dataset, Figure 5 shows the prediction performance of the proposed path loss model and the conventional models, i.e., the ABG model and the CI model. Note that the Walfisch-Ikegami model is excluded here because the operating frequency of 3.5 GHz is outside its frequency range. In the figure, black dots are the ground-truth values, whereas red dots are the values predicted using the proposed model. The conventional models matched the mean values of path loss well, but the differences between their predictions and the ground-truth values are large.
Defining the prediction error as the difference between the predicted value and the ground-truth value, i.e., $e_i = \hat{y}_i - y_i$, $1 \le i \le M$, Table 2 shows the statistics of the prediction errors. In terms of mean error, all the models showed good performance with only slight differences. In terms of the standard deviation, however, the proposed model outperformed the ABG and the CI models by reductions of about 4.71 dB and 4.84 dB, respectively. Furthermore, Figure 6 shows the cumulative distribution of the absolute error of path loss prediction, $|e_i|$. The proposed model outperformed the conventional models. At the 67th percentile, it reduced the error by about 41.6%, from 12.29 dB to 7.17 dB, when compared to the ABG model, and by about 43.6%, from 12.71 dB to 7.17 dB, when compared to the CI model.
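The error statistics reported here can be computed with a short helper such as the one below. It is a sketch under stated assumptions: the population standard deviation and a nearest-rank percentile are used, since the text does not specify which estimators were applied, and the function names are hypothetical.

```python
import statistics


def error_summary(predicted_db, truth_db, percentile=67):
    """Mean, standard deviation, and a percentile of the absolute error.

    Uses the population standard deviation and the nearest-rank
    percentile (both are assumptions of this sketch). The percentile
    must be an integer between 1 and 100.
    """
    errors = [p - t for p, t in zip(predicted_db, truth_db)]
    abs_errors = sorted(abs(e) for e in errors)
    # Integer ceiling of percentile * n / 100 avoids float rounding.
    rank = max(1, (percentile * len(abs_errors) + 99) // 100)
    return {
        "mean": statistics.fmean(errors),
        "std": statistics.pstdev(errors),
        "abs_percentile": abs_errors[rank - 1],
    }


def relative_reduction_percent(baseline, improved):
    """Percentage by which `improved` is smaller than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# 67th-percentile absolute errors quoted in the text (dB).
reduction_vs_abg = relative_reduction_percent(12.29, 7.17)
reduction_vs_ci = relative_reduction_percent(12.71, 7.17)
```

Applied to the quoted 67th-percentile values, the helper gives reductions that agree with the stated figures to within rounding.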

Explainability of the Proposed Model
To obtain insight into the internal mechanism of the proposed deep-learning-based path loss model, particular inputs were selected to explore the behavior of the deep neural network. Specifically, eight relevant features were considered, as follows:
1. Mean value of building height: the obstruction of buildings results in propagation loss; therefore, this feature is the arithmetic mean of building heights along the profile. Figure 7 shows three selected path profiles with low, medium, and high values of this feature. These profiles were inputted into the network and their corresponding outputs were recorded. The subfigure in the bottom right of Figure 7 shows the relationship between the path losses and the path profiles. The x-axis in the subfigure is the index of the path profile, and the y-axis is the output of the proposed deep learning model, which is a normalized path loss. This subfigure indicates that the mean value of building height is relevant to the outcomes of the proposed path loss model. Note that although the three classes, namely, low, medium, and high, may not be sufficient to derive the exact relationship, e.g., linear, quadratic, or cubic, they are sufficient to verify the relevance between the feature and the path loss.
2. Standard deviation of building height: this feature refers to the standard deviation of building heights along the profile. Figure 8 shows three selected path profiles with zero, medium, and high values of this feature. These three profiles were inputted into the network and their corresponding outputs were recorded. The subfigure in the bottom right of Figure 8 suggests a positive linear correlation between this feature and the predicted path loss.
3. Normalized mean value of building distance: analogously to the two aforementioned features, which take statistics along the vertical axis, this feature and the subsequent one take statistics along the horizontal axis. This feature is the arithmetic mean of the distances from the Tx to the buildings along the profile. The mean value is normalized by the Tx-Rx separation distance. Figure 9 shows three selected path profiles with buildings that are near the Tx, far from the Tx, and in between the Tx-Rx pair. These profiles were inputted into the network and their corresponding outputs were recorded. The subfigure in the bottom right of Figure 9 indicates a non-linear relationship between this feature and the predicted path loss.
4. Normalized standard deviation of building distance: this feature refers to the standard deviation of the distances from the Tx to the buildings along the profile. The standard deviation is normalized by the Tx-Rx separation distance. Figure 10 shows three selected path profiles with concentrated buildings, scattered buildings, and moderately spread buildings. These profiles were inputted into the network and their corresponding outputs were recorded. The subfigure in the bottom right of Figure 10 indicates a non-linear relationship between this feature and the predicted path loss.
5. Building density: in addition to the above two statistics along the horizontal axis, this feature describes the percentage of the profile occupied by buildings. Figure 11 shows three selected path profiles with low, medium, and high values of this feature. These profiles were inputted into the network and their corresponding outputs were recorded. The subfigure in the bottom right of Figure 11 shows that this feature and the predicted path loss have a positive relationship that can be approximated in a linear form.
6. Average building width: this feature refers to the arithmetic mean of building widths along the profile. It also implies the average street width. Figure 12 shows three selected path profiles with narrow buildings, wide buildings, and moderately wide buildings. These profiles were inputted into the network and their corresponding outputs were recorded. The subfigure in the bottom right of Figure 12 indicates a non-linear relationship between this feature and the predicted path loss.
7. Distance to the nearest building from the Rx: in general, the path loss tends to be high when a high building is close to the Rx. This feature refers to the distance from the Rx to the nearest building along the profile. Figure 13 shows three selected path profiles with a building far from the Rx, near the Rx, and at a moderate distance. These profiles were inputted into the network and their corresponding outputs were recorded. The subfigure in the bottom right of Figure 13 indicates that this feature is relevant to the path loss prediction.
8. Height of the nearest building from the Rx: this feature is the height of the nearest building along the profile from the Rx. Figure 14 shows three selected path profiles with low, medium, and high values of this feature. These profiles were inputted into the network and their corresponding outputs were recorded. The subfigure in the bottom right of Figure 14 suggests a positive linear correlation between this feature and the predicted path loss.
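The eight features can be extracted from a sampled path profile roughly as follows. Several details are assumptions made for this sketch and are not specified in the text: a sample counts as a building when its height is positive, height statistics are taken over building samples only, the sample index is used as the distance from the Tx in meters, the Rx sits at the last sample, and population standard deviations are used.

```python
import statistics


def path_profile_features(profile_heights_m):
    """Eight geometric features of a 1 m sampled Tx-to-Rx height profile."""
    n = len(profile_heights_m)
    building_idx = [i for i, h in enumerate(profile_heights_m) if h > 0]
    if not building_idx:
        raise ValueError("profile must contain at least one building sample")
    heights = [profile_heights_m[i] for i in building_idx]

    # Widths of contiguous runs of building samples, in meters.
    widths, run = [], 0
    for h in profile_heights_m:
        if h > 0:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)

    nearest = building_idx[-1]  # building sample closest to the Rx
    return {
        "mean_height": statistics.fmean(heights),
        "std_height": statistics.pstdev(heights),
        "norm_mean_distance": statistics.fmean(building_idx) / n,
        "norm_std_distance": statistics.pstdev(building_idx) / n,
        "density": len(building_idx) / n,
        "mean_width": statistics.fmean(widths),
        "rx_to_nearest_building": (n - 1) - nearest,
        "nearest_building_height": profile_heights_m[nearest],
    }


example_features = path_profile_features(
    [0, 0, 10, 10, 0, 20, 20, 20, 0, 0])
```

In this 10 m toy profile, two buildings (2 m and 3 m wide) cover half of the path, and the one nearest to the Rx is 20 m tall and 2 m away from it.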
To evaluate the quality and appropriateness of the explainability of the proposed deep neural network, the aforementioned features were used to model the path loss based on linear regression. For single-frequency scenarios, the log-frequency term in Equation (2) can be merged into the left-hand side. Here, the coefficient γ in Equation (2) is set to two. The path loss for the kth Tx-Rx pair is given by

$$y_k = 10\,\alpha \log_{10} R_k + \beta + \sum_{i=1}^{8} u_i z_{i,k} + \sum_{i=1}^{8} v_i z_{i,k}^{2} \tag{8}$$

where $y_k$ denotes the path loss with the log-frequency term merged in, $R_k$ is the Tx-Rx separation distance, and $z_{i,k}$, for $1 \le i \le 8$, stands for the eight aforementioned features. $\alpha$, $\beta$, $u_i$, and $v_i$ are model parameters to be optimized. Although some features seem to be linearly correlated with the path loss prediction, their quadratic forms are also considered in Equation (8), because they do no harm to the path loss model.
To obtain the optimal model parameters based on least-squares regression, Equation (8) can be expressed in matrix form as:

$$\mathbf{Y} = \mathbf{Z}\,\mathbf{w} \tag{9}$$

where:

$$\mathbf{Y} = \begin{bmatrix} y_1 & y_2 & \cdots & y_M \end{bmatrix}^{T} \tag{10}$$

$$\mathbf{Z} = \begin{bmatrix} 10\log_{10} R_1 & 1 & z_{1,1} & \cdots & z_{8,1} & z_{1,1}^{2} & \cdots & z_{8,1}^{2} \\ 10\log_{10} R_2 & 1 & z_{1,2} & \cdots & z_{8,2} & z_{1,2}^{2} & \cdots & z_{8,2}^{2} \\ \vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots \\ 10\log_{10} R_M & 1 & z_{1,M} & \cdots & z_{8,M} & z_{1,M}^{2} & \cdots & z_{8,M}^{2} \end{bmatrix} \tag{11}$$

$$\mathbf{w} = \begin{bmatrix} \alpha & \beta & u_1 & \cdots & u_8 & v_1 & \cdots & v_8 \end{bmatrix}^{T} \tag{12}$$

As a result, the optimal parameter vector that minimizes the fitting error is given by [6]:

$$\mathbf{w}^{*} = \left(\mathbf{Z}^{T}\mathbf{Z}\right)^{-1}\mathbf{Z}^{T}\mathbf{Y} \tag{13}$$

Additionally, the predicted path loss can be expressed as:

$$y_k = \mathbf{z}_k\,\mathbf{w}^{*} \tag{14}$$

where $\mathbf{z}_k = \begin{bmatrix} 10\log_{10} R_k & 1 & z_{1,k} & z_{2,k} & \cdots & z_{8,k} & z_{1,k}^{2} & z_{2,k}^{2} & \cdots & z_{8,k}^{2} \end{bmatrix}$.
Figure 15 shows the prediction performance of the proposed deep learning model and the explanatory model. The results are plotted as a time series. The blue curve is the outcome of the deep learning model, whereas the red dots are the values predicted using the explanatory model. To quantify the accuracy of the explanatory model, the accuracy metric is defined as the correlation coefficient between the explanatory model and the proposed deep learning model. Let $\mu_y$ and $\mu_{\hat{y}}$ denote the mean values of the path loss predictions of the explanatory model and the proposed deep learning model, respectively. Additionally, let $\sigma_y$ and $\sigma_{\hat{y}}$ denote the standard deviations of the path loss predictions of the explanatory model and the proposed deep learning model, respectively.
The accuracy metric is defined as:

$$\rho = \frac{E\!\left[\left(y - \mu_y\right)\left(\hat{y} - \mu_{\hat{y}}\right)\right]}{\sigma_y\,\sigma_{\hat{y}}} \tag{15}$$

where $E[\cdot]$ is the expectation operation. As a result, the accuracy reaches 72%. Note that this accuracy metric can also be regarded as the transparency of the proposed deep learning model, which is a black box in nature.
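The least-squares fit and the correlation-based accuracy metric can be sketched in NumPy as follows. All data here are synthetic stand-ins: `y_deep` merely plays the role of the deep learning model output, and the function names are hypothetical. `np.linalg.lstsq` is used in place of the explicit normal-equation solution because it returns the same minimizer for a full-rank design matrix while being numerically more robust.

```python
import numpy as np


def design_matrix(distances_m, features):
    """Rows [10*log10(R_k), 1, z_1..z_8, z_1^2..z_8^2] for each record."""
    distances_m = np.asarray(distances_m, dtype=float)
    features = np.asarray(features, dtype=float)
    return np.column_stack([
        10.0 * np.log10(distances_m),
        np.ones_like(distances_m),
        features,
        features ** 2,
    ])


def fit_least_squares(Z, y):
    """Parameter vector minimizing the squared fitting error ||Z w - y||."""
    w, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return w


def correlation_accuracy(y_explanatory, y_deep):
    """Pearson correlation between the two sets of predictions."""
    return float(np.corrcoef(y_explanatory, y_deep)[0, 1])


# Synthetic records (illustration only, not data from the study).
rng = np.random.default_rng(1)
num_records = 200
distances = rng.uniform(10.0, 500.0, size=num_records)
features = rng.random((num_records, 8))
Z = design_matrix(distances, features)
w_true = rng.normal(size=Z.shape[1])
y_deep = Z @ w_true + rng.normal(scale=0.5, size=num_records)

w_fit = fit_least_squares(Z, y_deep)
y_explanatory = Z @ w_fit
rho = correlation_accuracy(y_explanatory, y_deep)
```

The design matrix has 18 columns: the log-distance term, the constant term, the eight features, and their eight squares.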

Verifying Performance using 5G NR Measurement Data
The proposed deep-learning-based path loss model was further applied to a real non-standalone 5G NR system in the urban environment of Taipei City. As shown in Figure 16, the measurement environment was located at latitude 25.044773° N and longitude 121.539456° E, and covered an area of 1 × 1 km². Nine base stations operating in the 2.1 GHz frequency band were included in the measurement area. Measurement equipment was placed in a vehicle driving along selected routes. The measurement reports include the reference signal received power (RSRP), E-UTRAN cell identity (ECI), and the vehicle's GNSS coordinates. Then, the path losses between the Tx-Rx pairs were calculated according to the link budgets.
There were 16,359 records collected during the measurement tasks. Figure 17 shows the prediction performance of the proposed path loss model and the conventional models. The black dots are the measurements, whereas the red dots are the values predicted using the proposed model. The blue and the green curves represent the predictions using the ABG and CI models, respectively. Table 3 shows the statistics of prediction error resulting from the measurements.
The conventional models and the proposed model show good matches in terms of the mean prediction error. In terms of the standard deviation, however, the proposed model outperformed the ABG model and the CI model, reducing it by about 3.41 dB and 5.99 dB, respectively. Furthermore, Figure 18 shows the cumulative distribution of the absolute error of path loss prediction. The proposed model outperformed the conventional models. At the 67th percentile, it reduced the error by about 32.1%, from 9.30 dB to 6.31 dB, when compared to the ABG model, and by about 42.3%, from 10.95 dB to 6.31 dB, when compared to the CI model. Additionally, the explanatory model was used to evaluate the explainability of the proposed deep learning model in the realistic 5G environment. The measurement data show that the accuracy ρ reached 63%.
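The link-budget step that converts a reported RSRP into a path loss can be sketched as follows. The text does not list the link-budget terms, so the per-resource-element transmit power, the antenna gains, and the function name are all assumptions of this sketch, and the numbers are hypothetical.

```python
def path_loss_from_rsrp_db(tx_power_per_re_dbm, tx_gain_dbi,
                           rx_gain_dbi, rsrp_dbm):
    """Path loss in dB from a simple link budget (assumed terms).

    RSRP is a per-resource-element power, so the transmit power is
    also expressed per resource element here.
    """
    return tx_power_per_re_dbm + tx_gain_dbi + rx_gain_dbi - rsrp_dbm


# Hypothetical values for illustration only.
example_path_loss_db = path_loss_from_rsrp_db(
    tx_power_per_re_dbm=18.0, tx_gain_dbi=15.0,
    rx_gain_dbi=0.0, rsrp_dbm=-90.0)
```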

Figure 18. The cumulative distribution of the absolute error of path loss prediction in simulations.

Conclusions
By applying deep learning methodology, this paper has presented a path loss model based on a profile along the direct propagation path between Tx-Rx pairs in urban environments for 5G cellular communication systems. The neural network of the proposed model extracts important features and then predicts the path loss based upon them. Simulation results showed that the proposed model outperforms conventional models. The standard deviation of prediction error was reduced by 34% when compared to the conventional models. In addition to pursuing the prediction performance empowered by the nonlinear mapping of the deep learning method, this paper has also explored the internal mechanism of the neural network. Eight explainable features were tested and used to predict the path loss, based on a linear regression approach. The simulation results showed that the accuracy of the explainable model reached 72%. Furthermore, the proposed deep learning model was also evaluated in a non-standalone 5G NR network in urban Taipei City. The measurements show that the standard deviation of prediction error was reduced by 30-43% when compared to the conventional models, and that the accuracy of the explanatory model reached 63% in the realistic 5G network.