Accurate Path Loss Prediction Using a Neural Network Ensemble Method

Path loss is one of the most important factors affecting base-station positioning in cellular networks. Traditionally, to determine the optimal installation position of a base station, path loss measurements are conducted through numerous field tests, which are time-consuming. To address this problem, we propose a machine learning (ML)-based method for path loss prediction. Specifically, a neural network ensemble learning technique is applied to enhance the accuracy and performance of path loss prediction: an ensemble of neural networks is constructed by selecting the top-ranked networks based on the results of hyperparameter optimization. The performance of the proposed method was compared with that of various ML-based methods on a public dataset. The simulation results show that the proposed method clearly outperformed state-of-the-art methods and can accurately predict path loss.


Introduction
A cellular network typically comprises multiple base stations, with each mobile station measuring the received signal strength indicator from its neighboring base stations and transmitting this information to the base stations via radio signals [1][2][3]. Path loss is a phenomenon in which the strength of a radio signal between a base station and a mobile station decreases as it propagates through space. Predicting path loss is crucial in base-station positioning because mobile stations require a minimum received signal power level to successfully decode the data received from the base station [4][5][6].
Recently, several path loss models have been proposed. Generally, these models can be classified into two groups: empirical and deterministic. Empirical models are based on measurements obtained within a given frequency range in a specific propagation environment. These models offer statistical descriptions of how path loss is related to propagation parameters, including the transmission frequency, the distance between antennas, and the antenna height. For example, the log-distance path loss model employs a path loss exponent, empirically determined from measurements, to define the rate at which the received signal strength diminishes with the distance between a base station and a mobile station [7]. Additionally, a Gaussian random variable with a mean of zero is used in the model to represent the attenuation attributed to shadow fading. This model is commonly used as a fundamental reference for predicting indoor path loss. Several empirical models, including the Egli [8], Hata [9], Longley-Rice [10], and Okumura [11] models, have been developed based on measurements. The 3rd Generation Partnership Project (3GPP) is a collaborative initiative aimed at developing global standards for mobile communication technologies; it is responsible for specifying technologies such as Long-Term Evolution (LTE) and New Radio (NR) for mobile broadband communication. The standards set by the 3GPP are continuously evolving to meet the demands of the industry and users. Organized as Releases, these standards introduce new features, enhancements, and optimizations, with each release incorporating hundreds of individual technical specification (TS) and technical report (TR) documents. Some TR documents focus on empirical models and modeling methods. For instance, in TR 38.901 [12], the 3GPP introduced its three-dimensional (3-D) stochastic channel model for 5G mmWave massive multiple-input and multiple-output (MIMO) communications, spanning the range of 0.5-100 GHz. This comprehensive model includes a detailed procedure for generating link-level channel models, catering to a broad spectrum of carrier frequencies. The 3GPP employs various scenario settings, including Indoor Factory (InF), Indoor Hotspot (InH), Rural Macro (RMa), Urban Macro (UMa), and Urban Micro (UMi). Additionally, each scenario is accompanied by a comprehensive set of parameters covering intersite distance, path loss computation, large- and small-scale parameters, etc. [13,14]. Empirical models are simple and tractable and require few parameters. However, because their parameters fit the measurements obtained in a specific propagation environment, these models do not always yield high prediction accuracies when applied to diverse environments.
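The log-distance model described above can be sketched in a few lines. The parameter values below (reference distance, path loss at the reference distance, path loss exponent, and shadowing standard deviation) are illustrative assumptions, not values fitted to any particular environment:

```python
import numpy as np

def log_distance_path_loss(d, d0=1.0, pl_d0=40.0, n=3.0, sigma=4.0, rng=None):
    """Log-distance path loss with log-normal shadowing (parameters illustrative).

    d      distance(s) between transmitter and receiver, same unit as d0
    d0     reference distance
    pl_d0  path loss measured at d0, in dB (assumed value)
    n      path loss exponent fitted from measurements (assumed value)
    sigma  std. dev. of the zero-mean Gaussian shadow-fading term, in dB
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    d = np.asarray(d, dtype=float)
    shadowing = rng.normal(0.0, sigma, size=d.shape)  # X_sigma ~ N(0, sigma^2)
    # PL(d) = PL(d0) + 10 * n * log10(d / d0) + X_sigma  (all terms in dB)
    return pl_d0 + 10.0 * n * np.log10(d / d0) + shadowing
```

With sigma set to zero, the model reduces to the deterministic log-distance trend; the Gaussian term models the shadow fading mentioned above.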
Deterministic models, in contrast, are based on electromagnetic theory. These models offer precise path loss values at any given position using ray-tracing and finite-difference time-domain methods [15][16][17]. However, they require detailed geometric data, such as a two-dimensional (2-D) or 3-D map of a specific region and the dielectric properties of obstacles, to predict path loss. Additionally, because such data are generally large in volume, handling them can degrade computational efficiency and extend computation time. Furthermore, if the propagation environment changes, a time-consuming computational procedure must be repeated.
Recently, machine learning (ML) has received considerable attention as a powerful tool in various fields, such as computer vision [18][19][20][21][22][23][24], natural language processing [25][26][27], and wireless communication [28][29][30][31][32]. Generally, ML approaches can be divided into three categories: supervised, unsupervised, and reinforcement learning. In supervised learning, data pairs (x, y) are given, where x represents the input data of an ML model and y represents a label; supervised learning enables an ML model to learn a general rule that maps x to y. In unsupervised learning, only x is provided to the ML model, not y; unsupervised learning enables the model to discover hidden structures or patterns in x. In reinforcement learning, an agent learns to make decisions by interacting with an environment: the agent observes the outcomes of its actions and adjusts its strategy to improve its performance over time. The agent learns to associate states with actions that lead to higher rewards and, through exploration, discovers optimal policies for achieving its goals. Reinforcement learning has been used for power control of base stations, scheduling, load balancing, and many more applications in wireless networks [33][34][35][36][37].
Recently, numerous supervised-learning algorithms have been introduced. These algorithms can be categorized into classification and regression algorithms based on the type of y. When y can take on values in a finite set, classification algorithms, whose outputs are constrained to this finite set, are employed. Conversely, if y can take any real value within a range, regression algorithms, whose outputs are not limited to a finite set, are used. Since path loss can be represented as a real value, path loss prediction can be regarded as a regression problem. Consequently, some studies on path loss prediction have proposed applying regression algorithms, such as the support vector machine (SVM), k-nearest neighbors (k-NN), random forest (RF), and artificial neural network (ANN), to predict path loss values. For example, the authors of [38,39] experimentally showed that ML-based models could predict path loss more accurately than empirical models and were more computationally efficient than deterministic models. Based on these results, many researchers have focused on ML-based models as potential substitutes for conventional empirical and deterministic models. Motivated by these findings, we investigated an ML-based method to enhance the accuracy of path loss prediction. The main contributions of this study are summarized as follows:

• A neural network ensemble model capable of accurately predicting path loss is proposed. In the proposed model, multiple ANNs are trained with different hyperparameters, including the number of hidden layers, the number of neurons in each hidden layer, and the type of activation function, thereby enhancing the diversity among the integrated ANNs. The final prediction results of the model are obtained by integrating the prediction results from the ANNs.

• The entire process of predicting path loss using the proposed method is presented. The dataset splitting, feature scaling, and hyperparameter optimization processes are detailed. Based on the results of the hyperparameter optimization process, the top-ranking ANNs can be determined. These results and the pseudocode for the proposed method simplify re-implementation.

• The proposed neural network ensemble model was quantitatively evaluated on a public dataset. Additionally, for benchmarking, nine ML-based path loss prediction methods were tested: SVM, k-NN, RF, decision tree, multiple linear regression, Least Absolute Shrinkage and Selection Operator (LASSO), ridge regression, Elastic Net, and ANN.
The remainder of this paper is organized as follows. Section 2 reviews related studies, and Section 3 describes the proposed method for path loss prediction. Section 4 details the experimental setup, including the evaluation metrics and the implementation of the benchmark methods, and Section 5 presents and discusses the results. Finally, Section 6 provides the concluding remarks.

Related Work

Non-ANN-Based Path Loss Prediction
As previously mentioned, many different regression algorithms can predict path loss. Based on prior research, these algorithms can be classified into two groups: non-ANN-based and ANN-based. In a study on path loss prediction using a non-ANN-based approach, a path loss equation consisting of several constants and simple functions was proposed [40]. Additionally, using a genetic algorithm, the authors identified constants and functions that fit the measurements well.
Instead of using a genetic algorithm, some researchers have used an SVM for path loss prediction [41][42][43][44][45][46][47]. The authors of [41] proposed using an SVM with a radial basis function (RBF) kernel as a tool for predicting path loss. Generally, several types of kernels can be used in SVMs, and the performance and effectiveness of an SVM depend on the kernel type. In [42], the authors compared the performance of three SVMs with different kernels: polynomial, Gaussian, and Laplacian. Their results revealed that the SVM with a Laplacian kernel outperformed the other two. Moreover, the authors compared the three SVMs with two empirical models (the Hata and Ericsson 9999 models); all three SVMs outperformed the empirical models. Motivated by the results of [41,42], the authors of [43][44][45][46][47] also used an SVM for path loss prediction.
The authors of [46][47][48] used k-NN, a well-known regression algorithm, for path loss prediction. In recent studies on non-ANN-based path loss prediction, ensemble methods, which use multiple ML models to achieve better performance, have garnered significant attention owing to their promising results. Based on empirical evidence, ensemble methods generally achieve better performance when their constituent models are significantly diverse [49,50]. Consequently, several ensemble methods aim to enhance the diversity among their constituent models [51,52]. To this end, the authors of [53] proposed an ensemble method that averaged the results from five different regression algorithms: k-NN, SVM, RF, AdaBoost, and gradient boosting. RF itself is a widely used ensemble method that constructs numerous decision trees during the training phase and, during the testing phase, outputs the average prediction of the individual trees [54,55]. This method has been employed for path loss prediction [45][46][47][48].
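An averaging ensemble of diverse regressors, in the spirit of the method in [53], can be sketched with scikit-learn's VotingRegressor. The synthetic data and model settings below are illustrative stand-ins, not the configuration used in any of the cited studies:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

# Synthetic stand-in for path loss data (six features, continuous target)
X, y = make_regression(n_samples=300, n_features=6, noise=5.0, random_state=0)

# VotingRegressor averages the predictions of its diverse base regressors
ensemble = VotingRegressor([
    ("knn", KNeighborsRegressor(n_neighbors=5)),
    ("svm", SVR(kernel="rbf")),
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
])
ensemble.fit(X, y)
pred = ensemble.predict(X[:5])  # mean of the three per-model predictions
```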

ANN-Based Path Loss Prediction
Instead of using non-ANN-based approaches, several researchers have explored ANN-based methods for path loss prediction. In these approaches, hyperparameter tuning is crucial because it directly affects the performance and generalization ability of an ANN. Hyperparameters are configuration settings that are external to a model and cannot be learned from data. Unlike the weights and biases of an ANN, which are learned during the training phase, hyperparameters must be set prior to training. The hyperparameters of an ANN include the learning rate, batch size, number of hidden layers, number of neurons in each layer, activation functions, dropout rates, and regularization strength.
To determine the optimal configuration that maximizes ANN performance, many researchers have conducted hyperparameter tuning in their studies. For example, the authors of [38,56,57] explored the relationship between ANN performance and the number of layers. Their results revealed that adding depth to an ANN by increasing the number of layers enables accurate path loss prediction. The authors of [58] examined the performance of an ANN by varying the number of neurons in a hidden layer while keeping the number of hidden layers constant at one; according to their results, increasing the number of neurons improved the path loss prediction performance of the ANN. In [59], the authors proposed a differential evolution algorithm to determine the optimal number of neurons in each layer of an ANN for path loss prediction.
Generally, activation functions are used to introduce nonlinearity into ANNs. The choice of activation function significantly impacts the performance and generalization ability of an ANN. Several types of activation functions have been developed and used in ANNs. The RBF is a widely used activation function in path loss prediction; for example, the authors of [58,60,61] proposed an RBF neural network (RBF-NN), in which the RBF is used as the activation function. In [62], the authors proposed a wavelet neural network for field-strength prediction, using a wavelet function as the activation function. According to their results, the prediction performance of the wavelet neural network exceeded that of the RBF-NN. Other activation functions, such as the hyperbolic tangent (tanh) [63][64][65][66][67][68][69][70] and sigmoid functions [71][72][73][74][75], have also been used in ANNs for path loss prediction.
Recently, several ANN variants, including the convolutional neural network (CNN), have been widely used for path loss prediction. A CNN typically consists of input, hidden, and output layers, with the hidden layers comprising one or more convolutional layers. In a convolutional layer, several convolution kernels (filters) can be used, and the dot product of each convolution kernel with the input matrix of the layer is computed to generate feature maps. A rectified linear unit (ReLU) is commonly used as the activation function in convolutional layers; activation maps are obtained by applying the ReLU to the feature maps, and these activation maps become the inputs to the next layer. Generally, a convolutional layer is followed by a pooling layer, which reduces the dimensions of the data by combining the outputs of neuron clusters at one layer into a single neuron in the next layer. Through convolutional and pooling layers, CNNs can detect and extract meaningful features from images. Consequently, CNNs are commonly used to solve computer vision tasks, such as image classification and image recognition.
Owing to the promising results obtained using CNNs in computer vision tasks, CNN-based methods have emerged in studies on path loss prediction. For example, the authors of [56] proposed a CNN-based method to predict the path loss exponent from a 3-D building map; two 2-D images obtained from the 3-D building map were utilized. One image was created by mapping the height of each building to an integer value within the range of 0-255, and the other was generated by mapping the difference between the height of the transmitter above sea level and the height of the ground above sea level to an integer value within the same range. The two images were stacked to form a 3-D tensor, which was used as the input to the CNN. The CNN was trained using synthetic data generated with a ray-tracing tool to predict the path loss exponent.
Recently, some popular CNN architectures proposed for computer vision tasks have been applied to path loss prediction. For example, the authors of [76] utilized AlexNet, proposed in [77], as the base model for path loss prediction. The model input consisted of a 3-D tensor constructed by stacking three 2-D matrices containing information about the height of structures and buildings, the distance from the transmitter, and the distance from the receiver. Another study by the same authors also employed AlexNet as the base model [78]; in this study, the 3-D tensor was augmented with a 2-D matrix containing information about the angle formed by the line between the transmitter and receiver. The Visual Geometry Group neural network (VGGNet) [79] is another well-known CNN architecture. It can be categorized into several architectures according to the number of convolutional layers; among them, the VGG-16 and VGG-19 architectures are typically used because they perform better than the other VGGNet architectures. In [80], the authors utilized the VGG-16 architecture to predict the path loss distribution from 2-D satellite images. Motivated by the idea presented in [80], the authors of [81] employed the VGG-16 architecture as the backbone to predict the path loss exponent and shadowing factor from 2-D satellite images. The residual neural network (ResNet) [82] is also a widely used CNN architecture; it was used in a similar study to predict the path loss exponent and shadowing factor from 2-D satellite images [83] and in another study to predict the path loss from 2-D satellite images [84].

Overall Process
This section details the working of the proposed path loss prediction method; Figure 1 shows a schematic overview of its process, which is divided into three phases: (1) dataset splitting and feature scaling, (2) model building and hyperparameter optimization, and (3) applying the ensemble model and performing path loss prediction. In the first phase, dataset splitting is conducted on the prepared dataset, producing training, validation, and test sets. Subsequently, feature scaling is applied to enhance the performance of the ANNs. In the second phase, ANNs are built with different hyperparameter configurations; the hyperparameters include the number of hidden layers, the number of neurons in each hidden layer, and the type of activation function. During the hyperparameter optimization process, each ANN is trained and evaluated, and the results are recorded. In the final phase, the top-ranked ANNs are selected based on the evaluation results, and the final model is constructed as an ensemble of the selected ANNs. Path loss prediction is then conducted on the test set using the final model.

Dataset Preparation
The dataset proposed by the authors of [85] was used in this study. To collect path loss data, the authors conducted a drive-test measurement campaign at Covenant University, Ota, Ogun State, Nigeria. Measurements were performed along three different routes; during each measurement, the mobile station was moved away from one of the three base stations. The authors recorded terrain profile information, including longitude (f_1), latitude (f_2), elevation (f_3), altitude (f_4), clutter height (f_5), and distance between the transmitter and receiver (f_6), along with the path loss data. Across the three routes, 937, 1229, and 1450 samples were collected; thus, the dataset contains 3616 samples, comprising six features and path loss values as labels. In this study, we aimed to obtain a generalized neural network ensemble model rather than a site-specific model; to this end, all 3616 samples were used without further division.

Dataset Splitting and Feature Scaling
In this study, the dataset was randomly shuffled and then split into training, validation, and test sets. The training set was used to train the model. If a model is evaluated on the same data on which it was trained, it may perform well on that specific dataset but fail to generalize to new data (overfitting); the validation set helps detect and prevent this issue. Moreover, the validation set allows tuning of the hyperparameters without introducing bias from the test set. The test set provides an unbiased evaluation of the final performance of the model, indicating how well it would perform on new real-world data. Generally, separating data into training, validation, and test sets helps ensure that ML models are robust, generalize well to new data, and perform reliably in real-world scenarios. More specifically, in our study, 60% of the 3616 samples were allocated to the training set, and 20% each were assigned to the validation and test sets.
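The 60/20/20 split described above can be sketched with scikit-learn's train_test_split, applied twice; the arrays below are random placeholders for the actual features and path loss labels:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(3616, 6))            # placeholder for the six features
y = rng.uniform(80.0, 160.0, size=3616)   # placeholder path loss labels in dB

# First carve off 60% for training; shuffling is on by default
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=0.6, random_state=42)
# Split the remaining 40% evenly into validation and test sets (20% each)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=42)
```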
Table 1 presents the descriptive statistics of the training dataset. As indicated in the table, the scales of the six features differ. Generally, if features are on different scales, ML algorithms may assign greater importance to features with larger magnitudes. Additionally, these algorithms can be sensitive to the scale of the input features, which affects their performance. To mitigate these issues, the features were standardized by removing the mean and scaling to unit variance. Let x_j = [f_{1,j}, f_{2,j}, f_{3,j}, f_{4,j}, f_{5,j}, f_{6,j}] be the jth sample in the dataset, where f_{1,j}, f_{2,j}, f_{3,j}, f_{4,j}, f_{5,j}, and f_{6,j} are the corresponding feature values of the jth sample. Then, the standard score z_{i,j} of each feature value in x_j is calculated as

z_{i,j} = (f_{i,j} - f̄_i) / σ_i, (1)

where f̄_i is the mean of f_i over the training samples, and σ_i is the standard deviation of f_i over the training samples.
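The standardization in Equation (1) corresponds to scikit-learn's StandardScaler, with the means and standard deviations computed on the training set only and then reused for the other sets; the data below are synthetic placeholders:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
loc = [3.0, 6.0, 50.0, 60.0, 5.0, 800.0]    # illustrative feature means
scale = [0.1, 0.1, 10.0, 10.0, 2.0, 400.0]  # illustrative feature scales
X_train = rng.normal(loc=loc, scale=scale, size=(100, 6))
X_test = rng.normal(loc=loc, scale=scale, size=(50, 6))

# Fit on the training set only: f_bar_i and sigma_i come from training samples
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)  # same means/stds; no test-set leakage
```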

Hyperparameter Optimization
Our proposed model consists of multiple ANNs, each of which can have multiple fully connected layers as hidden layers; in a fully connected layer, every input neuron is connected to every output neuron, a configuration commonly used in ANNs. To construct the optimal ensemble structure, hyperparameter optimization was executed. The considered hyperparameters included the number of hidden layers, the number of neurons in each hidden layer, and the type of activation function. Throughout this process, the training dataset was used to train each ANN. Early stopping was applied to prevent training an ANN for an excessive number of epochs, which could lead to overfitting; the validation dataset was used to detect and prevent overfitting. For each hyperparameter configuration, the mean squared error (MSE) of the ANN was computed on the validation dataset.
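The grid search over the number of hidden layers, neurons per layer, and activation functions can be sketched as follows. The original model was presumably built with a deep learning framework; here, scikit-learn's MLPRegressor serves as a stand-in, with a reduced grid and synthetic data so the sketch stays small:

```python
import itertools
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=400, n_features=6, noise=3.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25,
                                                  random_state=0)

results = []  # (validation MSE, (layers, neurons, activation))
for m, n, act in itertools.product([1, 2], [8, 16], ["relu", "tanh"]):
    ann = MLPRegressor(hidden_layer_sizes=(n,) * m, activation=act,
                       early_stopping=True,  # stop when validation score stalls
                       max_iter=500, random_state=0)
    ann.fit(X_train, y_train)
    mse = mean_squared_error(y_val, ann.predict(X_val))
    results.append((mse, (m, n, act)))

results.sort(key=lambda r: r[0])  # rank configurations by validation MSE
```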
For clarity, let M be the number of hidden layers in the ANN and N be the number of neurons in each hidden layer; Figure 2 shows a heat map of the MSE values based on M and N.

Ensemble of Artificial Neural Networks
As illustrated in Figure 1, the proposed method involves a neural network ensemble model composed of multiple ANNs. To enhance the diversity among the integrated ANNs, the top T ANNs were selected based on the results of the hyperparameter optimization. For simplicity, hereafter, the selected ANNs are referred to as ANN_1, ANN_2, ANN_3, · · · , ANN_{T−1}, and ANN_T. The pseudocode for the proposed neural network ensemble method is presented in Algorithm 1. As shown in the pseudocode, the given dataset was split into training, validation, and test sets, and feature scaling was applied to each set using Equation (1). Subsequently, various ANNs were constructed with different hyperparameter configurations. Each ANN was trained on the training set and evaluated on the validation set, and the MSE results were recorded. The top-ranked ANNs were selected based on their MSE results, and the final model was constructed as an ensemble of the selected ANNs.

The key concept behind the proposed neural network ensemble model was to train multiple ANNs with different hyperparameter configurations and to aggregate their predictions. Through this process, the model became more robust and less prone to overfitting; the ensemble nature of the model helped improve its generalization and predictive performance. During the prediction phase, each ANN in the ensemble independently predicted the output for the input data, and the predictions from all ANNs were then aggregated to produce the final prediction. In this study, the final output of the neural network ensemble model was the average of the predictions made by the individual ANNs. For clarity, let ŷ_r be the path loss value predicted by ANN_r. Then, the predicted path loss values from the T ANNs can be represented as the vector

Ŷ = [ŷ_1, ŷ_2, · · · , ŷ_T]. (2)

To derive the final prediction result from Ŷ in Equation (2), the predictions of the T ANNs were averaged.
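The aggregation step can be sketched as follows: each selected network predicts independently, the predictions are stacked into Ŷ, and the final output is their mean. The three small networks below are illustrative stand-ins for the top-ranked ANNs, trained on synthetic data:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=300, n_features=6, noise=3.0, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=1)

# Stand-ins for the selected top-T ANNs (T = 3, different hyperparameters)
configs = [(8,), (16,), (8, 8)]
anns = [MLPRegressor(hidden_layer_sizes=c, max_iter=500,
                     random_state=0).fit(X_train, y_train) for c in configs]

# Rows of Y_hat are the per-network predictions (the vector in Equation (2))
Y_hat = np.stack([ann.predict(X_test) for ann in anns])  # shape (T, n_test)
y_final = Y_hat.mean(axis=0)  # ensemble output: average over the T networks
```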

Evaluation Metrics
Generally, using multiple metrics in the performance evaluation of algorithms provides a more comprehensive and nuanced understanding of their performance; relying on a single metric may result in an incomplete or biased assessment. Therefore, for our performance evaluation, we utilized various metrics, including the MSE, root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), mean squared logarithmic error (MSLE), root mean squared logarithmic error (RMSLE), and coefficient of determination. For clarity, let y_j be the actual path loss value for the jth sample x_j in the dataset, ȳ_j be the predicted path loss value for x_j, and S be the total number of samples.

For benchmarking, nine ML-based path loss prediction methods were implemented: (1) SVM-based, (2) k-NN-based, (3) RF-based, (4) decision tree (DT)-based, (5) multiple linear regression-based, (6) LASSO-based, (7) ridge regression-based, (8) Elastic Net-based, and (9) ANN-based methods. To achieve this, the scikit-learn ML library for Python was utilized. The optimal hyperparameter configuration for each model was determined using the HalvingGridSearchCV class. The nine methods are detailed below.
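All of the listed metrics are available in scikit-learn (RMSE and RMSLE as square roots of MSE and MSLE); the path loss values below are illustrative, not measurements from the dataset:

```python
import numpy as np
from sklearn.metrics import (mean_absolute_error,
                             mean_absolute_percentage_error,
                             mean_squared_error, mean_squared_log_error,
                             r2_score)

y_true = np.array([120.0, 118.5, 131.2, 125.7])  # illustrative path loss in dB
y_pred = np.array([121.1, 117.9, 130.0, 126.3])  # illustrative predictions

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_true, y_pred)
mape = mean_absolute_percentage_error(y_true, y_pred)
msle = mean_squared_log_error(y_true, y_pred)  # requires non-negative values
rmsle = np.sqrt(msle)
r2 = r2_score(y_true, y_pred)                  # coefficient of determination
```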

SVM-Based Path Loss Prediction Method
An SVM was employed for path loss prediction in [41][42][43][44][45][46][47]. In our study, an SVM was implemented using the SVR class in the scikit-learn library. The SVR class is an implementation of epsilon-support vector regression and includes various hyperparameters, such as the kernel type, kernel coefficient, and regularization parameter. Table 3 presents the optimal hyperparameter combination for the SVM, as determined through the hyperparameter optimization process.

k-NN-Based Path Loss Prediction Method

The k-NN algorithm was employed for path loss prediction in [46][47][48]. It was implemented using the KNeighborsRegressor class in the scikit-learn library. This class includes various hyperparameters, such as the number of neighbors, the type of weight function used in the prediction, and the type of metric used for distance computation. Table 4 presents the optimal hyperparameter combination for k-NN, as determined through the hyperparameter optimization process.
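A HalvingGridSearchCV-based search like the one used for the benchmarks can be sketched as follows; note that the class must currently be enabled via an experimental import, and the parameter grid and data below are illustrative assumptions:

```python
from sklearn.experimental import enable_halving_search_cv  # noqa: F401, required
from sklearn.model_selection import HalvingGridSearchCV
from sklearn.datasets import make_regression
from sklearn.svm import SVR

X, y = make_regression(n_samples=300, n_features=6, noise=5.0, random_state=0)

# Illustrative grid over the SVR hyperparameters mentioned above
param_grid = {"kernel": ["rbf", "linear"], "C": [1.0, 10.0],
              "epsilon": [0.1, 0.5]}
search = HalvingGridSearchCV(SVR(), param_grid,
                             scoring="neg_mean_squared_error", random_state=0)
search.fit(X, y)
best_svr = search.best_estimator_  # refit on all data with the best params
```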

RF-Based Path Loss Prediction Method
The RF technique was employed for path loss prediction in [45][46][47][48]. In our study, it was implemented using the RandomForestRegressor class in the scikit-learn library. This class includes various hyperparameters, such as the number of decision trees in the model, the type of function used to measure the quality of a split, and the maximum depth of the trees. Table 5 presents the optimal hyperparameter combination for RF, as determined through the hyperparameter optimization process.

DT-Based Path Loss Prediction Method

A DT predicts a continuous value by recursively partitioning the data based on the input features, creating a tree structure in which each leaf node contains the predicted value for instances that follow the path to that leaf. It can also be employed for path loss prediction. In our study, the DT was implemented using the DecisionTreeRegressor class in the scikit-learn library. This class includes various hyperparameters, such as the type of function used to measure the quality of a split, the strategy used to choose the split at each node, and the maximum depth of the tree. Table 6 presents the optimal hyperparameter combination for the DT, as determined through the hyperparameter optimization process.

MLR-Based Path Loss Prediction Method

Multiple linear regression (MLR) is an extension of simple linear regression that models the relationship between a dependent variable and multiple independent variables. In simple linear regression, there is only one independent variable, whereas in MLR, there are two or more. The coefficients of the independent variables and the y-intercept are estimated from the training samples using methods such as the least-squares method, which minimizes the MSE. MLR is widely used in various fields to predict outcomes, understand the relationships between variables, and determine the strength and significance of these relationships. In our study, MLR was implemented using the LinearRegression class in the scikit-learn library and employed as a benchmark method. Table 7 presents the optimal hyperparameter combination for MLR, as determined through the hyperparameter optimization process.

LASSO-Based Path Loss Prediction Method

LASSO is a regularization technique used in linear regression to prevent overfitting and encourage simpler models. Linear regression is performed to determine the coefficients of the independent variables that best fit the observed data; LASSO adds a penalty term to the traditional linear regression objective function. This penalty term, denoted by L1, is proportional to the absolute values of the coefficients. For brevity, the linear regression model trained with the L1 penalty is referred to as LASSO. LASSO was implemented using the Lasso class in the scikit-learn library and employed as a benchmark method. Table 8 presents the optimal hyperparameter combination for LASSO, as determined through the hyperparameter optimization process.
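The effect of the L1 penalty described above can be demonstrated directly: on data where only some features are informative, Lasso drives several coefficients exactly to zero, whereas ordinary least squares keeps them all nonzero. The data and alpha value are illustrative:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression

# Six features, only three of which actually influence the target
X, y = make_regression(n_samples=200, n_features=6, n_informative=3,
                       noise=1.0, random_state=0)

ols = LinearRegression().fit(X, y)
lasso = Lasso(alpha=5.0).fit(X, y)  # alpha scales the L1 penalty term

# The L1 penalty zeroes out small coefficients, yielding a sparser model
n_zero_ols = int(np.sum(ols.coef_ == 0.0))
n_zero_lasso = int(np.sum(lasso.coef_ == 0.0))
```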

Results and Discussion
Table 12 lists the performance metrics of ensemble models with different numbers of ANNs. Clearly, the performance of the ensemble model increased as the number of ANNs increased; however, when the number of ANNs exceeded 20, the performance did not improve further. Based on these results, the optimal value of T was set to 20, and an ensemble model consisting of 20 ANN_r models (r ∈ {1, 2, · · · , 19, 20}) was selected as the final model.

Table 13 compares the performance of the proposed method with that of the benchmark methods. Clearly, the proposed ensemble model performed the best across all evaluation metrics. This was because the ensemble model comprised the top-ranked ANNs selected based on the MSE results, which enhanced the diversity among the integrated ANNs and enabled the model to achieve robust and accurate path loss prediction. Among the benchmark methods, the k-NN-based method achieved the best performance, whereas the ANN-based method with the hyperparameters described in [38] achieved the worst. The MAE of the proposed method was approximately 1.2753, about 1.223 lower than that of the k-NN-based method (2.4983). The results in Table 13 reveal that the proposed method can predict path loss accurately.

Figure 3 shows the measured and predicted path loss for the three survey routes. In the figure, the measured path loss values are plotted against the corresponding distance; to achieve this, the test set was split according to the survey route and then sorted by the distance between the transmitter and receiver. On all survey routes, the receiver encountered non-line-of-sight (NLoS) conditions attributed to obstructions such as buildings and trees. As the figure shows, the predictions of the proposed method align closely with the measured data, consistent with the performance reported in Table 13.

Conclusions
In this study, we proposed a novel ML-based method for path loss prediction. Our approach leveraged the power of neural network ensemble learning and provided a robust and accurate prediction model. By constructing an ensemble of neural networks and selecting the top-ranked networks based on a hyperparameter optimization process, the method achieved state-of-the-art performance in path loss prediction, as evidenced by rigorous validation on a publicly available dataset. Furthermore, we comprehensively compared its performance with that of various ML-based methods, and the simulation results demonstrated the superior performance of the proposed method. Future research may explore fine-tuning the model, considering additional parameters, and expanding the dataset to ensure the generalizability of the proposed method across diverse scenarios.

Figure 1. Overall working of the proposed method for path loss prediction.
In the experiments shown in Figure 2a, ReLU was used as the activation function in each hidden layer, and the ANN with M = 8 and N = 10 achieved the minimum MSE. In Figure 2b, the sigmoid function was used as the activation function, and the ANN with M = 3 and N = 12 achieved the minimum MSE. In Figure 2c, the hyperbolic tangent function was used, and the ANN with M = 1 and N = 22 achieved the minimum MSE. The results in Figure 2b,c also show that the MSE of the ANNs never decreased below 79 for M ≥ 4.

Figure 2. Heat map of MSE values based on M and N: (a) ReLU, (b) sigmoid, and (c) tanh.

Algorithm 1. Pseudocode for the proposed neural network ensemble method.

Input: Dataset D
Output: Final ensemble model E
1: Split D into training, validation, and test sets (D_training, D_validation, and D_test, respectively)
2: Determine f̄_i and σ_i from D_training
3: Apply feature scaling to D_training, D_validation, and D_test using Equation (1)
4: Set T, M, and N
5: Set the maximum number of epochs (max_epochs)
6: Set the number of training samples in a batch (batch_size)
7: Create an early stopping callback (es_cb)
8: Create an empty list H
9: for m = 0 to M do
10:     for n = 1 to N do
11:         for activation in {"ReLU", "sigmoid", "tanh"} do
12:             Build an ANN with m hidden layers, n neurons per hidden layer, and activation
13:             Train the ANN on D_training using max_epochs, batch_size, and es_cb
14:             Evaluate the ANN on D_validation and append its MSE and configuration to H
15:         end for
16:     end for
17: end for
18: Select the top T ANNs from H based on their MSE values
19: Construct the ensemble model E from the selected T ANNs

Figure 3. Measured and predicted path loss against distance along three survey routes: (a) Survey Route X, (b) Survey Route Y, and (c) Survey Route Z.

Table 1. Descriptive statistics of the training dataset.

Table 2 lists the hyperparameter configurations and MSEs of the 20 highest-ranking ANNs.

Table 2. Hyperparameter configuration and MSE for the 20 highest-ranked ANNs.

Table 5. Hyperparameter optimization results for RF.

Table 12. Performance comparison between ensemble models with different numbers of ANNs.

Table 13. Performance comparison between the proposed and benchmark methods.