Machine Learning Enabled Performance Prediction Model for Massive-MIMO HetNet System

To support upcoming novel applications, fifth generation (5G) and beyond 5G (B5G) wireless networks are being driven to deploy ultra-dense networks with ultra-high spectral efficiency by combining heterogeneous network (HetNet) solutions with massive Multiple Input Multiple Output (MIMO). As the deployment of massive MIMO HetNet systems involves a high capital expenditure, network service providers need a precise performance analysis before investment. The performance of such networks is limited by the presence of inter-cell and inter-tier interference. The conventional analytic approach to modeling the performance of such networks is not trivial, as the performance is a stochastic function of many network parameters. This paper proposes a machine learning (ML) approach to predict the network performance of a massive MIMO HetNet system in a multi-cell scenario. It considers a two-tier network in which the base stations of each tier are equipped with massive MIMO systems operating in a sub-6 GHz band. The coverage probability (CP) and area spectral efficiency (ASE) are taken as the network performance metrics that quantify the reliability and achievable rate of the network, respectively. An ML model is inferred to predict the numerical values of these performance metrics for an arbitrary network configuration. Such a model could be valuable in the practical deployment of future networks.


Introduction
To support upcoming novel applications, such as IoT, self-driving cars, Industry 4.0, smart healthcare systems, and AR/VR services, fifth generation (5G) and beyond fifth generation (B5G) wireless networks aim to achieve ultra-low latency and ultra-reliability with a multi-gigabit transmission rate [1]. The combination of heterogeneous network (HetNet) solutions and massive MIMO promises significant improvement in the physical layer performance by deploying an ultra-dense network with an ultra-high spectral efficiency [2,3]. Massive MIMO is the next generation MIMO system that significantly enhances the spectral efficiency of the communication link compared with its conventional counterpart. With this technology, a few hundred antennas are deployed at the base station (BS) to serve a few tens of active users in the same time-frequency grid [4,5]. Massive MIMO is considered a key technology for upcoming wireless generations to support 100× data rates per user and per cell by implementing adaptive beamforming and spatial multiplexing technologies with large antenna arrays [6]. HetNet is a network densification technique in which several classes of low-powered transmitters are deployed alongside the existing macro cells, sharing the same spectrum. The low-powered small-/pico-cells are deployed to target highly concentrated user groups. The network densification process significantly improves the coverage penetration and area spectral efficiency of a network with non-uniformly distributed users [7]. By optimizing resource utilization and network performance, HetNets are a principal candidate for the implementation of 5G and B5G networks [8].
To achieve customer satisfaction, network service providers continuously upgrade their networks to maximize the coverage probability and achievable rate for the end users. The deployment of massive MIMO HetNet systems for the upcoming network generation needs precise planning and high capital expenditure. Hence, before deployment, network service providers must optimize the network parameters in order to meet the desired goal, which requires precise performance analysis. The analysis and estimation of massive MIMO system performance before practical deployment has been a major research concern for the past few years. An asymptotic analysis of the coverage probability and sum-rate of a single-cell massive MIMO system has been presented in the literature [9]. The performance of single-cell downlink massive MIMO in terms of spectral efficiency and link reliability using various precoding techniques has been analyzed and compared by the authors of [10]. Gao et al. have proposed a performance evaluation technique for a massive MIMO system based on propagation data [11]. Feng et al. have modeled the user-interference power distribution in a single-cell multi-user massive MIMO system using a Gamma function, and formulated asymptotic deterministic equivalences for sum-rate and outage probability in terms of a tight-form approximation of the model [12].
The performance of a massive MIMO system in a multi-cell scenario is limited because of the presence of inter-cell interference. A typical user in a multi-cell scenario associated with a given BS receives signals from other BSs as interference. Liang et al. have done a statistical analysis of the interference present in a massive MIMO system, considering inter-cell interference as the dominant component [13]. Li et al. have derived a large-scale approximation of the downlink signal power to interference plus noise power ratio (SINR) in a multi-cell massive MIMO system with their proposed MMSE precoder [14]. Adhikary et al. have done a tractable analysis of the interference in uplink for a large-scale antenna system [15]. The closed form outage probability of a typical user has been derived in terms of BS density and the maximum number of users served by a BS for a multi-cell massive MIMO system, assuming the BSs are distributed randomly following a Poisson point process (PPP) [16].
Similarly, in HetNet, the network performance is limited by inter-tier interference. The interference experienced by a typical user in a k-tier HetNet has been accurately modeled by the authors of [17], where each tier differs by transmitting power and cell density. Closed-form approximation has been made for the coverage probability, traffic off-loading, and sum-rate considering inter-tier interference. The analysis of HetNet with a multiple antenna system is relatively complex because of the random matrix channel. The expressions for success probability and area spectral efficiency have been formulated for MIMO heterogeneous cellular networks using a Toeplitz matrix representation [18]. With proper analysis of the interference and its cancellation, the association of massive MIMO with HetNet is a promising physical layer solution for next generation wireless networks with several-fold increases in area spectral efficiency (ASE) and coverage probability (CP) [19]. The detailed system model and performance analysis of massive MIMO with HetNet is discussed in the literature [20] and the references therein. The impact of different network settings on the virtual coverage areas for massive MIMO-enabled HetNet has been explicitly studied in the literature [20]. With a theoretic framework for massive MIMO HetNets, tractable expressions have been obtained [21] for evaluating the average achievable rate and ASE. The performance of massive MIMO HetNet systems with limited channel information is discussed in the literature [22]. A wireless backhaul-based downlink rate enhancement technique for multi-antenna HetNet is presented in the literature [23]. A tractable approach is developed for evaluation of the spectral and energy efficiency in a massive MIMO-enabled three-tier network considering the presence of eavesdroppers [24].
The performance of massive MIMO-enabled HetNet in a multi-cell scenario is limited by the presence of both inter-cell and inter-tier interference. Therefore, the conventional analytic approach to modeling the statistical behavior of the overall interference is not trivial, as it needs to include many stochastic network parameters. To overcome this challenge, this paper proposes an ML-based approach to model the massive MIMO-enabled HetNet system for predicting the network performance. To the best of the authors' knowledge, such an approach to predicting the performance of these networks has not yet been considered in the literature.
Sensors 2021, 21, 800
The main focus of the work is to predict the performance of a two-tier network in which the base stations of each tier are equipped with massive MIMO systems operating in a sub-6 GHz band. This paper assumes perfect channel state information (CSI) at the transmitter. The analysis for imperfect/outdated CSI is beyond the scope of the work and may be considered in future work. In this work, the coverage probability (CP) and area spectral efficiency (ASE) are considered to be the network performance metrics that quantify the reliability and achievable rate of the network, respectively. Both metrics are stochastically related to the network parameters. Two separate supervised ML models are inferred to predict the numerical values of CP and ASE from a given set of network parameters of an arbitrary network configuration. The rest of the paper is organized as follows: the system model for a multi-cell massive MIMO HetNet system is given in Section 2, the ML-enabled performance prediction model is presented in Section 3, and Section 4 concludes the paper.

Network Topology
A HetNet system deployed in a bounded area, $A \subset \mathbb{R}^2$, with randomly placed BSs forming $K$ tiers is considered. Each tier is distinguished by a unique transmit power, density (the average number of BSs per unit area), and antenna configuration, and all tiers serve the same coverage area in the same frequency band. The spatial locations of the base stations of each tier are represented by a stochastic two-dimensional point process. For tractable analysis, the spatial locations of the BSs of each tier are modeled with an independent Poisson point process (PPP), $\Phi_k(A) \sim \mathrm{PPP}(\lambda_k)$, $k = 1, 2, \ldots, K$, where $\lambda_k$ is the density of the $k$-th tier; this model captures the worst-case scenario [25]. In the given bounded area, $L_k$ BSs of the $k$-th tier are deployed with a uniform transmit power, $P_k$. This work considers $K = 2$: macro-base stations (MBSs) act as umbrella cells, under which many low-powered pico-base stations (PBSs) are deployed in the vicinity of user hotspots in order to provide uniform service. The BSs of each tier are equipped with massive MIMO transmission systems capable of serving multiple users in the same time-frequency grid through intra-cell SDMA. Each BS of the $k$-th tier is equipped with $N_k$ transmit antennas, serving $R_k$ single-antenna active users in the same time-frequency grid ($R_k < N_k$). The schematic representation of the given model is presented in Figure 1.
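As a minimal sketch of the topology described above, the following Python snippet draws one realization of two independent homogeneous PPPs on a square region. The macro density value, the function name `sample_ppp`, and the random seed are illustrative assumptions; only the 10 × 10 km region and the relation between the pico and macro densities follow the simulation setup reported later in the paper.

```python
import numpy as np

def sample_ppp(density, side_km, rng):
    """Draw one realization of a homogeneous PPP on a side_km x side_km square.

    The number of points is Poisson with mean density * area; given that
    count, the points are i.i.d. uniform over the square.
    """
    n_points = rng.poisson(density * side_km ** 2)
    return rng.uniform(0.0, side_km, size=(n_points, 2))

rng = np.random.default_rng(seed=0)   # fixed seed, for reproducibility only
side_km = 10.0                        # 10 x 10 km^2 region
lambda_macro = 0.5                    # macro BSs per km^2 (hypothetical value)
lambda_pico = 3.0 * lambda_macro      # pico density is three times the macro density

macro_bs = sample_ppp(lambda_macro, side_km, rng)   # (L_macro, 2) coordinates
pico_bs = sample_ppp(lambda_pico, side_km, rng)     # (L_pico, 2) coordinates
print(len(macro_bs), "macro BSs and", len(pico_bs), "pico BSs")
```

Because the two calls draw from the same generator in sequence, the two tiers are statistically independent, which is the property the PPP model relies on.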

Channel Model
In this work, both tiers operate in a sub-6 GHz band. An Orthogonal Frequency Division Multiplexing (OFDM) communication system is assumed, resulting in a flat-fading channel for each sub-carrier; the channel between the $i$-th antenna in the BS of the $m$-th cell of the $k$-th tier and the $j$-th user associated with the $n$-th cell of the $l$-th tier has been modeled with a single-tap channel coefficient [15].
The subscript $(mi)k$ and superscript $(nj)l$ represent the indices of the transmitting antenna at the BS and of the user, respectively. $\gamma_{(m)k} \in \mathbb{R}^{+}$ is the combination of the path attenuation, which depends on the distance between the user and the BS, and the shadowing effect due to the propagation environment. Its value is modeled as $10\log_{10}(\gamma_m^n) = (-127.8 - 35\log d_m^n + X_{\sigma^2})$ dB, where $d_m^n$ is the Euclidean distance between the corresponding $n$-th user and the $m$-th BS, and $X_{\sigma^2}$ is a log-normally distributed random variable with zero mean and $\sigma^2$ dB variance [26]. $h_{(mi)k}^{(nj)l} \in \mathbb{C}$ is a random variable drawn from an independent Rayleigh distribution, $h_{(mi)k}^{(nj)l} \sim ℵ(0, 1)$. Considering the block-fading channel, the expected value of the channel coefficient within the coherence time is given by
The channel coefficient vector between the BS of the $m$-th cell in the $k$-th tier and the $j$-th user of the $n$-th cell in the $l$-th tier is given in Equation (2).
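To make the channel model concrete, the sketch below samples the large-scale gain from the stated path-loss expression and combines it with unit-variance complex Gaussian fading. Several details are assumptions made only for illustration: the distance is taken in km with a base-10 logarithm, the shadowing standard deviation is set to 8 dB, the array holds 64 antennas, and the single-tap coefficient is formed as the square root of the large-scale gain times the fading term.

```python
import numpy as np

def large_scale_gain(d_km, sigma_db, rng):
    """Linear-scale gain from the path-loss expression plus shadowing (in dB)."""
    shadowing_db = rng.normal(0.0, sigma_db, size=np.shape(d_km))
    gain_db = -127.8 - 35.0 * np.log10(d_km) + shadowing_db
    return 10.0 ** (gain_db / 10.0)

def rayleigh_coefficients(shape, rng):
    """Unit-variance complex Gaussian taps, whose envelope is Rayleigh distributed."""
    real = rng.normal(size=shape)
    imag = rng.normal(size=shape)
    return (real + 1j * imag) / np.sqrt(2.0)

rng = np.random.default_rng(seed=1)
d_km = np.array([0.2, 0.5, 1.0])                        # user-BS distances (hypothetical)
gamma = large_scale_gain(d_km, sigma_db=8.0, rng=rng)   # sigma_db is an assumed value
h = rayleigh_coefficients((3, 64), rng)                 # 64 antennas per BS (assumed)
g = np.sqrt(gamma)[:, None] * h                         # assumed form: sqrt(gamma) * h
print(g.shape)
```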

Interference Analysis
Using a matched filter beamformer, the transmit vector of the BS of the $m$-th cell in the $k$-th tier is as follows:
Here, $s_{(mj)k}$ is the information symbol intended for the $j$-th user in the same cell. The signal received at the $j$-th associated user of the $n$-th cell in the $l$-th tier is as follows:
Here, $z$ is the additive Gaussian noise, $z \sim ℵ(0, \sigma_z^2)$.

The expression of inter-cell interference is as follows:
Putting Equations (3) and (8) in Equation (7),
The intra-cell interference is as follows:
Hence, the received signal,
The first term of the r.h.s. of Equation (11) is the weighted value of the intended signal. The signal power to interference plus noise power ratio (SINR) experienced at the receiver is given by the following:
Here, $I_{\mathrm{total}} = I_{\mathrm{intra\text{-}cell}} + I_{\mathrm{inter\text{-}cell}} + I_{\mathrm{inter\text{-}tier}}$.
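The way the three interference components enter the SINR can be illustrated with a short numerical sketch. The function name `sinr_db` and all power values below are hypothetical; the snippet only shows the ratio of the intended signal power to the total interference plus noise power, expressed in dB.

```python
import math

def sinr_db(signal_w, intra_cell_w, inter_cell_w, inter_tier_w, noise_w):
    """SINR in dB, with every argument a received power in watts."""
    total_interference_w = intra_cell_w + inter_cell_w + inter_tier_w
    return 10.0 * math.log10(signal_w / (total_interference_w + noise_w))

# Hypothetical received powers, chosen only to give a round result.
value_db = sinr_db(
    signal_w=1e-9,
    intra_cell_w=2e-11,
    inter_cell_w=5e-11,
    inter_tier_w=2e-11,
    noise_w=1e-11,
)
print(f"SINR is approximately {value_db:.1f} dB")
```

With these values the interference plus noise power is one tenth of the signal power, so the SINR is about 10 dB, which equals the coverage threshold used later in the simulations.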

Performance Metrics
In this work, two fundamental performance metrics are considered as evaluation parameters of the network performance [27].

Coverage Probability (CP)
CP is a measure of the reliability of a typical transmission link, and is defined as the probability that a typical mobile user is able to achieve some threshold SINR (Th), given by the following:
$$\mathrm{CP} = \mathbb{P}(\mathrm{SINR} > \mathrm{Th}) \qquad (13)$$
To run any given application successfully, the user needs an SINR above a minimum value. If the SINR experienced by a user drops below this minimum, customer satisfaction is compromised. Hence, a higher value of CP implies a better quality of experience (QoE).

Area Spectral Efficiency (ASE)
ASE is a measure of the spectral reuse efficiency of the network and is defined as the sum of the average data rates per unit bandwidth, normalized by the total service area (bits/s/Hz/km$^2$), as given by Equation (14). A higher value of ASE implies a higher achievable sum rate for the network, which allows a greater number of users to be served by the network.
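Both metrics can be estimated empirically from simulated SINR samples. The sketch below is one possible way to do so and is not taken from the paper: the function names are hypothetical, and the per-user data rate per unit bandwidth is assumed to follow the Shannon expression $\log_2(1 + \mathrm{SINR})$.

```python
import math

def coverage_probability(sinr_db_samples, threshold_db):
    """Fraction of SINR samples (in dB) that exceed the threshold."""
    covered = sum(1 for s in sinr_db_samples if s > threshold_db)
    return covered / len(sinr_db_samples)

def area_spectral_efficiency(sinr_db_per_user, area_km2):
    """Sum of per-user average rates (bits/s/Hz) divided by the service area."""
    total_rate = 0.0
    for samples in sinr_db_per_user:
        # Assumed Shannon rate per unit bandwidth for each SINR sample.
        rates = [math.log2(1.0 + 10.0 ** (s / 10.0)) for s in samples]
        total_rate += sum(rates) / len(rates)
    return total_rate / area_km2

# Hypothetical SINR samples in dB for a single test user.
samples_db = [12.0, 8.0, 15.0, 3.0]
print(coverage_probability(samples_db, threshold_db=10.0))     # two of four samples exceed 10 dB
print(area_spectral_efficiency([samples_db], area_km2=100.0))
```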

Performance Prediction Model
Before deploying a complex HetNet with each tier supporting massive MIMO, network providers are interested in predicting the overall network performance. The analysis provided in the previous section shows that the network performance metrics are functions of various network parameters, such as the number of transmitting antennas at the BSs, the number of active users associated per cell, and the transmit power of each tier. However, the numerical values of the metrics are stochastic variables because of the stochastic network topology and user location. Moreover, for a given topology, the instantaneous performance of the network is also a stochastic process that depends on the stochastic behavior of the multipath channel fading coefficients.
In this work, a two-tier network ($K = 2$) consisting of macrocells and picocells is considered. In the given scenario, the numerical values of the network performance metrics (PMs), either CP or ASE, are considered to be stochastically influenced by the number of antennas at the BSs of both tiers, i.e., $N_{macro}$ and $N_{pico}$; the number of active users served in the same time-frequency grid in both tiers, i.e., $R_{macro}$ and $R_{pico}$; and the difference in the transmit powers of the tiers in dB, i.e., $P_{TD} = 10\log_{10}(P_{macro}/P_{pico})$. Thus, the performance metrics are given by the following:
$$\mathrm{PM} = f(N_{macro}, R_{macro}, N_{pico}, R_{pico}, P_{TD}) + \epsilon$$
Here, $f(\cdot)$ represents an unknown stochastic function and $\epsilon$ represents random noise terms that capture the contributions of the unknown parameters that influence the performance metrics. The objective of the present work is to infer supervised learning models that approximate the unknown stochastic functions in order to predict the numerical values of the network performance metrics (PMs). The process is implemented with the steps listed below.

Step 1: Data Preparation
In order to accurately predict the performance metrics for an arbitrary network configuration, the supervised learning model requires a quality dataset, that is, a collection of instances with input attributes and a labeled output. For the given configuration, the input attributes are $N_{macro}$, $R_{macro}$, $N_{pico}$, $R_{pico}$, and $P_{TD}$, and the output is the PM (either CP or ASE). The dataset is created by running a realistic simulation of a massive MIMO HetNet system with various combinations of network parameters. For a given set of network parameters (input attributes), the simulation is carried out to evaluate the numerical values of the performance metrics given in Equations (13) and (14). The values of the network parameters and the simulated result of the performance metrics constitute a single instance of the dataset.
In each round of simulation with new input attributes, a two-tier cellular network ($K = 2$) was simulated, comprising macro-cells and pico-cells, where the BSs of each tier were massive MIMO enabled. The threshold (Th) of the SINR value in the CP calculation was taken as 10 dB. To capture the stochastic nature of the network topology, each network configuration was simulated 200 times and the results were averaged. Each time, the cellular network was deployed on a torus of 10 × 10 km$^2$. To capture the worst-case scenario, the spatial locations of the BSs were modeled using two independent PPPs, with the node densities of the two layers related as $\lambda_{pico} = 3\lambda_{macro}$. The topology of a typical two-tier simulated network, in which the spatial locations of the BSs are modeled with two independent PPPs, is shown in Figure 2.
For every network topology, the test user was considered to be static and located randomly in the torus. The user was associated with the single BS, of either tier, that offered the highest average received signal strength; this BS acted as the serving BS. The signals from all remaining base stations were treated as interference. The stochastic channel fading was captured by simulating each network topology for 1000 coherence time periods with uncorrelated channel coefficients and taking the expected values of the outputs. The simulation was carried out for 1450 combinations of the network parameters. The ranges of values of all parameters are given in Table 1.
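The association rule described above can be sketched as follows. This is an illustrative simplification, not the authors' simulator: the function name `associate_user`, the BS coordinates, and the transmit powers are hypothetical, the average received strength is computed from the path-loss expression without shadowing, and distances are assumed to be in km and measured with wrap-around on the torus.

```python
import numpy as np

def associate_user(user_xy, bs_xy, bs_tx_power_w, side_km=10.0):
    """Pick the BS with the highest average received power for one user.

    Returns the index of the serving BS and the average received power
    from every BS (in watts).
    """
    delta = np.abs(bs_xy - user_xy)
    delta = np.minimum(delta, side_km - delta)         # wrap-around distance on the torus
    d_km = np.maximum(np.hypot(delta[:, 0], delta[:, 1]), 1e-3)  # avoid log10(0)
    gain = 10.0 ** ((-127.8 - 35.0 * np.log10(d_km)) / 10.0)
    rx_power_w = bs_tx_power_w * gain
    return int(np.argmax(rx_power_w)), rx_power_w

# Two macro BSs and one pico BS (hypothetical positions and transmit powers).
bs_xy = np.array([[1.0, 1.0], [5.0, 5.0], [2.0, 1.0]])
bs_tx_power_w = np.array([40.0, 40.0, 1.0])
user_xy = np.array([2.1, 1.0])

serving, rx_power_w = associate_user(user_xy, bs_xy, bs_tx_power_w)
interference_w = rx_power_w.sum() - rx_power_w[serving]   # all other BSs interfere
print("serving BS index:", serving)
```

In this example the nearby pico BS serves the user despite its lower transmit power, which is the hotspot behavior that the two-tier deployment is intended to provide.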

Parameters
Range of Values: 50-200, 10-40, 8-20, 4-8, 5-20 dB

The outputs of the simulation were coverage probability (CP) and area spectral efficiency (ASE) in bits/s/Hz/km$^2$. The output of each combination was recorded as a single instance in the dataset. The prepared dataset was used for model learning and testing.

Step 2: Hypothesis Testing and Model Selection
In the prepared dataset, the initial five columns ($N_{macro}$, $N_{pico}$, $R_{macro}$, $R_{pico}$, $P_{TD}$), as shown in Table 2, represent the input attributes, and the remaining two are the model target variables (CP and ASE). The dataset was analyzed to find the best-fit hypothesis for the problem, i.e., the statistical relation between the input attributes and the target variables. The null hypothesis for the given problem was set as, "The input attributes are not linearly related with the target". The null hypothesis was tested using correlation analysis. Correlation is a measure of the strength and direction of the relationship between variables and ranges from −1 to 1. The null hypothesis was retained if the correlation coefficient was close to zero, indicating no linear relation between the variables. Table 3 shows the correlation of the input attributes with the targets. As the correlation coefficients were not close to zero for any of the inputs, there was sufficient evidence to reject the null hypothesis and to suggest that the PMs (either CP or ASE) showed a linear relationship with the network parameters. Hence, two multivariate linear regression models were inferred to approximate the relations of the network performance metrics (PMs) with the network parameters, so that the network performance can be predicted for an arbitrary set of network parameter values outside the training set. The hypothesis that approximates the unknown stochastic function is given by the following [28]:
$$\widehat{\mathrm{PM}} = \boldsymbol{\beta}^{T}\mathbf{m}$$
Here, $\mathbf{m}$ is the independent input attribute vector given by $[1, N_{macro}, R_{macro}, N_{pico}, R_{pico}, P_{TD}]^{T}$ and $\boldsymbol{\beta} = [\beta_0, \beta_1, \ldots, \beta_5]^{T}$ is the tunable regression parameter vector.
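The correlation analysis can be sketched with NumPy as shown below. Since the simulated dataset is not reproduced here, the snippet uses a synthetic stand-in: the attribute matrix, the target, and the coefficients that generate it are hypothetical and serve only to show how the per-attribute Pearson coefficients would be computed.

```python
import numpy as np

def attribute_target_correlations(X, y):
    """Pearson correlation of each column of X with the target vector y."""
    return np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])

rng = np.random.default_rng(seed=2)
X = rng.uniform(0.0, 1.0, size=(200, 5))            # synthetic, normalized attributes
weights = np.array([0.5, -0.3, 0.2, -0.1, 0.4])     # hypothetical linear dependence
y = X @ weights + rng.normal(0.0, 0.05, size=200)   # synthetic target with noise

r = attribute_target_correlations(X, y)
print(np.round(r, 2))   # coefficients far from zero argue against the null hypothesis
```

A coefficient close to zero for some attribute would indicate that a linear term in that attribute contributes little, although it would not rule out a non-linear dependence.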

Step 3: Training for Best-Fit Model
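The model was trained using the dataset with $Q$ i.i.d. instances, given by $\{(\mathbf{m}^{(q)}, \mathrm{PM}^{(q)})\}_{q=1}^{Q}$, where the $q$-th instance is denoted by the superscript $(\cdot)^{(q)}$.
The loss function for optimization is based on the difference between the simulated output and the hypothesis evaluation [20], as follows:
The optimal solution corresponds to the values of the regression parameters that minimize the loss function: $\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}} J(\boldsymbol{\beta} \mid \mathbf{m})$.
The optimum solution is obtained using the gradient-descent algorithm, which starts with a random $\boldsymbol{\beta}$ and updates it with the following iterative process:
$$\boldsymbol{\beta} \leftarrow \boldsymbol{\beta} - \alpha \nabla_{\boldsymbol{\beta}} J(\boldsymbol{\beta} \mid \mathbf{m})$$
Here, $\alpha$ is the step size of the algorithm, taken as 0.01. On convergence, CP and ASE are estimated using Equations (20) and (21), respectively:
$$\widehat{\mathrm{CP}} = \hat{\boldsymbol{\beta}}_{CP}^{T}\mathbf{m} \qquad (20)$$
$$\widehat{\mathrm{ASE}} = \hat{\boldsymbol{\beta}}_{ASE}^{T}\mathbf{m} \qquad (21)$$
Here, $\hat{\boldsymbol{\beta}}_{CP}$ and $\hat{\boldsymbol{\beta}}_{ASE}$ are the regression parameter vectors after convergence for CP and ASE, respectively.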
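The training step can be sketched as batch gradient descent on a linear hypothesis. The snippet below is a sketch under stated assumptions rather than the authors' implementation: the loss is assumed to be the mean squared error scaled by one half, the attributes are standardized so that a step size of 0.01 converges in a moderate number of iterations, and the data and the "true" coefficients are synthetic.

```python
import numpy as np

def fit_linear_gd(M, y, alpha=0.01, n_iter=5000, seed=0):
    """Gradient descent on the assumed loss J(beta) = (1/2Q) * sum((M @ beta - y)**2)."""
    rng = np.random.default_rng(seed)
    beta = rng.normal(size=M.shape[1])            # random starting point
    Q = M.shape[0]
    for _ in range(n_iter):
        residual = M @ beta - y                   # hypothesis minus simulated output
        beta -= alpha * (M.T @ residual) / Q      # beta <- beta - alpha * grad J
    return beta

rng = np.random.default_rng(seed=3)
X_raw = rng.uniform(0.0, 1.0, size=(200, 5))                  # synthetic attributes
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)          # standardize each column
M = np.column_stack([np.ones(len(X)), X])                     # prepend the constant 1
true_beta = np.array([0.6, 0.1, -0.05, 0.03, -0.02, 0.08])    # hypothetical values
y = M @ true_beta                                             # noise-free synthetic target

beta_hat = fit_linear_gd(M, y)
print(np.round(beta_hat, 4))
```

Without standardization, attributes on very different scales (for example, antenna counts and power differences in dB) would make a fixed step size either unstable or very slow, which is why the sketch rescales the columns first.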

Step 4: Model Validation
The acceptability of the hypothesized relationships between variables was evaluated based on the residual errors. For model validation, the dataset was split randomly into two sets: the first set consisted of 80% of the instances, which were used for training the hypothesis, and the second set contained the remaining 20% of the instances, which were used for validating the hypothesis. The generalization ability of the trained regression models was measured using the percentage of error margin and the coefficient of determination, i.e., the $R^2$ value. To ensure the stability of the trained models, k-fold cross validation was implemented, in which the model was evaluated k times, each time with a unique subset held out for validation. In this work, k = 5 was used. Table 4 provides the five-fold validation results for both models. The consistency of the evaluation results across folds indicates that the models are stable.
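The five-fold procedure can be sketched as follows. For brevity, the sketch fits each fold with a closed-form least-squares solve (`np.linalg.lstsq`) instead of gradient descent, which is a substitution made here and not the method described in the paper; the dataset is again synthetic and the helper names are hypothetical.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def k_fold_r2(M, y, k=5, seed=0):
    """Return the validation R^2 of a linear model for each of k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)   # k disjoint index sets
    scores = []
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        beta, *_ = np.linalg.lstsq(M[train_idx], y[train_idx], rcond=None)
        scores.append(r2_score(y[val_idx], M[val_idx] @ beta))
    return scores

rng = np.random.default_rng(seed=4)
X = rng.uniform(0.0, 1.0, size=(200, 5))                     # synthetic attributes
M = np.column_stack([np.ones(len(X)), X])
coeffs = np.array([0.5, 0.3, -0.2, 0.1, -0.1, 0.2])          # hypothetical values
y = M @ coeffs + rng.normal(0.0, 0.05, size=200)             # synthetic noisy target

scores = k_fold_r2(M, y, k=5)
print([round(float(s), 3) for s in scores])
```

With k = 5, each fold trains on 80% of the instances and validates on the remaining 20%, matching the split described above; similar scores across the folds are what the text interprets as model stability.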

Step 5: Final Model Preparation with Complete Dataset
The final model was prepared to predict the probable output for new data. The models were finalized by training with the complete available dataset and were saved for later operational use. In Table 5, the parameter values of the finalized models are tabulated. To visually inspect the performance of the regression process, the actual values vs. predicted values of the finalized models were plotted. Figures 3 and 4 show the actual values vs. the values predicted by the finalized model for ASE and CP, respectively, where it can be seen that the scatter points follow the trend line. The performance of the finalized models has been quantitatively evaluated based on the residuals over the predicted output of the dataset, and the results are given in Table 6.

Conclusions
This paper introduced an ML approach to predict the performance of a massive MIMO HetNet system in a multi-cell scenario. The performance metrics considered in this paper are CP and ASE, which are stochastic functions of the network parameters. Two separate multivariate linear regression models have been trained for the network performance metrics, with the network parameters as the input attributes. The generalization ability of the trained models has been evaluated numerically based on the percentage of the error margin and the $R^2$ score. The error margins for CP and ASE are found to be 12.06% and 11.32%, respectively, which are within the tolerable range for practical application. The $R^2$ scores for CP and ASE are 0.754 and 0.749, respectively, i.e., the linear models explain roughly 75% of the variance in each metric, which is considered satisfactorily high. The visual performance evaluation is done using actual vs. predicted output plots. For both models, the scatter points follow the respective trend lines, indicating a good fit. During practical deployments of 5G and B5G networks, the application of this model could be very valuable in the precise planning of network and capital expenditures. This model would help the network provider to estimate the quality of service for a given network configuration. The study of other parameters that influence the network performance, and their inclusion in the model, may be considered as a future direction of research.
Author Contributions: S.B., concept and setup preparation, design of system model, text and plot preparation, and review; S.R.S., methodology creation, performance prediction model selection, analysis and simulations supervision, and text editing; V.P., overview of model validation, final model preparation, data preparation, text editing, and review. All authors have read and agreed to the published version of the manuscript.
Funding: The APC was funded by the Bulgarian Science Fund research project KP06-N27/3.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
No new data were created or analyzed in this study. Data sharing is not applicable to this article.