Improved Gaussian-Bernoulli Restricted Boltzmann Machines for UAV-Ground Communication Systems

Unmanned aerial vehicles (UAVs) are steadily gaining ground as a promising technology for next-generation communication systems owing to their appealing features, such as wide coverage at high altitude, on-demand low-cost deployment, and fast response. UAV communications are fundamentally different from conventional terrestrial and satellite communications owing to the high mobility and the unique channel characteristics of air-ground links. However, obtaining effective channel state information (CSI) is challenging because of the dynamic propagation environment and variable transmission delay. In this paper, a deep learning (DL)-based CSI prediction framework is proposed to address the channel aging problem by extracting the most discriminative features from the UAV wireless signals. Specifically, we develop a procedure that uses multiple Gaussian-Bernoulli restricted Boltzmann machines (GBRBMs) for dimension reduction and pre-training, incorporated with an autoencoder-based deep neural network (DNN). To evaluate the proposed approach, real data measurements from a UAV communicating with base-stations within a commercial cellular network are obtained and used for training and validation. Numerical results demonstrate that the proposed method is accurate in channel acquisition for various UAV flying scenarios and outperforms conventional DNNs.


I. INTRODUCTION
Low-altitude unmanned aerial vehicles (UAVs), also commonly referred to as drones, have enabled a plethora of personal and commercial applications, including aerial photography and sightseeing, parcel delivery, emergency rescue in natural disasters, monitoring and surveillance, and precision farming [1]. Recently, interest in this emerging technology has been steadily surging as many governments have already introduced regulations that facilitate UAV usage. As a result, UAV technologies are being developed and deployed at a very rapid pace around the world to offer fruitful business opportunities and new vertical markets [2]. In particular, UAVs can be employed as aerial platforms to enhance wireless connectivity for ground users and Internet of Things (IoT) devices in harsh environments where terrestrial networks are unreachable. Additionally, intelligent UAV platforms can provide important and diverse contributions to the evolution of smart cities by offering cost-efficient services ranging from environmental monitoring to traffic management [3].
Wireless communication is a key enabling technology for UAVs, and the integration of the two has drawn substantial attention in recent years. In this direction, the third generation partnership project (3GPP) has been active in identifying the requirements, technologies, and protocols for aerial communications to enable networked UAVs in current long-term evolution (LTE) and 5G/B5G networks [4], [5]. UAV communications are fundamentally different from terrestrial communications in the underlying air-to-ground propagation channel and the inherent size, weight, and power constraints. In particular, UAVs moving in 3D space enjoy a higher probability of line-of-sight (LoS) communication than ground users, which can be beneficial for the reliability and power efficiency of UAV communications. Nevertheless, this also implies that UAV communications may easily cause interference to, and suffer interference from, terrestrial networks [6].
Realizing full-fledged UAVs in the 3D mobile cellular network depends to a large extent on the accuracy of channel state information (CSI) acquisition over diverse UAV operating environments and scenarios, which is of paramount importance for enhancing system performance. Reliable CSI is crucial for aerial communications not only in control/non-payload but also in payload data transmissions, and acquiring it is one of the major challenges in these systems. Moreover, precise CSI is highly significant for physical layer transmission, radio resource allocation, and interference management, and it helps in designing robust beamforming and beam tracking algorithms as well as efficient link adaptation techniques. While several statistical air-to-ground channel models that consider the trade-off between accuracy and mathematical tractability have been studied in the literature [7], a more practical analysis to bridge this knowledge gap is still needed.
On a parallel avenue, significant attention has recently been paid to deep learning (DL) by the wireless communication community owing to its success in a wide range of applications, e.g., computer vision, natural language processing, and automatic speech recognition. DL is a neuron-based machine learning approach that is able to construct deep neural networks (DNNs) with versatile structures based on the application requirements. Specifically, several works in the open literature have utilized DL methods for channel modeling and CSI acquisition. For instance, a DL-driven channel modeling algorithm is developed in [8] by using a dedicated neural network, based on generative adversarial networks, that is designed to learn the channel transition probabilities from receiver observations. In [9], a DNN-based channel estimation scheme is proposed to jointly design the pilot signals and channel estimator for wideband massive multiple-input multiple-output (MIMO) systems.
In this paper, we propose a DL-based UAV channel predictor by employing the advanced Gaussian-Bernoulli restricted Boltzmann machine (GBRBM) [10]. The GBRBM is a generative stochastic model that captures meaningful features from the given multi-dimensional continuous data. It can also learn a probability distribution over a set of inputs in an unsupervised manner, and it addresses the limitations of the bipartite restricted Boltzmann machine (RBM) model by replacing the binary visible nodes with Gaussian visible nodes, so that it can initialize a DNN for feature extraction and dimension reduction. The distinct contributions of this work can be summarized as follows:
• Applying the GBRBM model to estimate the received signal power at a UAV from a cellular network during the flight, where DNNs are employed to extract features from the UAV channels as a set of blocks for channel modeling.
• Developing an adaptive learning rate approach and a new enhanced gradient to improve the training performance. Specifically, an autoencoder-based DNN is used to fine-tune the parameters during the training phase.
• Verifying the effectiveness of the proposed framework using real measurement data. The obtained results show that the proposed scheme outperforms conventional autoencoders in channel feature extraction.
The remainder of the paper is organized as follows. Section II presents the system model and problem formulation and describes the developed approach. Simulation results and demonstrations are given in Section III. Finally, concluding remarks are drawn in Section IV.

II. SYSTEM MODEL AND PROBLEM FORMULATION
We consider the downlink transmission of a multi-user wireless network in which multi-antenna base-stations (BSs) serve multiple randomly distributed user nodes in the presence of a UAV that also communicates with the BSs. The channel between a BS and the UAV is affected by several parameters, such as the UAV altitude, antenna directivity, location, transmission power, and the characteristics of the environment. To investigate the effects of these parameters on channel modeling between the UAV transceiver and the BSs, we propose a DL-based framework that employs multiple GBRBMs for dimension reduction and pre-training, incorporated with an autoencoder-based deep neural network. The detailed framework of the CSI estimation scheme is shown in Fig. 1; it is designed to operate on the principles of the improved GBRBM model. This framework consists of two main parts: offline training and online estimation. The offline training fits the framework to historical data so that it can capture the correlation of channel variations across the different UAV flying scenarios. Then, when a CSI estimation request arrives, the framework takes the current UAV information as input data to predict the channel state online.
Among neural network models, the GBRBM shows good potential for time-series prediction owing to its ability to infer unknown sequences from historical data. The GBRBM is a Markov random field (MRF), i.e., an undirected probabilistic graphical model [11]. The proposed architecture consists of an input layer, an output layer, and several hidden layers for capturing channel characteristics. The input layer takes the observed data through nine nodes N ∈ {N_1, ..., N_9}. Specifically, N_1, N_2, and N_3 represent the latitude, longitude, and elevation angle of the UAV, respectively. Additionally, N_4 and N_5 account for the cell latitude and longitude, respectively, while N_6 and N_7 represent the cell elevation and the cell building. Further, N_8 and N_9 are the antenna mast height and the UAV altitude, respectively. Thus, the energy function of the GBRBM is defined as:
where b_i represents the bias of the visible layer, c_j represents the bias of the hidden layer, and w_ij represents the weight that connects the visible layer to the hidden layer. Further, σ_i accounts for the standard deviation of the visible units. Each block has nine data inputs, as explained earlier. While n > 1 for hidden layer unit h_j, based on the property of MRFs and the energy function, the joint probability distribution is defined as:
where Z is the partition function, as given below:
By using the joint probability function, the marginal distribution of v can be defined as:
Given one layer, the units of the other layer are conditionally independent. The conditional probabilities of v and h are defined as follows:
where N(·|µ, σ²) represents the Gaussian probability density function with mean µ and variance σ². Further, stochastic maximization of the likelihood is used to train the GBRBM. The likelihood is estimated by marginalizing out the hidden neurons.
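For orientation, a form of these expressions that is commonly used for the GBRBM can be sketched with the symbols defined above (b_i, c_j, w_ij, σ_i). This is a hedged reconstruction and not necessarily the authors' exact parameterization: in particular, scaling the coupling term by σ_i² (rather than σ_i) is an assumption, and the integral and sum in Z are taken over all visible and hidden configurations.

```latex
\begin{align*}
E(\mathbf{v},\mathbf{h}\mid\theta) &= \sum_i \frac{(v_i-b_i)^2}{2\sigma_i^2}
  - \sum_i\sum_j \frac{v_i}{\sigma_i^2}\, w_{ij} h_j - \sum_j c_j h_j, \\
P(\mathbf{v},\mathbf{h}\mid\theta) &= \frac{1}{Z}\exp\!\big(-E(\mathbf{v},\mathbf{h}\mid\theta)\big), \qquad
Z = \int \sum_{\mathbf{h}} \exp\!\big(-E(\mathbf{v},\mathbf{h}\mid\theta)\big)\, d\mathbf{v}, \\
P(\mathbf{v}\mid\theta) &= \frac{1}{Z}\sum_{\mathbf{h}} \exp\!\big(-E(\mathbf{v},\mathbf{h}\mid\theta)\big), \\
p(v_i = v \mid \mathbf{h}) &= \mathcal{N}\!\Big(v \,\Big|\, b_i + \sum_j w_{ij} h_j,\ \sigma_i^2\Big), \\
p(h_j = 1 \mid \mathbf{v}) &= \operatorname{sigmoid}\!\Big(c_j + \sum_i \frac{w_{ij} v_i}{\sigma_i^2}\Big).
\end{align*}
```

Under this form, completing the square in v_i in the energy gives the Gaussian conditional directly, which is why the mean is b_i + Σ_j w_ij h_j and the variance is σ_i².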
The partial derivative of the log-likelihood function to be maximized is given by:
where ⟨·⟩_d and ⟨·⟩_m denote the expectations computed over the data distribution P(h|{v^(t)}, θ) and the model distribution P(v, h|θ), respectively. Here, θ denotes the parameters of the GBRBM. Because the exact gradient calculation has a high computational cost, reference [12] used contrastive-divergence (CD) learning, which proved to be an efficient approximation of the log-likelihood gradient for the GBRBM. CD learning approximates the second term of the log-likelihood gradient by running a few Gibbs sampling steps starting from the data samples. As a result, the GBRBM parameters are updated as follows:
where η denotes the learning rate. The GBRBM is trained efficiently by iterating the updates (8a), (8b), and (8c).
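The CD update described above can be illustrated with a minimal NumPy sketch of a single CD-1 step. It assumes the common GBRBM parameterization in which the coupling term is scaled by σ_i², keeps σ fixed, and uses hidden probabilities rather than samples in the gradient statistics; the function name `cd1_step` and its argument layout are hypothetical and not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, sigma, eta, rng):
    """One CD-1 parameter update for a GBRBM with fixed sigma (illustrative).

    v0: (n, n_v) minibatch, W: (n_v, n_h), b: (n_v,), c: (n_h,), sigma: (n_v,).
    """
    s2 = sigma ** 2
    # Positive phase: P(h = 1 | v0), then a binary sample of the hidden units.
    ph0 = sigmoid(c + (v0 / s2) @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # One Gibbs step: sample v1 ~ N(b + W h0, sigma^2), then recompute P(h | v1).
    v1 = b + h0 @ W.T + sigma * rng.standard_normal(v0.shape)
    ph1 = sigmoid(c + (v1 / s2) @ W)
    # Data statistics minus (approximate) model statistics.
    n = v0.shape[0]
    dW = ((v0 / s2).T @ ph0 - (v1 / s2).T @ ph1) / n
    db = ((v0 - v1) / s2).mean(axis=0)
    dc = (ph0 - ph1).mean(axis=0)
    return W + eta * dW, b + eta * db, c + eta * dc
```

With nine input features, `v0` would have shape `(n, 9)`; standardizing the inputs so that `sigma` can be set to ones is a common simplification, but it is not stated in the text.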

A. Adaptive Learning Rate
Based on the maximization of a local estimate of the likelihood, the learning rate can be adapted automatically while the RBM is trained with the stochastic gradient. Let θ = (W, b, c) denote the parameters of the GBRBM model, and consider the parameters obtained after an update with learning rate η. Further, P_θ(v) = P*_θ(v)/Z_θ represents the probability density function (pdf), where Z_θ is the normalization constant for the parameter θ. In principle, the learning rate that maximizes the likelihood can be sought at each iteration. Nevertheless, this can lead to large fluctuations due to the small size of the minibatch. Instead, [13] proposed that the new learning rate be chosen from a set of candidate values defined by η_0 and ε, where η_0 is the previous learning rate and ε is a small constant, which was chosen randomly.
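The candidate-based selection can be sketched as follows. The specific five-element candidate set {(1−ε)²η_0, (1−ε)η_0, η_0, (1+ε)η_0, (1+ε)²η_0} is an assumption taken from the adaptive learning rate literature, since the set is not spelled out here; `local_loglik` is a hypothetical callable that returns a local likelihood estimate for a trial learning rate.

```python
def candidate_rates(eta0, eps=0.01):
    """Candidate learning rates around the previous rate eta0 (assumed set)."""
    return [
        (1 - eps) ** 2 * eta0,
        (1 - eps) * eta0,
        eta0,
        (1 + eps) * eta0,
        (1 + eps) ** 2 * eta0,
    ]

def adapt_learning_rate(eta0, local_loglik, eps=0.01):
    """Pick the candidate that maximizes the local likelihood estimate.

    local_loglik(eta) is a placeholder: in practice it would evaluate the
    minibatch likelihood of the model obtained after an update with rate eta.
    """
    return max(candidate_rates(eta0, eps), key=local_loglik)
```

Restricting the search to a few values close to η_0 limits how far the rate can move per iteration, which is one way to damp the minibatch-induced fluctuations mentioned above.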

B. Enhanced Gradient
Recently, the enhanced gradient was proposed in [14] as an update rule for Boltzmann machines that is invariant to the data representation. It was derived by applying bit-flipping transformations and then updating the rule to improve the results. It has been shown that the learning of the RBM can be improved by making the results less sensitive to the learning parameters and initialization. Thus, a new method to enhance the gradient is proposed in [14] to replace (8a), (8b), and (8c). The authors define the covariance between two variables under the distribution P as given below:
The standard gradient in (8a) can be rewritten as
where ⟨·⟩_dm = (1/2)⟨·⟩_d + (1/2)⟨·⟩_m denotes the average over the data and model distributions. The standard gradient has some potential problems. The gradients with respect to the weights and the bias terms are correlated. Additionally, COV_d(v_i, h_j) − COV_m(v_i, h_j) is uncorrelated with ∇b_i and ∇c_j, and learning may be distracted by non-useful weight updates when there are many active neurons for which ⟨·⟩_dm ≈ 1. However, updating the weights using (10) with the obtained data may cause issues when some of the binary units in the RBM are flipped from zeros to ones and vice versa.
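The rewriting of the standard weight gradient in terms of covariances can be checked with a short derivation. The sketch below assumes the binary-RBM form of the gradients, ∇w_ij = ⟨v_i h_j⟩_d − ⟨v_i h_j⟩_m, ∇b_i = ⟨v_i⟩_d − ⟨v_i⟩_m, and ∇c_j = ⟨h_j⟩_d − ⟨h_j⟩_m; any variance scaling used for Gaussian visible units is omitted, so it should be read as an illustration rather than as the exact form of (9) and (10).

```latex
\begin{align*}
\mathrm{COV}_P(v_i,h_j) &= \langle v_i h_j\rangle_P - \langle v_i\rangle_P\,\langle h_j\rangle_P, \\
\langle v_i\rangle_{dm}\nabla c_j + \langle h_j\rangle_{dm}\nabla b_i
  &= \tfrac12\big(\langle v_i\rangle_d+\langle v_i\rangle_m\big)\big(\langle h_j\rangle_d-\langle h_j\rangle_m\big)
   + \tfrac12\big(\langle h_j\rangle_d+\langle h_j\rangle_m\big)\big(\langle v_i\rangle_d-\langle v_i\rangle_m\big) \\
  &= \langle v_i\rangle_d\langle h_j\rangle_d - \langle v_i\rangle_m\langle h_j\rangle_m, \\
\nabla w_{ij} &= \langle v_i h_j\rangle_d - \langle v_i h_j\rangle_m \\
  &= \mathrm{COV}_d(v_i,h_j) - \mathrm{COV}_m(v_i,h_j)
   + \langle v_i\rangle_{dm}\nabla c_j + \langle h_j\rangle_{dm}\nabla b_i .
\end{align*}
```

The cross terms ⟨v_i⟩_d⟨h_j⟩_m and ⟨v_i⟩_m⟨h_j⟩_d cancel in the second line, which is what leaves the bias gradients entangled with the weight gradient through the ⟨·⟩_dm factors.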
The parameters are transformed accordingly to
The energy function is then equivalent up to a constant, i.e., E(x̃ | θ̃) = E(x | θ) + a, where x̃ and θ̃ denote the transformed state and parameters and a is a constant for all values. The model is updated in the transformed representation and then transformed back. The resulting model is defined as
where ∇θ represents the gradients of the parameters defined in (8). Therefore, there are 2^(n_v+n_h) different update rules, where n_v and n_h are the numbers of visible and hidden neurons, respectively. To find the maximum likelihood updates, [13] proposed a weighted sum of the 2^(n_v+n_h) gradients with the following weights:
Due to the larger weights, the enhanced gradient is defined as:
where ∇_e w_ij has the same form as (9). Each block is connected to the upper block through the units of its hidden layer, and the parameters in (15) are updated layer by layer for GBRBM pre-training. The pre-training of the GBRBM provides the initialization of the deep autoencoder. The procedure for layer-by-layer pre-training of the GBRBM parameters is given in Algorithm 1.
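A minimal NumPy sketch of the enhanced gradient is given below for orientation. It assumes the form reported in the enhanced-gradient literature for binary RBMs, namely ∇_e w_ij = COV_d(v_i, h_j) − COV_m(v_i, h_j), ∇_e b_i = ∇b_i − Σ_j ⟨h_j⟩_dm ∇_e w_ij, and ∇_e c_j = ∇c_j − Σ_i ⟨v_i⟩_dm ∇_e w_ij, with expectations replaced by sample means. The function name is hypothetical, and any σ scaling for Gaussian visible units is left out.

```python
import numpy as np

def enhanced_gradient(v_d, h_d, v_m, h_m):
    """Enhanced gradient from data-phase and model-phase samples (illustrative).

    v_d, v_m: (n, n_v) visible samples; h_d, h_m: (n, n_h) hidden samples
    or probabilities for the data and model phases, respectively.
    """
    def cov(v, h):
        # Sample covariance matrix between visible and hidden units.
        return (v.T @ h) / v.shape[0] - np.outer(v.mean(axis=0), h.mean(axis=0))

    # Standard bias gradients.
    grad_b = v_d.mean(axis=0) - v_m.mean(axis=0)
    grad_c = h_d.mean(axis=0) - h_m.mean(axis=0)
    # Averages over the data and model distributions, <.>_dm.
    v_dm = 0.5 * (v_d.mean(axis=0) + v_m.mean(axis=0))
    h_dm = 0.5 * (h_d.mean(axis=0) + h_m.mean(axis=0))
    # Enhanced gradients: covariance difference, then decorrelated biases.
    grad_w_e = cov(v_d, h_d) - cov(v_m, h_m)
    grad_b_e = grad_b - grad_w_e @ h_dm
    grad_c_e = grad_c - v_dm @ grad_w_e
    return grad_w_e, grad_b_e, grad_c_e
```

In a layer-by-layer pre-training loop, these three arrays would take the place of the standard gradients in the parameter update of each block.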
Algorithm 1: Pre-training the GBRBM-blocks (unsupervised learning)
  Update the value of ∇w_ij, ∇b_i, ∇c_j using (15)
  Repeat until the convergence is met
9 end
The autoencoder is used here to reconstruct the input data points that do not have a class label, and the output of the autoencoder is defined as follows:
whereas the output of the decoder of the hidden layer can be obtained via:
Afterwards, the mean-square-error (MSE) cost function is used in the fine-tuning phase to optimize the proposed algorithm through the backpropagation algorithm:
where N represents the number of data inputs. Subsequently, a SoftMax classifier is utilized to determine the class to which each input belongs, and the cross-entropy loss is used as the loss function. Algorithm 2 summarizes the aforementioned unsupervised training steps of the autoencoder.
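The encoder output, decoder output, and MSE cost described above can be sketched as follows. Since the corresponding expressions are not spelled out here, the sigmoid encoder, the linear decoder for real-valued inputs, and the 1/N normalization of the cost are assumptions, and the function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def autoencoder_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode x into hidden features and decode back to the input space.

    x: (N, n_in), W_enc: (n_in, n_hid), W_dec: (n_hid, n_in).
    In a pre-trained setup, W_enc and b_enc would be initialized from the
    GBRBM weights and hidden biases (an assumption of this sketch).
    """
    h = sigmoid(x @ W_enc + b_enc)   # encoder output
    x_hat = h @ W_dec + b_dec        # linear decoder output (assumed)
    return h, x_hat

def mse_cost(x, x_hat):
    """Mean over the N inputs of the squared reconstruction error."""
    return np.mean(np.sum((x - x_hat) ** 2, axis=1))
```

Backpropagation would then adjust the four parameter arrays to reduce `mse_cost`, starting from the pre-trained values instead of a random initialization.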
Algorithm 2: Fine tuning of autoencoder (unsupervised learning)
To utilize the detection error between the predicted and real labels, we use the following loss function [14]:
where w_o and c_o denote the encoder parameters, while y and y_0 represent the predicted and real labels, respectively. Thus, the detection error between the predicted and real labels can be estimated using (19). Typically, parameter optimization and classification decisions are carried out by minimizing the loss function using the enhanced gradient algorithm together with the backpropagation algorithm. Since both the GBRBM and the autoencoder are capable of processing real-valued data, the GBRBM-based autoencoder (GBRBM-AE) is also applicable to real-valued data processing. To this end, Algorithm 3 is developed for the aforementioned supervised fine-tuning of the parameters.
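The supervised stage pairs a SoftMax output with a cross-entropy loss between predicted and real labels. A generic NumPy sketch of that pairing is given below; it is not the specific loss of (19), whose exact dependence on w_o and c_o is not reproduced here, and the function names are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true integer labels."""
    n = probs.shape[0]
    return -np.mean(np.log(probs[np.arange(n), labels] + 1e-12))
```

Here `logits` would be an affine function of the top-level encoder features, and `labels` the integer class indices of the training vectors.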

III. PERFORMANCE EVALUATION
In this section, simulation results are provided to evaluate the performance of the proposed CSI prediction scheme for air-ground links in UAV communication systems.

A. Experiment Setup
For the considered system model, the following experiment is designed and performed to obtain practical data measurements from a real operating ground-UAV communication system. In the experiment setup, the antennas of the BSs are typically located on tall ground antenna masts, building rooftops, or sometimes hills. Within this cellular network, the UAV flies over a wide range of altitudes, from ground level to around 300 meters, where it experiences severe and diverse signal attenuation caused by different obstacles (e.g., buildings and trees), different atmospheric conditions (e.g., humidity), long distances from the BSs, and loss of the main lobe of the ground BS antennas. Furthermore, as the UAV ascends, more LoS links can be established with different BSs, resulting in increased levels of interference and signal quality degradation. The UAV used is a quadcopter with a PX4 flight controller/autopilot and a GPS, with a total take-off weight of almost one kilogram, capable of flying at altitudes of up to 300 meters and transmitting flight data in real time through its telemetry system. The quadcopter carries an embedded measurement unit, namely a mobile handset with embedded software for measuring LTE signal parameters such as the reference signal received power (RSRP) of the serving and the neighboring LTE cells.
The mobile handset is connected to the town LTE network using a SIM card. In every measurement area, the UAV flies from ground level to an altitude of 300 meters, and the UAV measurement unit records the received signal parameters throughout the flight, as shown in Fig. 2. By combining the position data from the UAV GPS with the signal parameters recorded by the onboard measurement device, data sets are created for the development and validation of the proposed DL-based framework.

B. Numerical Results
The proposed GBRBM-based DNN scheme is implemented and fed with the collected data measurements. The total number of training vectors is 710, while the number of test vectors is 177, with 201 unknown variables.
The optimal number of GBRBM-based DNN blocks for the pre-training phase is evaluated in Fig. 3 by comparing the measurements and the estimated values. Different numbers of GBRBM blocks are considered, from 2 to 7, while the error between the estimated and measured RSS varies from −5 to 15 dBm. It can be readily seen that the best results are obtained when 6 blocks of GBRBM-based DNNs are used in the pre-training phase. The estimation accuracies for 2 to 7 blocks are 85.1%, 87.3%, 90.1%, 92.8%, 94.1%, and 93.7%, respectively. Next, the prediction error versus the number of epochs is evaluated and presented in Fig. 4. In the pre-training phase, the difference between the estimated and measured RSS values decreases as the number of epochs increases. Additionally, it can be clearly seen that the error stabilizes around the 500th epoch. Hence, the epoch number of the pre-training phase is set to 250 and the epoch number of the training phase is set to 500. After the number of GBRBM blocks and the epoch numbers are determined, the number of neurons must be set for every layer. However, finding the optimal number of neurons is a nontrivial task because the tuning range of the neuron number is arbitrary in every layer. Thus, we empirically set the numbers of neurons and run experiments to maximize the neuron amount of the sixth layer depending on the operation of the GBRBM-based DNN.
Fig. 2. Aerial LTE measurements using a quadcopter.
The learning rate plays the role of the step size in the gradient descent process; if it is set too large or too small, the precision is significantly affected. In particular, if the learning rate is too small, not only does the training period grow, but the training is also likely to be trapped in a local optimum. To examine the proposed adaptive learning rate approach, we have trained the RBMs of the hidden neurons with the traditional gradient, using the same five values (1, 0.1, 0.01, 0.001, 0.0001) to initialize the learning rate. The adaptive learning rate performance during learning is presented in Fig. 5. The process can find suitable learning rate values when the enhanced gradient is used. Specifically, the configuration uses 6 GBRBM blocks for the pre-training phase and 5 network layers for the training of the autoencoder. The numbers of neurons in the hidden layers of the multi-block GBRBM are 64, 56, 48, 32, and 16, respectively. The learning rates of the two phases are both set to 0.001.
To reveal the efficiency of the proposed GBRBM-based autoencoder (GBRBM-AE), the simulation parameters are set as in the training and pre-training algorithms in order to compare the results over 50 independent trials. The obtained results are shown in Fig. 6, where the red dots are the measurement values and the blue line is the GBRBM-AE output. It can be easily noticed that the predicted values are adequately close to the measurement values. Accordingly, these simulation results show that the proposed algorithm obtains accurate RSS values that can be used for CSI acquisition in various UAV scenarios.

IV. CONCLUSIONS
In this paper, a DL-based framework is developed to estimate the channel characteristics of the air-ground links using a UAV flying within a range of altitudes and communicating with a terrestrial network. This framework aims at mitigating the negative impacts of the time-varying environment and differential transmission delay by employing a GBRBM integrated with an autoencoder-based DNN. Despite the superiority of RBMs in exploring latent features in an unsupervised manner, their training is challenging because the stochastic gradient tends to exhibit high variance and diverging behavior, and the learning rate has to be set manually according to the structure of the trained RBM. To circumvent these issues, a novel algorithm is proposed that uses an adaptive learning rate along with an enhanced gradient. The enhanced gradient, in contrast to the traditional gradient, is used to expedite the learning of the hidden neurons. Finally, the validity of the proposed framework is corroborated by using real UAV signal measurements, and the experimental results have verified the accuracy of our method in learning the UAV channel model in a dynamic propagation environment.

∇c_j
2 while for each epoch do
3   while m ≤ M do
4     Estimate
7     Fine tune ∇_e w_ij, ∇_e b_i, ∇_e c_j using backpropagation
8     Repeat until the convergence is met
9 end

Algorithm 3: Fine tuning of the data labels (supervised learning)
Input: epoch number, v, K, GBRBM number, L
Output: ∇_o w_ij, ∇_o c_j
1 Initialize ∇_o w ← ∇_e w_m, ∇_o b ← ∇_e b_m, ∇_o c ← ∇_e c_m
2 while for each epoch do
3   Using backpropagation to estimate the fine tuning ∇_o c, ∇_o w
4   Repeat until the convergence is met
5 end

Fig. 4. The average difference between the estimated and real RSS values in the pre-training phase.