Complex MIMO RBF Neural Networks for Transmitter Beamforming over Nonlinear Channels

Beamforming for efficient transmission has already been successfully implemented in practical systems and is needed to further increase spectral and energy efficiencies in some configurations of next-generation wireless systems and in low earth orbit satellites. It yields a remarkable capacity increase and reduces spectral congestion. In this context, this article proposes a novel complex multiple-input multiple-output radial basis function neural network (CMM-RBF) for transmitter beamforming, based on the phase transmittance radial basis function neural network (PTRBFNN). The proposed CMM-RBF is compared with the least mean square (LMS) algorithm for beamforming with six dipoles arranged in a uniform and circular array and with 16 dipoles in a 2D-grid array. Simulation results show that, in a nonlinear scenario, the proposed solution presents lower steady-state mean squared error, faster convergence rate and enhanced half-power beamwidth (HPBW) when compared with the LMS algorithm.


Introduction
In the last decades, artificial neural networks (ANNs) have attracted much attention, performing specific tasks in different applications, such as clustering, prediction, classification, pattern recognition, machine learning and artificial intelligence. As ANNs are mainly designed to mimic the human brain, a considerable number of approaches only handle real-valued signals [1][2][3][4]. However, some engineering problems are intrinsically dependent on complex-valued signals (e.g., channel equalization and beamforming). In order to circumvent this limitation, ANN algorithms based on complex numbers have already been proposed for some applications, such as channel equalization [5][6][7] and adaptive beamforming for wireless receivers [8][9][10][11].
Digital communication systems over wireless channels may suffer severe signal distortions due to multipath propagation, additive white Gaussian noise (AWGN) [12,13], Doppler effects and, not infrequently, nonlinearities at the receiver front-end and at the transmitter high power amplifier [5,7]. Since nonlinear impairments usually worsen the performance of linear channel equalizers, nonlinearities in the channel are better dealt with using robust nonlinear equalizers [5,7,14]. In this context, based on the phase transmittance radial basis function neural network (PTRBFNN) equalizer [7], a blind fuzzy controller algorithm was applied to increase the concurrent neural network equalizer (CNNE) convergence speed and decrease the residual mean squared error (MSE) [5]. Another equally important technique is the butterfly neural equalizer (BNE) which, when applied to optical communications with two-dimensional digital modulation, is able to mitigate nonlinearities in the photo-electric converters and simultaneously compensate chromatic and polarization mode dispersions [6].
For beamforming at the receiver, ANN architectures such as the bi-dimensional neural beamformer with joint error (BNB-JE), the butterfly neural beamformer (NB-Butterfly), and the beamformer neural network (BNN) are potential algorithms for improving receiver performance [9][10][11]. On the other hand, beamforming for efficient transmission is necessary to increase spectral and energy efficiencies in some configurations of the next-generation wireless systems [15]. In addition, for low earth orbit (LEO) satellites, beamforming techniques are economically important, mainly to reduce the power consumption and to increase the data throughput [16,17]. Also, nonlinear beamforming algorithms can play a key role in band-limited systems employing nonlinear power amplifiers, as in satellite communication systems [18].
In communication systems with beamforming at the transmitter, arrays with several antennas are applied to focus the electromagnetic signal towards the desired receiver [19]. These arrays can be controlled by three different architectures: digital, analog, and hybrid. Usually, digital beamforming requires channel knowledge at the transmitter and some precoding technique [20][21][22][23][24][25], which burdens the hardware with very high computational complexity and energy consumption [26]. On the other hand, in analog beamforming the RF signals are manipulated by controlling phase shifters and/or variable gain amplifiers (VGAs). Although this architecture has low computational complexity and power consumption, it is less flexible and presents inferior results when compared with digital beamforming [26,27].
In the context of LEO satellites, the use of the classical digital beamforming techniques [20][21][22][23][24][25] is prohibitive due to the power and computational complexity constraints, which is why analog beamforming techniques are employed in this area [16,17]. However, by means of digital beamforming without channel knowledge and precoding, as proposed here, it is possible to realize a low-power digital architecture which is more flexible than the analog one. This problem can be modeled as a set of electric currents whose phases and amplitudes are modulated in such a way that the antenna radiation pattern points in the correct direction. A useful method to determine the beamformer electric currents is the least mean square (LMS) algorithm. In this linear method, the filter weights, which represent the array of electric currents, are updated according to a convex cost function [28] to minimize the error between the obtained and the desired radiation patterns [29]. However, LEO satellites frequently operate with high power amplifiers, which introduce severe signal distortion due to their nonlinearities [18]. This nonlinear scenario reduces the LMS performance because of its linear design.
Unlike the LMS, neural networks can operate as nonlinear filters [30]. The nonlinear structure of a neural network is modeled by nonlinear activation functions in multilayer perceptrons (MLPs) or by Gaussian neurons in radial basis function neural networks (RBFNNs) [30]. The Gaussian neurons of an RBFNN have two free parameters, namely the Gaussian centers and the variances. Besides, there is a free vector of weights, which linearly weighs the output of the neurons to yield the network output [7]. Via these three free parameters, RBFNNs can represent high-order nonlinear spaces without increasing the number of layers, which reduces their complexity in comparison with deep neural networks. Although artificial neural networks have been employed for beamforming, the ANN architectures presented in the literature are unfeasible for the proposed application, since they are designed for beamforming in receiver devices or require channel information and/or precoding.
In this context, this article proposes a novel RBFNN architecture, based on a complex multiple-input multiple-output (MIMO) RBFNN (CMM-RBF), for beamforming transmitters, in contrast to [9][10][11], which are designed for beamforming receivers. The proposed system applies a MIMO variation of the multiple-input single-output (MISO) phase transmittance RBFNN (PTRBFNN) [7] to a beamforming structure to generate a unified nonlinear solution. The PTRBFNN model was chosen due to its lower computational complexity in comparison with deep neural networks, and due to its important role in avoiding any phase invariance at the output of the neurons in comparison with a complex RBFNN [7]. Results show that the proposed architecture achieves enhanced half-power beamwidth (HPBW), faster convergence rate and lower steady-state mean squared error (MSE) when compared with LMS beamforming in a nonlinear scenario.
The remainder of this article is organized as follows. In Section 2, a mathematical model is described for a general arrangement of antennas. The LMS algorithm and the proposed complex MIMO radial basis function neural network for beamforming are presented in Sections 3 and 4, respectively. In Section 5, simulation results of the CMM-RBF are compared with results obtained by the LMS, considering half-power beamwidth (HPBW), steady-state mean squared error, and convergence rate in a nonlinear scenario. Conclusions are discussed in Section 6.

Antenna Array Modeling
In a transmitter operating with an antenna array of $P$ dipoles of length $l$ and an arrangement of $Q$ sensors around the antenna array, the $Q$ steering vectors are stacked in the matrix of steering vectors $\boldsymbol{\Psi}$, which is written with the transpose operator $[\cdot]^T$. The $q$th steering vector $\boldsymbol{\psi}_q \in \mathbb{C}^{P \times 1}$ expresses the radiation pattern towards the $q$th sensor. The matrix of relative intensity of the electric field $\boldsymbol{\zeta} \in \mathbb{R}^{Q \times Q}$ is diagonal, and the $q$th element of its main diagonal is expressed in terms of the signal wavelength $\lambda = c/f$ and the zenith angle $\theta_q$ of the $q$th sensor, in which $c = 299{,}792{,}458$ m/s is the speed of light in vacuum and $f$ is the frequency of the transmitted signal. Figure 1 presents the angular position of the $q$th sensor $(\theta_q, \omega_q)$ and the related radiation pattern $(d_q)$, for any arrangement of dipoles.
In the steering-vector expression, $\exp(\cdot)$ is the scalar exponential function, $\mathbf{C} \in \mathbb{R}^{3 \times P}$ is the matrix of Cartesian coordinates $(x, y, z)$ of the dipoles, Equation (2), and $\boldsymbol{\Omega} \in \mathbb{R}^{Q \times 3}$ is the sensors matrix of angular position, Equation (3), in which $\omega_r$ is the azimuth angle of the $r$th sensor. Note that this modeling is applicable to any array setup in three dimensions, taking into account the matrix of Cartesian coordinates, Equation (2), and the sensors matrix of angular position, Equation (3). This section is based on [31] (Chapter VI), in which the array equations are presented in a generalized matrix structure.
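As a rough illustration of this matrix formulation, the following Python sketch builds a steering matrix from the dipole coordinates and the sensor angles. The display equations are not reproduced in this text, so several details are assumptions rather than the article's exact expressions: the phase term $2\pi/\lambda$ multiplying the product of the sensors matrix and the coordinates matrix, the sign of the exponent, the row-wise layout of `Psi` ($Q \times P$), and the textbook thin-dipole expression used for the relative field intensity. The function names and the array radius are hypothetical.

```python
import numpy as np

C_LIGHT = 299_792_458.0  # speed of light in vacuum (m/s)

def dipole_gain(theta, length, wavelength):
    """Relative field intensity of a thin dipole (textbook form, assumed here).

    Not defined for theta = 0 or pi, where sin(theta) vanishes.
    """
    k_half = np.pi * length / wavelength
    return (np.cos(k_half * np.cos(theta)) - np.cos(k_half)) / np.sin(theta)

def steering_matrix(coords, theta, omega, length, freq):
    """Return Psi (Q x P); row q is the steering vector towards sensor q.

    coords: (3, P) Cartesian dipole positions in metres.
    theta, omega: (Q,) zenith and azimuth angles of the sensors in radians.
    """
    wavelength = C_LIGHT / freq
    # Sensors matrix of angular position (Q x 3): unit vectors towards each sensor.
    Omega = np.stack([np.sin(theta) * np.cos(omega),
                      np.sin(theta) * np.sin(omega),
                      np.cos(theta)], axis=1)
    zeta = np.diag(dipole_gain(theta, length, wavelength))   # (Q x Q), diagonal
    phase = 2.0 * np.pi / wavelength * (Omega @ coords)       # (Q x P)
    return zeta @ np.exp(1j * phase)                          # element-wise exp

# Example: six dipoles on a circle in the xy-plane, ten sensors on the horizon.
freq = 2.4e9
lam = C_LIGHT / freq
P, Q = 6, 10
radius = 0.25 * lam  # hypothetical radius, chosen only for the example
ang = 2.0 * np.pi * np.arange(P) / P
coords = np.stack([radius * np.cos(ang), radius * np.sin(ang), np.zeros(P)])
theta = np.full(Q, np.pi / 2.0)
omega = np.linspace(0.0, 2.0 * np.pi, Q, endpoint=False)
Psi = steering_matrix(coords, theta, omega, 0.5 * lam, freq)
```

With half-wavelength dipoles and sensors at $\theta = 90°$, the assumed dipole term equals one, so every entry of `Psi` has unit modulus and only the phases differ between dipoles and sensors.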

Least Mean Square Algorithm for Beamforming
Considering a beamforming transmission with $P$ antennas and an arrangement of $Q$ sensors around the antenna array, the vector of radiation pattern $\mathbf{g}$ towards the sensors has components $g_q = \boldsymbol{\psi}_q^T \mathbf{i}$, where $\mathbf{i} \in \mathbb{C}^{P \times 1}$ is the vector of antenna electric currents. Figure 2 presents the LMS architecture for beamforming. In order to control the array boresight, a set of radiation conditions $\mathbf{d} \in \mathbb{C}^{Q \times 1}$, verified by the $Q$ sensors, is chosen to describe the desired radiation pattern. Thus, with $\mathbf{d}$, $\mathbf{g}$ and $\boldsymbol{\psi}$, the LMS algorithm can be used to estimate $\mathbf{i}$ for the $q$th sensor at the $u$th training epoch by minimizing the cost function of Equation (4), where $|\cdot|$ stands for absolute value and $(d_q, g_q)$ are the $q$th components of $(\mathbf{d}, \mathbf{g})$.
Thus, by means of the steepest descent algorithm, the update of the $p$th electric current of the LMS algorithm, for the $q$th steering vector, is given by Equation (5), in which $\eta_l$ is the LMS adaptive step and $\nabla_i$ is the complex gradient operator with respect to $i_p$. Applying the complex gradient operator ($\nabla_i$) to Equation (4) yields the gradient that is substituted into Equation (5).
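A minimal Python sketch of this per-sensor steepest-descent update is given here for the linear case. It assumes a squared-error cost $\frac{1}{2}|d_q - g_q|^2$ with $g_q = \boldsymbol{\psi}_q^T \mathbf{i}$, for which the complex-gradient step becomes $\mathbf{i} \leftarrow \mathbf{i} + \eta_l\, e_q\, \boldsymbol{\psi}_q^*$; the scaling of the cost, the function name `lms_beamformer` and the toy steering matrix are assumptions made for illustration.

```python
import numpy as np

def lms_beamformer(Psi, d, eta=0.1, epochs=50):
    """Estimate the currents i so that Psi @ i approximates d (linear case).

    Psi: (Q, P) steering matrix, row q is the q-th steering vector.
    d:   (Q,) desired radiation pattern at the Q sensors.
    """
    Q, P = Psi.shape
    i = np.zeros(P, dtype=complex)           # i[0] = 0 + j0
    mse = []
    for _ in range(epochs):
        err2 = 0.0
        for q in range(Q):                   # one update per sensor
            g_q = Psi[q] @ i                 # obtained pattern towards sensor q
            e_q = d[q] - g_q                 # error against the desired pattern
            i = i + eta * e_q * np.conj(Psi[q])   # steepest-descent step
            err2 += abs(e_q) ** 2
        mse.append(err2 / Q)                 # mean squared error of this epoch
    return i, np.array(mse)

# Toy example: DFT-like steering vectors (mutually orthogonal rows), one main
# direction with unit gain and three nulls.
Q, P = 4, 6
q_idx, p_idx = np.meshgrid(np.arange(Q), np.arange(P), indexing="ij")
Psi = np.exp(-2j * np.pi * q_idx * p_idx / P)
d = np.array([1.0, 0.0, 0.0, 0.0], dtype=complex)
i_hat, mse = lms_beamformer(Psi, d, eta=0.1, epochs=50)
```

Because the rows of this toy matrix are orthogonal with squared norm $P = 6$, each update with $\eta_l = 0.1$ scales the error of its own sensor by $1 - 0.6 = 0.4$ and leaves the other sensors untouched, which makes the convergence easy to follow by hand.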

Complex MIMO Radial Basis Function Neural Network for Beamforming
As in the LMS beamforming technique, the input signal to the CMM-RBF algorithm is the set of steering vectors of $\boldsymbol{\Psi}$, as shown in Figure 3. The CMM-RBF architecture, with $N$ neurons, has three free parameters: the matrix of synaptic weights $\mathbf{W} \in \mathbb{C}^{P \times N}$, the matrix of center vectors $\boldsymbol{\Gamma} \in \mathbb{C}^{N \times P}$ and the vector of variances $\boldsymbol{\sigma}^2 \in \mathbb{C}^{N \times 1}$. Besides the fact that the CMM-RBF is an extension of the PTRBFNN for multiple outputs, the key difference between both architectures is the linear layer which relates the obtained vector of electric currents with the desired radiation pattern.
Figure 3. Complex multiple-input multiple-output radial basis function neural network architecture for beamforming.
The output vector of electric currents is then obtained by weighting the outputs of the neurons with the matrix of synaptic weights $\mathbf{W}$. Following the complex-valued radial basis function presented in [7], the $n$th neuron output of the CMM-RBF ($\phi_n$), for the $q$th steering vector of $\boldsymbol{\Psi}$, is a Gaussian kernel partitioned into real and imaginary components, where $\|\cdot\|_2$ is the operator which returns the Euclidean norm of its argument and $\mathrm{Re}\{\cdot\}$ and $\mathrm{Im}\{\cdot\}$ are the respective real and imaginary parts of their arguments. Additionally, as shown in Figure 3, the outputs of the neurons can be represented by the vector $\boldsymbol{\phi} = [\phi_1\ \phi_2\ \cdots\ \phi_N]^T$. This kernel partitioning into real and imaginary components has an important role in avoiding any phase invariance at the output of the neurons. As the steering vector phase is important to define the electric currents for the desired boresight, a complex RBFNN is not suitable for this application. As addressed in [7], the kernel of the complex RBFNN is not partitioned into real and imaginary parts, which implies that the Euclidean norm eliminates the phase component of the input signal.
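The effect of the kernel partitioning can be sketched in Python as follows. The kernel expression is an assumption, since the display equation is not reproduced in this text: the real part of the neuron output is taken as a Gaussian of the distance between the real parts of the input and the center, scaled by the real part of the variance, and the imaginary part is built in the same way from the imaginary parts. The function names are hypothetical, and the comparison uses a center at the origin, where the non-partitioned kernel depends only on the magnitude of the input.

```python
import numpy as np

def ptrbf_neurons(psi, Gamma, sigma2):
    """Phase-transmittance Gaussian kernel (assumed form).

    psi:    (P,) complex input (one steering vector).
    Gamma:  (N, P) complex center vectors.
    sigma2: (N,) complex variances with positive real and imaginary parts.
    """
    diff = psi[None, :] - Gamma                    # (N, P)
    d_re = np.sum(diff.real ** 2, axis=1)          # squared norm, real parts
    d_im = np.sum(diff.imag ** 2, axis=1)          # squared norm, imaginary parts
    return np.exp(-d_re / sigma2.real) + 1j * np.exp(-d_im / sigma2.imag)

def complex_rbf_neurons(psi, Gamma, sigma2):
    """Non-partitioned kernel for comparison (real variance, real output)."""
    d = np.sum(np.abs(psi[None, :] - Gamma) ** 2, axis=1)
    return np.exp(-d / sigma2)

psi = np.array([1.0 + 0.0j, 0.5 + 0.0j])
Gamma = np.zeros((1, 2), dtype=complex)            # single center at the origin

# Rotating the input by 90 degrees changes the partitioned kernel output ...
phi_a = ptrbf_neurons(psi, Gamma, np.array([1.0 + 1.0j]))
phi_b = ptrbf_neurons(1j * psi, Gamma, np.array([1.0 + 1.0j]))

# ... but not the non-partitioned one, which only sees the magnitude here.
crbf_a = complex_rbf_neurons(psi, Gamma, np.array([1.0]))
crbf_b = complex_rbf_neurons(1j * psi, Gamma, np.array([1.0]))
```

In this example the partitioned kernel moves the attenuation from the real part to the imaginary part when the input is rotated, so the phase of the steering vector remains visible to the linear layer.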
Consequently, a complex RBFNN is only suitable for phase-independent systems. Thus, by means of the steepest descent algorithm, the update of the CMM-RBF free parameters, for the $q$th steering vector, is given by Equation (8), in which $\eta_w$, $\eta_\gamma$ and $\eta_\sigma$ are the adaptive steps of $w_{p,n}$, $\boldsymbol{\gamma}_n$ and $\sigma_n^2$, respectively. Also, $\nabla_w$, $\nabla_\gamma$ and $\nabla_\sigma$ are, respectively, the complex gradient operators with respect to $w_{p,n}$, $\boldsymbol{\gamma}_n$ and $\sigma_n^2$. The CMM-RBF cost function is the same as that used in the LMS algorithm, Equation (4). Applying the complex gradient operators ($\nabla_w$, $\nabla_\gamma$ and $\nabla_\sigma$) to Equation (4) yields Equation (9), as in the LMS algorithm. These gradients are written in terms of three auxiliary quantities: the synaptic transmittance of the $n$th neuron, the $n$th element of the vector of weighted kernel $\boldsymbol{\beta}$, and the matrix of weighted centers. Finally, applying Equation (9) into Equation (8) gives the update of the CMM-RBF free parameters for beamforming, Equation (13). Generalizing Equation (13) to matrix structures results in Equation (14), in which $[\cdot]^H$ denotes the conjugate transpose operator and $\boldsymbol{\Xi}[u]$ is the diagonal matrix of synaptic transmittance.
Although Equation (14) minimizes the error between the obtained and the desired radiation patterns, the neurons depend on exponential functions, so there is a risk of instability if the exponential argument becomes positive. In order to circumvent this issue, based on Theorem A1 (Appendix A), the real and imaginary parts of each scalar component of the vector of variances are lower bounded by the limit $\mu > 0$, which, consequently, bounds the real and imaginary parts of the neuron outputs between 0 and 1.
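One simple way to enforce this lower bound after each update is sketched in Python here. It assumes that the bound is applied by clipping the real and imaginary parts independently; the function name `clamp_variances` and the sample values are hypothetical.

```python
import numpy as np

def clamp_variances(sigma2, mu=0.1):
    """Lower-bound the real and imaginary parts of each variance by mu > 0."""
    return np.maximum(sigma2.real, mu) + 1j * np.maximum(sigma2.imag, mu)

# Hypothetical variances after an update; two of them left the valid region.
sigma2 = np.array([5.0 + 5.0j, -0.3 + 0.02j, 0.05 - 2.0j])
bounded = clamp_variances(sigma2, mu=0.1)

# With positive variances the exponent is never positive, so each part of the
# neuron output stays in (0, 1] for any squared distance.
dist2 = np.array([0.0, 1.0, 4.0])
phi_re = np.exp(-dist2 / bounded.real)
phi_im = np.exp(-dist2 / bounded.imag)
```

Without the clipping, the second variance would have a negative real part and the corresponding exponent would be positive, which is exactly the unstable case described above.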
As in the LMS algorithm, each training epoch is composed of $Q$ updates, due to the $Q$ sensors, and at the beginning of each training epoch the free parameters $\mathbf{W}$, $\boldsymbol{\Gamma}$ and $\boldsymbol{\sigma}^2$ resume from the values reached at the end of the previous epoch. However, for $u = 0$, the CMM-RBF free parameters are initialized following some criterion defined by the user (e.g., based on the probability distribution of the input data).

Simulations and Discussion
The proposed complex MIMO RBF architecture was evaluated and compared with the LMS algorithm for beamforming at 2.4 GHz. Simulations consider two array arrangements with dipoles of equal length $l = 0.5\lambda = 6.25$ cm and separation distance $s_d = 0.25\lambda = 3.12$ cm: (1) uniform and circular array (UCA) with $P = 6$ dipoles; and (2) 2D-grid array (2D-GA) with $P = 16$ dipoles. For each array arrangement, the matrix of steering vectors is computed via Algorithm A1 (Appendix B). The vectors of angular position $\boldsymbol{\theta} \in \mathbb{R}^{Q \times 1}$ and $\boldsymbol{\omega} \in \mathbb{R}^{Q \times 1}$ have their number of components ($Q$) defined by the number of sensors, which is selected by the user so as to describe the desired radiation pattern well. Besides, each component of the desired radiation pattern $\mathbf{d}$ is defined between 0 and 1, in which 0 is used for nulls and 1 is used for the maximum value of the radiation pattern.
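The dimensions quoted above follow directly from $\lambda = c/f$, as the following Python sketch shows. The layout of the 16 dipoles as a 4 × 4 square grid in the xy-plane with spacing $s_d$ is an assumption made only to illustrate how a coordinates matrix could be assembled.

```python
import numpy as np

C_LIGHT = 299_792_458.0                 # speed of light in vacuum (m/s)
freq = 2.4e9                            # carrier frequency (Hz)
wavelength = C_LIGHT / freq             # about 0.125 m

dipole_length_cm = 0.5 * wavelength * 100.0
spacing_cm = 0.25 * wavelength * 100.0
print(f"{wavelength * 100:.2f} {dipole_length_cm:.2f} {spacing_cm:.2f}")
# prints "12.49 6.25 3.12"

# Assumed 4 x 4 grid in the xy-plane; columns of coords are (x, y, z) per dipole.
offsets = np.arange(4) * 0.25 * wavelength
x, y = np.meshgrid(offsets, offsets)
coords = np.stack([x.ravel(), y.ravel(), np.zeros(16)])
```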
As practical systems may suffer severe signal distortion due to nonlinearities at the transmitter high power amplifier, the nonlinearities are introduced, based on [32], through a model with coefficients $\rho_1$, $\rho_2$ and $\rho_3$. Note that $\rho_1 = 1.0$ and $\rho_2 = \rho_3 = 0.0$ corresponds to the linear case, since then $g_q[u] = \boldsymbol{\psi}_q^T \mathbf{i}[u]$.
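For illustration only, a memoryless third-order polynomial is one common way to parameterize such a distortion with three coefficients. The Python sketch here assumes that form; it is not necessarily the model of [32], and the coefficient values in the second call are hypothetical.

```python
import numpy as np

def nonlinear_pattern(psi_q, i, rho=(1.0, 0.0, 0.0)):
    """Polynomial distortion of the linear response (assumed form).

    psi_q: (P,) steering vector; i: (P,) electric currents;
    rho:   coefficients (rho_1, rho_2, rho_3).
    """
    x = psi_q @ i                                   # linear response psi_q^T i
    return rho[0] * x + rho[1] * x ** 2 + rho[2] * x ** 3

psi_q = np.array([1.0 + 1.0j, 0.5 - 0.5j])
i = np.array([0.2 + 0.0j, 0.0 + 0.1j])

g_linear = nonlinear_pattern(psi_q, i)                          # rho = (1, 0, 0)
g_distorted = nonlinear_pattern(psi_q, i, rho=(1.0, 0.1, 0.05)) # hypothetical
```

With $\rho_1 = 1$ and $\rho_2 = \rho_3 = 0$ the function returns the linear response unchanged, which matches the linear case noted in the text.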
The LMS and the CMM-RBF were implemented using Algorithms A2 and A3 (Appendix B), respectively. The vector of electric currents is initialized with $\mathbf{i}[0] = \mathbf{0} + j\mathbf{0}$ for both beamforming techniques. The free parameters of the CMM-RBF are initialized as follows: $\mathbf{W}[0] = \mathbf{0} + j\mathbf{0}$; each center vector of $\boldsymbol{\Gamma}[0]$ starts with a unique steering vector of $\boldsymbol{\Psi}$ divided by 10; and $\boldsymbol{\sigma}^2[0] = \mathbf{5} + j\mathbf{5}$. Also, as discussed in Section 4, $\mu = 0.1$ is used to bound the real and imaginary outputs of the neurons of the CMM-RBF. In order to represent the set of inputs (steering vectors) well in the nonlinear space of the proposed neural network, while maintaining a low computational complexity, we found by trial and error that $N = 4$ neurons are sufficient for the CMM-RBF. Also by trial and error, the adaptive steps of the LMS and the CMM-RBF were found to be around $\eta_l = 0.1$, $\eta_w = 0.7$, $\eta_\gamma = 0.5$, and $\eta_\sigma = 0.5$ for the uniform and circular array. On the other hand, for the 2D-grid array, we found that the adaptive step of the LMS should be reduced to $\eta_l = 0.035$. These adaptive steps were chosen to maximize the convergence rate and the HPBW, while keeping the side lobes of the radiation pattern below −20 dB.
The performance of the proposed CMM-RBF is evaluated against the LMS by means of the resulting MSE and HPBW for a specified boresight $\omega$. The MSE is computed as an average over 500 simulations for each $\omega$. In addition, over the same 500 simulations, the mean of the normalized radiation patterns was used to graphically illustrate the HPBW.
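The two figures of merit can be computed from sampled data as in the following Python sketch. It assumes a radiation pattern sampled at uniformly spaced azimuth angles over a full circle with a single main lobe, and it measures the HPBW as the angular span over which the normalized power stays at or above one half; the helper names and the synthetic pattern are hypothetical.

```python
import numpy as np

def mse_db(mse):
    """Mean squared error expressed in decibels."""
    return 10.0 * np.log10(mse)

def hpbw_deg(angles_deg, pattern):
    """Half-power beamwidth (degrees) of the main lobe of a sampled pattern."""
    power = np.abs(pattern) ** 2
    power = power / power.max()                  # normalize to the peak
    n = len(power)
    peak = int(np.argmax(power))
    step = angles_deg[1] - angles_deg[0]
    # Walk away from the peak on each side while the power is at least 1/2.
    left = 0
    while left < n and power[(peak - left - 1) % n] >= 0.5:
        left += 1
    right = 0
    while right < n and power[(peak + right + 1) % n] >= 0.5:
        right += 1
    return min((left + right) * step, 360.0)

# Synthetic single-lobe pattern pointing at 160 degrees.
angles = np.arange(0.0, 360.0, 1.0)
delta = np.deg2rad(angles - 160.0)
pattern = np.cos(delta / 2.0) ** 4               # amplitude; power is cos^8
width = hpbw_deg(angles, pattern)
```

The result is quantized to the angular step, so a finer sampling of the pattern gives a more precise beamwidth.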

Uniform and Circular Array
In this scheme, six dipoles of equal length are spatially distributed as shown in Figure 4. Both of the presented beamforming techniques share the same $Q = 10$ restrictions, given by the positions of the sensors listed in Table A1 (Appendix C). Restrictions were generalized for the boresight angle $\omega$.
Figure 5 presents the evolution of the simulated mean squared error (MSE) of the LMS and CMM-RBF algorithms for the UCA with $\omega = 160°$. As a fast convergence rate is extremely important for LEO satellites, it is assumed here that convergence for both algorithms is reached when the respective MSE drops below −35 dB. Notice that the MSE of both algorithms decreases similarly up to the second training epoch; after that, the MSE of the CMM-RBF decreases at a much faster rate. Therefore, the CMM-RBF achieves a faster convergence rate and delivers a 2.3 dB lower residual MSE (after only three epochs) in comparison with the LMS.
Figure 6 shows the radiation diagram of the LMS and CMM-RBF algorithms for $\theta = 90°$ and $\omega = 160°$. Both algorithms presented similar performance; however, the CMM-RBF enhanced the HPBW by 4.12° in comparison with the LMS.
In addition, maintaining the same initialization scheme but varying the boresight angle from 0° to 360°, in steps of 10°, 500 simulations were performed for each boresight. The mean HPBW and MSE of each boresight are used to compute the statistical results presented in Table 1. The proposed CMM-RBF algorithm is able to enhance the HPBW of the antenna array by about 4.15° when operating under the same conditions as the LMS, taking into account the nonlinearities of the transmitter power amplifier.

2D-Grid Array
In this scheme, 16 dipoles of equal length are spatially distributed in a square grid, as shown in Figure 7. The $Q = 10$ restrictions are listed in Table A2 (Appendix C).
Moreover, varying the boresight angle from 0° to 360°, in steps of 10°, 500 simulations were also performed for each boresight and for both architectures, as in Table 1. Results for the mean and standard deviation are presented in Table 2. As in the UCA, the proposed CMM-RBF algorithm is able to enhance the HPBW of the 2D-grid array by about 4.15° when operating under the same conditions as the LMS, taking into account the nonlinearities of the transmitter power amplifier. In 77.78% of the simulations, the CMM-RBF was able to keep the side lobes of the radiation pattern below −20 dB; on the other hand, the LMS only achieved this condition in 44.45% of the simulations.

Conclusions
This work presented a novel artificial neural network beamforming scheme for wireless transmission affected by nonlinearities of the transmitter power amplifier. The proposed complex MIMO RBFNN (CMM-RBF) is an extension of the PTRBFNN, used for channel equalization, which is able to handle multiple complex-valued outputs, keeping the phase transmittance information. With the proposed architecture it is possible to simultaneously achieve fast convergence rate, lower MSE and enhanced HPBW in comparison with the LMS in nonlinear scenarios.
The performance of the proposed approach was compared with the LMS for beamforming with a uniform and circular array (6 dipoles of equal length) and a 2D-grid array (16 dipoles of equal length), operating at 2.4 GHz. A set of 36 boresight angles was evaluated and from each generated radiation diagram the respective HPBW was obtained; besides, for each boresight the convergence rate was estimated for the minimum number of epochs possible, in order to obtain a faster tracking.
The proposed MIMO artificial neural network architecture proved to be robust, independent of the boresight angle, achieving faster convergence rate in only three training epochs (after crossover with the LMS) and reducing the MSE by about 2 dB when compared with the LMS algorithm. As for the HPBW, the results obtained with the CMM-RBF are 4.15° better than with the LMS. As the nonlinear behavior of the transmitter power amplifier becomes more prominent when the number of antennas is increased and when operating with a more complex architecture (2D-GA), the LMS beamforming presented a poor performance regarding side lobe restrictions, correctly operating in less than half of the cases. Conversely, the CMM-RBF achieved the side lobe restrictions in more than three quarters of the simulations.
The proposed algorithm finds potential applications in some configurations of the next-generation wireless systems and in satellite communications. For LEO satellites, the CMM-RBF can be implemented using low-power graphical processing units (LPGPUs), taking advantage of the parallelism of the neurons. Thus, in the proposed architecture the CMM-RBF can work with low power consumption, with the ability to handle the distortions of nonlinear power amplifiers while maintaining a fast convergence rate. It should be emphasized that a fast convergence characteristic is extremely important for LEO satellites, since they are non-stationary and orbit at low altitudes.

Appendix A
Proof. By the properties of nonnegativity and definiteness, which state that the norm is always nonnegative (see [33], p. 46), let $d = \|\mathbf{x} - \mathbf{y}\|_2 \geq 0$. Also, let $g(d) = -d^2/z$, where $z$ is a constant scalar. As $g(d)$ is continuous for any $z > 0$ (see [34], p. 43), its natural domain is $(-\infty, +\infty)$. In order to define the extremes of the image of $g(d)$, it is necessary to assess the extremes of the natural domain and to apply the second derivative test theorem (see [34], p. 186) to $g(d)$. Firstly, we verify the extremes of $g(d)$ for $z > 0$:
$$\lim_{d \to \pm\infty} g(d) = \lim_{d \to \pm\infty} \left(-\frac{d^2}{z}\right) = -\infty.$$
We then analyze the first derivative, equating its result to zero:
$$\frac{\mathrm{d}g(d)}{\mathrm{d}d} = -\frac{2d}{z} = 0 \;\Rightarrow\; d = 0,$$
which implies that $g(d)$ has only one point of minimum, maximum, or inflection. Then, verifying the second derivative of $g(d)$ for $d = 0$:
$$\left.\frac{\mathrm{d}^2 g(d)}{\mathrm{d}d^2}\right|_{d=0} = -\frac{2}{z} < 0,$$
thus, $d = 0$ is the maximum point of $g(d)$, which yields $g(0) = 0$ and, consequently, $g(d) \leq 0$. Finally, with $h(g) = \exp(g(d))$, we can verify the image of $h(g)$:
$$\lim_{d \to \pm\infty} h(g(d)) = 0, \qquad h(g(0)) = \exp(0) = 1,$$
which yields $h(g(d)) \in (0, 1]\ \forall d \in \mathbb{R}$ and $z > 0$.