electronics Dimensioning an FPGA for Real-Time Implementation of State of the Art Neural Network-Based HPA Predistorter

Abstract: Orthogonal Frequency Division Multiplexing (OFDM) is one of the key modulations for current and novel broadband communications standards. For example, Multi-band Orthogonal Frequency Division Multiplexing (MB-OFDM) is an excellent choice for the ECMA-368 Ultra Wideband (UWB) wireless communication standard. Nevertheless, the high Peak to Average Power Ratio (PAPR) of MB-OFDM UWB signals reduces the power efficiency of the key element in mobile devices, the High Power Amplifier (HPA), because of the non-linear distortion caused by the saturation of the HPA. To deal with this limitation, a new and efficient pre-distorter scheme using Neural Networks (NNs) is proposed and implemented on a Field Programmable Gate Array (FPGA). This solution, based on pre-distortion of the HPA non-linearities, offers a good trade-off between complexity and performance. Tests and validation have been conducted on two types of HPA: Travelling Wave Tube Amplifiers (TWTA) and Solid State Power Amplifiers (SSPA). The results show that the proposed pre-distorter design presents low complexity and a low error rate. Indeed, the implemented architecture uses 10% of the DSP (Digital Signal Processing) blocks and 1% of the LUTs (Look-Up Tables) in the case of the SSPA, whereas it only uses 1% of the LUTs in the case of the TWTA. In addition, the results allow us to conclude that advanced machine learning techniques can be efficiently implemented in hardware with an adequate design.


Introduction
Ultra Wideband (UWB) technology has been deployed for broadband wired and wireless communications since February 2002, when the Federal Communications Commission (FCC) issued a regulation authorizing the use of UWB technology for consumer telecommunications in the United States. Once a 7.5 GHz frequency band not subject to licensing (FCC 02-48) had been allotted, the FCC welcomed very high data rate (beyond Gbps) wireless communications. UWB was originally linked to carrier-free waveforms built from very short pulses [1,2]. A simple accepted definition is that signals having a fractional bandwidth FB ≥ 0.25 with a frequency bandwidth ≥500 MHz can be considered UWB [2].
OFDM is a modulation technology that guarantees orthogonality in the frequency domain, since it uses sinusoidal basis functions [3]. A cyclic prefix or zero padding is added to each symbol, which makes it possible to avoid the inter-symbol interference (ISI) due to the multipath channel. However, if the orthogonality between the sub-carriers is lost, inter-carrier interference (ICI) results, and the OFDM system performance is degraded. To decrease the impact of these problems, and to make wireless connectivity possible between devices within a personal area network, a Multi-band Orthogonal Frequency Division Multiplexing (MB-OFDM) system has been proposed in [4]. At the same time, to meet the European Conformity (CE) requirements and the current needs of the multimedia industry in the field of wireless personal area networks (WPAN) with very high data rates, or for use in future Wireless Universal Serial Bus (WUSB/IEEE 1394) [5], different standards have been implemented: Multi-band OFDM Alliance (MBOA), WiMedia and ECMA-368 [6]. In this paper, the ECMA-368 standard is used. It specifies the MB-OFDM scheme to transmit information over a wireless personal area network. However, like all OFDM-based communications systems, ECMA-368 suffers from a large Peak to Average Power Ratio (PAPR), a drawback that reduces the efficiency of the High Power Amplifier (HPA) at the transmitter.
In the ECMA-368 system, very efficient amplifiers are used, mainly of two different types, namely, Traveling Wave Tube Amplifiers (TWTA) and Solid State Power Amplifiers (SSPA). Unfortunately, these amplifiers are highly non-linear and thus, in order to avoid distortion of the signal, large back-offs are needed. As a result, the efficiency is substantially reduced. The non-linearity leads to in-band distortion, which increases the bit error rate (BER) [7], and to out-of-band spectral regrowth, which increases adjacent channel interference. Numerous methods have been proposed to overcome the high PAPR problem in OFDM signals [8,9]. Among these techniques, the HPA pre-distorter [10] is one of the most promising because it avoids an increase in transmit energy, it does not need side information (which would reduce efficiency), it only needs to be applied at the transmitter, which eases the implementation, and it does not increase the BER, among other advantages.
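To make the PAPR issue concrete, the following minimal Python sketch builds one OFDM symbol from random QPSK sub-carrier values and measures its PAPR. The 128 sub-carriers, the QPSK mapping, the random seed and the helper names `ofdm_symbol` and `papr_db` are assumptions chosen for illustration; they are not taken from the simulation chain used in this paper.

```python
import cmath
import math
import random


def ofdm_symbol(freq_symbols):
    """Naive inverse DFT: frequency-domain symbols -> time-domain samples."""
    n = len(freq_symbols)
    return [
        sum(X * cmath.exp(2j * math.pi * k * t / n)
            for k, X in enumerate(freq_symbols)) / n
        for t in range(n)
    ]


def papr_db(samples):
    """Peak-to-average power ratio of a block of complex samples, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


random.seed(0)  # illustrative seed, only to make the run repeatable
# Assumption: 128 sub-carriers, each carrying one unit-power QPSK point.
qpsk = [complex(random.choice((-1, 1)), random.choice((-1, 1))) / math.sqrt(2)
        for _ in range(128)]
time_samples = ofdm_symbol(qpsk)

print(f"PAPR of the OFDM symbol: {papr_db(time_samples):.2f} dB")
# A constant-envelope sequence such as the QPSK points themselves has a
# PAPR close to 0 dB, which is why single-carrier signals stress the HPA less.
```

The sum of many independently modulated sub-carriers occasionally adds up coherently, which produces the large peaks that push the HPA into saturation.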
It should be noted that most of the results and designing parameters in this paper can also be extended to other OFDM standards [11][12][13][14].
TWTA and SSPA are the most efficient amplifiers, and are the two main amplifier choices for space-based RF communications. They are generally more advantageous for higher power and higher frequencies, at a reduced size and cost, and with improved thermal performance. Unfortunately, these amplifiers are highly non-linear [15]; hence, a large back-off [16] is needed to mitigate the signal distortion. As a consequence, the efficiency is substantially reduced.
To avoid a large back-off, several techniques have been used to reduce the power envelope fluctuations (PAPR) [8,17]. However, the non-linearity of the HPA [18] still provokes signal distortion, compromising the system performance. This non-linearity can be characterized in amplitude by the Amplitude Modulation/Amplitude Modulation (AM/AM) function and in phase by the Amplitude Modulation/Phase Modulation (AM/PM) function [19]. In addition, those techniques, as indicated earlier, increase the transmission power due to the expansion of the transmitted constellation.
In order to overcome this problem, we can pre-distort the signal before the HPA so as to linearize it, i.e., to linearize the AM/AM and AM/PM characteristics of the HPA. For a better understanding of the effects and issues of AM/AM and AM/PM distortion, see [20] and the references therein.
Artificial NNs are being successfully applied to a wide range of complex computational problems [21,22]. Among the different NN architectures [23], the Multilayer Perceptron (MLP) is the most widely used. It maps a set of input data to an appropriate set of outputs by using supervised training techniques. In this paper, we develop a very simple pre-distorter architecture based on two MLPs for the AM/AM and AM/PM conversions, respectively. Two HPA models, namely TWTA and SSPA [24], are then tested. Finally, the implementation of the proposal on an FPGA is described and analyzed, showing that it is well suited to future wireless communications systems [25].
The remainder of this paper is organized as follows. Section 2 describes the NN models for HPA pre-distortion. Section 3 presents the implementation of our proposed design on FPGA. Then, results are detailed in Section 4. Finally, Section 5 summarizes the conclusion of the paper.

NN Models for HPA Pre-Distortion
First, in this section, the signal model and the NN design are described and analyzed.

Transmitted Signal
The transmitted radio frequency ECMA-368 signal can be mathematically described as [6]
$$x(t) = \Re\left\{ \sum_{n=0}^{N_{pq}-1} s_n(t - nT_{SYM}) \exp\left(j 2\pi F_{q(n)} t\right) \right\}$$
where $\Re\{\cdot\}$ stands for the real part of the signal, $s_n(t)$ is the complex baseband representation of the $n$-th symbol, $T_{SYM}$ is the symbol duration, $N_{pq}$ is the number of symbols in every packet, $F$ is the center frequency, and $q(n)$ is the function mapping the $n$-th symbol to the appropriate frequency.
Once the signal has been described, the pre-distortion scheme and the proposal will be presented. It should be highlighted that, although this paper uses the ECMA-368 standard for describing the signal and obtaining the results, all the recommendations and analysis can be easily extrapolated to other MB-OFDM, or even OFDM, systems and standards, which makes the contribution of this paper very valuable.

HPA Pre-Distortion Concept
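The model for an HPA can be characterized by the AM/AM and the AM/PM distortion, as can be seen in Figure 1. The output y(t) of the non-linear amplifiers TWTA and SSPA corresponding to the input x(t) is expressed as
$$y(t) = AM(|x(t)|) \exp\left(j\left[\varphi(x(t)) + PM(|x(t)|)\right]\right)$$
where AM(.) and PM(.) are, respectively, the AM/AM and AM/PM distortion functions, and $\varphi(x(t))$ is the phase of x(t). According to Saleh's model [26], the AM/AM and AM/PM conversions can be written as
$$AM(|x(t)|) = \frac{\alpha_a |x(t)|}{1 + \beta_a |x(t)|^2}, \qquad PM(|x(t)|) = \frac{\alpha_\varphi |x(t)|^2}{1 + \beta_\varphi |x(t)|^2}$$
where $\alpha_a$ is the small signal gain, $A_{sat} = 1/\sqrt{\beta_a}$ is the input saturation voltage of the TWTA, $A_{max} = \alpha_a A_{sat}/2$ is the maximum output amplitude, and $\alpha_\varphi$ and $\beta_\varphi$ are fitted to the phase characteristic of the amplifier. The modified Rapp model [27] is used instead for the SSPA case, where the AM/AM and AM/PM conversions are given as
$$AM(|x(t)|) = \frac{g |x(t)|}{\left(1 + \left(\frac{g |x(t)|}{A_{sat}}\right)^{2s}\right)^{\frac{1}{2s}}}, \qquad PM(|x(t)|) = \frac{\alpha |x(t)|^{c_1}}{1 + \left(\frac{|x(t)|}{\beta}\right)^{c_2}}$$
where $g$ is the small signal gain, $s$ is a smoothness factor, $A_{sat}$ is the saturation level, with a similar meaning as in the TWTA, and the parameters $\alpha$, $\beta$, $c_1$ and $c_2$ are adjusted to match the amplifier's characteristics. A common parameter in HPAs is the Input Back Off (IBO), which is defined as
$$IBO = 10 \log_{10}\left(\frac{P_{sat}}{P_{avg}}\right) \tag{7}$$
where $P_{sat}$ represents the saturation input power and $P_{avg}$ denotes the average input power. The IBO accounts for how much the power needs to be reduced to obtain an output signal with a low level of distortion, and it is usually treated as a loss in link budget analysis. These operating principles are summarized in Figure 1.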
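The two amplifier models and the IBO definition can be explored numerically with a short Python sketch. It assumes the standard textbook forms of Saleh's model and of the modified Rapp model; all default parameter values below are hypothetical placeholders rather than the fitted values of the amplifiers studied in this paper, and the function names are chosen only for illustration.

```python
import cmath
import math


def saleh_am_am(r, alpha_a=2.0, beta_a=1.0):
    """Saleh AM/AM (TWTA): output amplitude for input amplitude r."""
    return alpha_a * r / (1.0 + beta_a * r * r)


def saleh_am_pm(r, alpha_p=math.pi / 3, beta_p=1.0):
    """Saleh AM/PM (TWTA): phase shift in radians for input amplitude r."""
    return alpha_p * r * r / (1.0 + beta_p * r * r)


def rapp_am_am(r, g=1.0, a_sat=1.0, s=2.0):
    """Modified Rapp AM/AM (SSPA) with smoothness factor s."""
    return g * r / (1.0 + (g * r / a_sat) ** (2 * s)) ** (1.0 / (2 * s))


def rapp_am_pm(r, alpha=-0.1, beta=1.0, c1=2.0, c2=2.0):
    """Modified Rapp AM/PM (SSPA); alpha, beta, c1, c2 are fit parameters."""
    return alpha * r ** c1 / (1.0 + (r / beta) ** c2)


def hpa_output(x, am_am, am_pm):
    """Apply y = AM(|x|) * exp(j * (phase(x) + PM(|x|))) to one sample."""
    r, phi = cmath.polar(x)
    return cmath.rect(am_am(r), phi + am_pm(r))


def ibo_db(p_sat, p_avg):
    """Input back-off in dB from saturation and average input power."""
    return 10 * math.log10(p_sat / p_avg)


x = cmath.rect(0.8, 0.4)  # one illustrative complex sample
print("TWTA output:", hpa_output(x, saleh_am_am, saleh_am_pm))
print("SSPA output:", hpa_output(x, rapp_am_am, rapp_am_pm))
print("IBO for P_sat = 1.0, P_avg = 0.25:", round(ibo_db(1.0, 0.25), 2), "dB")
```

With these placeholder values the Saleh AM/AM curve peaks at an input amplitude of 1 and then falls, while the Rapp curve flattens towards its saturation level, which is the qualitative difference between the two amplifier families.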
The concept of pre-distortion is to compensate the AM/AM and AM/PM distortion with an inverse non-linearity. With the formulated AM/AM and AM/PM pre-distortion functions, the amplifier input, i.e., the pre-distorter output, can be written as
$$z(t) = AM^{-1}(|x(t)|) \exp\left(j\left[\varphi(x(t)) + PM^{-1}(|x(t)|)\right]\right)$$
where $AM^{-1}(|x(t)|)$ and $PM^{-1}(|x(t)|)$ are the AM/AM and the AM/PM pre-distorter functions, respectively. It should be highlighted here that, although these pre-distortion schemes allow the use of a lower IBO than other techniques, an IBO is still needed to absorb the near-flat part of the curve due to the hard saturation in the AM/AM characteristic (see Figure 2). In the case of the amplifiers in Figure 2, it should be around 4 dB of IBO for the SSPA (worst case), around 2 dB for TWTA2 and 1 dB for TWTA1, while other techniques would probably need an IBO of around 8-10 dB. Of course, this depends on the specific HPA characteristics and the hard saturation point. Since a minimum IBO is needed, the dynamic range of the HPA is reduced, and the higher the IBO, the lower the dynamic range. Because pre-distortion techniques allow the IBO to be lowered, the dynamic range can be larger than with other techniques.
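As an illustration of the inverse non-linearity, the sketch below inverts a Saleh-type AM/AM curve in closed form and checks that pre-distorter plus HPA behave as a unit-gain linear stage up to saturation. It assumes the standard Saleh form with hypothetical parameters (`ALPHA_A`, `BETA_A`, `ALPHA_P`, `BETA_P`) and is an ideal analytic reference, not the NN pre-distorter proposed in this paper.

```python
import cmath
import math

# Hypothetical Saleh parameters, chosen so that the output saturates at 1.
ALPHA_A, BETA_A = 2.0, 1.0
ALPHA_P, BETA_P = math.pi / 3, 1.0
A_MAX = ALPHA_A / (2.0 * math.sqrt(BETA_A))  # maximum output amplitude


def am_am(r):
    return ALPHA_A * r / (1.0 + BETA_A * r * r)


def am_pm(r):
    return ALPHA_P * r * r / (1.0 + BETA_P * r * r)


def am_inverse(u):
    """Input amplitude r such that am_am(r) == u, for 0 <= u <= A_MAX.

    Solves beta*u*r^2 - alpha*r + u = 0 for the smaller root, written in
    a form that avoids cancellation for small u. Amplitudes above A_MAX
    cannot be reached by the HPA and are clipped.
    """
    u = min(u, A_MAX)
    disc = max(ALPHA_A ** 2 - 4.0 * BETA_A * u * u, 0.0)
    return 2.0 * u / (ALPHA_A + math.sqrt(disc))


def predistort(x):
    """Ideal pre-distorter: inverse amplitude, opposite phase rotation."""
    r, phi = cmath.polar(x)
    r_pd = am_inverse(r)
    return cmath.rect(r_pd, phi - am_pm(r_pd))


def hpa(z):
    r, phi = cmath.polar(z)
    return cmath.rect(am_am(r), phi + am_pm(r))


x = cmath.rect(0.5, 0.3)
print("without pre-distortion, error:", abs(hpa(x) - x))
print("with pre-distortion, error:   ", abs(hpa(predistort(x)) - x))
```

The clipping in `am_inverse` shows why a residual IBO is still required: input amplitudes above the saturation level cannot be linearized by any pre-distorter.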

NN Pre-Distorter Architecture
The main idea behind pre-distortion is to introduce inverse non-linearities that compensate the AM/AM and AM/PM distortion of the HPA. In order to ease the design of the pre-distorter and to accelerate the identification, two MLP NNs are proposed for $AM^{-1}(|x(t)|)$ and $PM^{-1}(|x(t)|)$. The first one synthesizes the AM/AM pre-distortion function while the second one synthesizes the AM/PM pre-distortion function.
Each MLP NN is trained off-line using the Levenberg-Marquardt algorithm [28]. Once trained, it is ready for continuous, real-time operation. The maximum number of epochs is set to 1000, and the target mean squared error (MSE) is fixed to be less than or equal to 1E-6. The off-line training process is depicted in Figure 3 and proceeds according to the following steps:

1. Decompose the original signal x(t) into magnitude |x(t)| and phase angle φ(x(t)).

2. Apply the HPA AM/AM conversion function to the original magnitude |x(t)| to obtain the HPA magnitude AM(|x(t)|).

3. Apply the HPA AM/PM conversion function to the original magnitude |x(t)| to obtain the HPA phase angle PM(|x(t)|).

7. Finally, using the NN models, the pre-distorter magnitude ($AM^{-1}(|x(t)|)$) and phase angle ($PM^{-1}(|x(t)|)$) signals are generated.

Following off-line training, both NN models are placed before the high-power amplifier, as shown in Figure 4. We would like to highlight that the example amplifiers used in this paper are only meant to obtain results and validate the performance of the proposed pre-distorter scheme. Once trained with the specific amplifier response, the NN models are able to pre-distort the input signal adequately. It is worth noting that the amplifier's response can vary over time. However, if the NNs have been trained with enough data covering these variations, they will be able to follow such changes in real time. Obviously, there is a limit to the changes that can be tracked, but the scheme is robust enough for normal operation.
Each MLP pre-distorter used in Figures 3 and 4 consists of three layers, as illustrated in Figure 5.
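Steps 1 to 3 of the training procedure can be sketched in a few lines of Python. The sketch only prepares the per-sample quantities named in those steps; the record layout, the function name `build_training_records` and the toy amplifier curves in the usage lines are hypothetical, and the way these quantities are then fed to the Levenberg-Marquardt training is not shown.

```python
import cmath


def build_training_records(samples, am_am, am_pm):
    """Per-sample quantities for steps 1-3 of the off-line training.

    samples : iterable of complex baseband samples x
    am_am   : callable, HPA AM/AM conversion applied to |x|
    am_pm   : callable, HPA AM/PM conversion applied to |x|
    """
    records = []
    for x in samples:
        magnitude, phase = cmath.polar(x)            # step 1
        records.append({
            "magnitude": magnitude,
            "phase": phase,
            "hpa_magnitude": am_am(magnitude),       # step 2
            "hpa_phase": am_pm(magnitude),           # step 3
        })
    return records


# Usage with toy (hypothetical) amplifier curves.
toy_am_am = lambda r: 2.0 * r / (1.0 + r * r)
toy_am_pm = lambda r: 0.5 * r * r / (1.0 + r * r)
for rec in build_training_records([0.3 + 0.4j, -1.0 + 0.0j], toy_am_am, toy_am_pm):
    print(rec)
```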

• The input layer: receives the input signal of the system.
• The hidden layer: contains the neurons with a non-linear activation function.
• The output layer: combines the hidden-layer outputs through a linear neuron.
Mathematically, each neuron output can be expressed as
$$a = f\left(\sum_{i=1}^{L} W_i P_i + b\right)$$
where P is the input vector $P = (P_1, P_2, P_3, \ldots, P_L)^T$, W is the set of synaptic connections, also known as the set of weights, $W = (W_1, W_2, W_3, \ldots, W_L)^T$, these weights multiply the input to give WP, and b is the bias added to WP. For the MLP NN hidden layer, L = 1 is the number of inputs of each neuron, and the activation is the triangular basis function f(u) = tribas(u) for the TWTA and the radial basis function $f(u) = e^{-u^2}$ for the SSPA. For the MLP NN output layer, L = 4 in the case of the TWTA and L = 2 in the case of the SSPA, and f(u) = u is a linear function.
Since the radbas function requires an exponential calculation, the authors of [29,30] proposed an approximation of the exponential function using a Taylor series. It has been shown to consume fewer FPGA resources and to require no memory blocks. In this paper, this approximation is adopted for the SSPA pre-distorter.
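The forward pass of such a network is small enough to write out directly. The Python sketch below evaluates a one-input MLP with a tribas or radbas hidden layer and a linear output neuron, taking tribas(u) = max(0, 1 − |u|) as the triangular basis function. All weights and biases are hypothetical values for illustration; the trained values used in this paper are not reproduced here.

```python
import math


def tribas(u):
    """Triangular basis function: 1 - |u| inside [-1, 1], zero outside."""
    return max(0.0, 1.0 - abs(u))


def radbas(u):
    """Radial basis function exp(-u^2)."""
    return math.exp(-u * u)


def neuron(inputs, weights, bias, activation):
    """Single neuron: activation applied to the weighted sum plus bias."""
    return activation(sum(w * p for w, p in zip(weights, inputs)) + bias)


def mlp_forward(p, hidden_w, hidden_b, out_w, out_b, activation):
    """One scalar input, one hidden layer (L = 1 per neuron), linear output."""
    hidden = [neuron([p], [w], b, activation)
              for w, b in zip(hidden_w, hidden_b)]
    return neuron(hidden, out_w, out_b, lambda u: u)


# TWTA-style network: 4 tribas hidden neurons (hypothetical weights).
twta_out = mlp_forward(0.5,
                       hidden_w=[1.0, 0.5, -1.0, 2.0],
                       hidden_b=[0.0, 0.25, 0.5, -1.0],
                       out_w=[0.5, 0.5, 0.25, 0.25], out_b=0.1,
                       activation=tribas)

# SSPA-style network: 2 radbas hidden neurons (hypothetical weights).
sspa_out = mlp_forward(0.0,
                       hidden_w=[1.0, -1.0], hidden_b=[0.0, 0.0],
                       out_w=[0.5, 0.5], out_b=0.0,
                       activation=radbas)
print(twta_out, sspa_out)
```

Per input sample, the TWTA network needs four hidden multiply-adds and a four-input output sum, whereas the SSPA network has only two hidden neurons but must evaluate an exponential in each of them.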
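A truncated Taylor polynomial for the radial basis function can be sketched as follows. The polynomial order (8), the input range [0, 1] and the Horner-style evaluation are assumptions made for this illustration; the exact approximation of [29,30] may differ.

```python
import math


def exp_neg_taylor(v, order=8):
    """Truncated Taylor series of exp(-v), evaluated in Horner form.

    Uses only multiplications, additions and divisions by small
    constants, so no exponential unit or look-up memory is needed.
    """
    result = 1.0
    for k in range(order, 0, -1):
        result = 1.0 - (v / k) * result
    return result


def radbas_taylor(u, order=8):
    """Approximation of the radial basis function exp(-u^2)."""
    return exp_neg_taylor(u * u, order)


# Worst-case error on a grid over the assumed input range [0, 1].
grid = [i / 100.0 for i in range(101)]
max_err = max(abs(radbas_taylor(u) - math.exp(-u * u)) for u in grid)
print("maximum absolute error on [0, 1]:", max_err)
```

In hardware, the divisions by k become multiplications by pre-computed constants, so the cost is essentially one multiplier per retained term. The error grows quickly outside the assumed range, so the input scaling has to keep u within it.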

FPGA Implementation
To implement the proposed NN pre-distorters, an FPGA has been used for the benefits it offers [31,32]. Indeed, an FPGA enables a higher sampling frequency, tolerates higher data rates and provides real-time processing [33]. Since the training of the NN pre-distorters is carried out off-line, only the real-time part, i.e., their layers, is implemented on the FPGA, without the need to implement the learning algorithm. Figures 6 and 7 show the neural network architecture of the AM/AM pre-distorter for the TWTA and the SSPA, respectively, implemented using Xilinx System Generator [34]. For the proposed implementations, a signed fixed-point representation has been adopted, allowing better computational speed and minimal resource consumption at the expense of a small degradation. The number of bits has been optimized to obtain the best trade-off between speed, space and degradation. Each sample is encoded on 16 bits: one sign bit, 5 bits for the integer part and 10 bits for the fractional part. It should be noted that the number of bits at the transmitter side is not usually a problem, and it is fixed to the maximum number of bits of the Digital to Analog Converter (DAC) to maximize the dynamic range of the transmitted signal. For the AM/PM pre-distorter, the same NN architecture implementation is adopted; the only difference is that the weights and biases take different values for the TWTA and the SSPA, respectively. Complex multipliers are needed in the output layer; each one is implemented as a pipelined architecture of four real-valued multipliers, as a trade-off between complexity, efficiency and resources.
To evaluate the performance of our implementation, several metrics have been used, namely the FPGA resource consumption, the bit error rate and the power spectral density. Table 1 shows the resources consumed on Virtex-4, Virtex-5 and Virtex-6 FPGAs. The table also shows the maximum supported frequency (which stands for the maximum throughput supported by the designs). From Table 1, it can be concluded that the TWTA pre-distorter is faster and consumes fewer resources than the SSPA pre-distorter. This can be explained by the use of the radial basis function, which relies on an approximation of the exponential function and leads to slower and more complex calculations. Since the HPA is usually imposed by the application, we need to guarantee that both designs can be efficiently implemented. Table 2 and Figure 11 show the power consumption of the TWTA and SSPA pre-distorters. The TWTA pre-distorter consumes less power than the SSPA pre-distorter. It can also be seen in Figure 11 that the newer FPGA architectures (Spartan) are more efficient than older ones, and our proposal can better exploit the optimization characteristics of these devices. It is especially relevant that, on Spartan-6 boards, the power consumption of our proposal could even be neglected, which is a relevant contribution in this context. The power estimation has been obtained with Xilinx Power Analyzer, using simulation activity files (SAIF or VCD) for accurate power analysis. The proposed MLP NNs have been simulated and implemented on FPGA, following the ECMA-368 standard. Unless otherwise indicated, the IBO has been fixed to 2 dB, a very optimistic value for realistic systems. Figure 12 shows the transmitted constellation after the TWTA and the SSPA without pre-distortion (warped constellation) and when using our proposal (close to the original constellation).
It can be seen that, even in these tough conditions, the proposed design is able to work properly. From Figure 12, it can be concluded that, using the proposed NN pre-distorters, the constellation is close to that of the original signal without distortion, which greatly improves the bit error rate. To make sure that our implementation of both designs does not affect the performance of the ECMA-368 wireless communication system, the Bit Error Rate (BER) is also obtained and analyzed for the standard channels CM1, CM2, CM3 and CM4. To do so, a JTAG hardware co-simulation [35] is used to accelerate the simulation of the whole implemented designs on the FPGA platform. In Figure 13, only the proposed NNs have been implemented in hardware while the rest is simulated in software. The software transmits a data frame to the hardware at each clock cycle for processing. For fast transmission, software and hardware communicate through a JTAG or Ethernet cable. As shown in Figures 14-17, the system performance undergoes no appreciable degradation for either the TWTA or the SSPA pre-distorter over the channels CM1, CM2, CM3 and CM4 with respect to the ECMA-368 standard. The mean input power used to generate these figures was 22.5 dBm. The transmit power was kept constant and the noise power was varied. The input saturation power of the HPA was 30 dBm. As can be observed in the figures, the degradation is lower than 0.4 dB for lower data rates and less than 0.1 dB for higher data rates. These results support two conclusions. First, the optimization carried out in terms of number of bits works, because the degradation is very low. Second, it is possible to implement a real-time, high data rate pre-distorter using an FPGA. It can be seen that even at a very high data rate of nearly 500 Mb/s the system works properly. In Figure 18, the power spectral density (PSD) of the HPA output signal with and without NN pre-distorters is shown.
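The 16-bit word format and the four-multiplier complex product can be modelled in software before committing to hardware. The Python sketch below assumes a two's-complement word with saturation on overflow and truncation after each product; these choices, like the function names, are assumptions for illustration and not a description of the System Generator blocks actually used.

```python
FRAC_BITS = 10                         # fractional bits
TOTAL_BITS = 16                        # 1 sign + 5 integer + 10 fractional
SCALE = 1 << FRAC_BITS                 # 1024 steps per unit
Q_MIN = -(1 << (TOTAL_BITS - 1))       # -32768 -> -32.0
Q_MAX = (1 << (TOTAL_BITS - 1)) - 1    #  32767 -> 31.9990234375


def saturate(raw):
    """Clamp an integer to the representable 16-bit range (assumption)."""
    return max(Q_MIN, min(Q_MAX, raw))


def to_fixed(value):
    """Quantize a real value to the 16-bit fixed-point grid."""
    return saturate(int(round(value * SCALE)))


def to_float(raw):
    """Convert a fixed-point integer back to a real value."""
    return raw / SCALE


def fixed_mul(a, b):
    """Fixed-point product; >> truncates towards minus infinity."""
    return saturate((a * b) >> FRAC_BITS)


def complex_mul(ar, ai, br, bi):
    """(ar + j*ai) * (br + j*bi) using four real multipliers."""
    real = saturate(fixed_mul(ar, br) - fixed_mul(ai, bi))
    imag = saturate(fixed_mul(ar, bi) + fixed_mul(ai, br))
    return real, imag


print("0.3 is stored as", to_float(to_fixed(0.3)))
re, im = complex_mul(to_fixed(1.0), to_fixed(2.0), to_fixed(3.0), to_fixed(4.0))
print("(1+2j)*(3+4j) ->", to_float(re), to_float(im))
```

With 10 fractional bits the quantization step is 1/1024, so the rounding error of a stored sample is at most 1/2048, and values outside [−32, 32) saturate.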
It can be observed that by using our proposals, the PSD regrowth is negligible. In fact, it is about 5 dB for the TWTA, whereas it is 7 dB for the SSPA, which is very reduced compared to the original signal.

Conclusions
In this paper, novel and efficient architectures for HPA non-linearity pre-distortion have been designed, optimized and implemented on FPGA. They have been tested with two types of HPA: TWTA and SSPA. To evaluate the performance of the implemented designs, three metrics were used: resource consumption, bit error rate and power spectral density. With the proposed pre-distorters, the modulation constellation is not modified with respect to the original while respecting the demodulation slicer. In addition, the resource consumption is low, about 1% in the case of the TWTA, which makes it feasible to implement the pre-distorter together with the rest of the ECMA-368 transmission chain on the same FPGA. To make sure that our implementations do not degrade the performance of the wireless communication standard, we carried out a bit error rate simulation and plotted the power spectral density. The results show that the system does not undergo any BER degradation for either the TWTA or the SSPA pre-distorter, with negligible PSD regrowth. As future work, this proposal can be of interest for 5G and beyond communication systems.