Joint Satellite-Transmitter and Ground-Receiver Digital Pre-Distortion for Active Phased Arrays in LEO Satellite Communications

A novel joint satellite-transmitter and ground-receiver (JSG) digital pre-distortion (DPD) technique, termed JSG-DPD, is proposed to improve the linearity and power efficiency of space-borne active phased arrays (APAs) in low Earth orbit (LEO) satellite communications. Unlike the conventional DPD technique, which requires a complex RF feedback loop, the DPD coefficients based on a generalized memory polynomial (GMP) model are extracted at the ground-receiver and then transmitted to the digital baseband front-end of the LEO satellite-transmitter via a satellite–ground bi-directional transmission link. The effect of the additive white Gaussian noise (AWGN) of the satellite–ground channel on the extraction of the DPD coefficients is tackled using a superimposed training sequences (STS) method. The proposed technique has been experimentally verified using a 28 GHz phased array. The improvements in error vector magnitude (EVM) and adjacent channel power ratio (ACPR) are 7.5% and 3.6 dB, respectively. Requiring limited space-borne resources, this technique offers a promising solution for APA DPD in LEO satellite communications.


Introduction
With the rapid rise of fifth-generation (5G) millimeter-wave (mmWave) communications and broadband low Earth orbit (LEO) satellites [1,2], active phased array (APA) antennas have seen unprecedented development in recent years [3–5]. Compared with conventional antennas, APA antennas do not require bulky servo tracking systems. They can change the radiation pattern by adjusting the excitation phases of the antenna elements in the array, thus bringing advantages such as flexible beam control and high reliability [6,7]. Higher power efficiency is required to support high-data-rate operation, but driving the power amplifiers (PAs) close to saturation introduces severe nonlinear distortion. Nonlinear distortion leads to non-negligible problems, such as interference to other satellite and terrestrial communication systems due to the increased adjacent channel power ratio (ACPR), and deterioration of the error vector magnitude (EVM), which leads to higher bit error rates.
To address this problem, power backoff and digital pre-distortion (DPD) [8–14] techniques are often adopted, of which DPD has drawn widespread attention because it better preserves power efficiency and thus makes full use of the space-borne power resources. As shown in Figure 1a,b, taking a 4-by-4 array antenna as an example, DPD techniques can be further divided into two categories: one is individual DPD [8,9,11], and the other is overall DPD. For large-scale APA linearization, approaches that apply an overall DPD to the PAs of all array cells at the transmitter give more generic and effective solutions. The fundamental coefficients of the single-input single-output (SISO) array model include beamforming weights, antenna cross-coupling, channel coefficients, etc. As a result, many sets of trained DPD coefficients are sufficient to reduce the nonlinear distortion of the mmWave radio frequency beamforming array [12]. Ref. [13] assumed that the PAs are similar in all branches; the main beam is generated by collecting the output signal of one PA, and the memory polynomial (MP) model is used to calculate the overall DPD from the main beam and the input of the sub-array. Ref. [14] implements full-angle DPD for 5G mmWave large-scale MIMO transmitters by first compensating for the differences between the PAs in different transmitter chains and then linearizing the entire subarray using a common digital block. Unfortunately, since the DPD is performed in the digital domain before the front-end, these approaches necessitate an RF feedback loop, and the resulting hardware resource consumption is almost equivalent to that of a space-borne receiver, which is unacceptably high for LEO satellite communications. Therefore, a joint satellite-transmitter and ground-receiver digital pre-distortion (JSG-DPD) technique, which reuses the ground-receiver as the "feedback loop", is proposed in this paper to achieve linearization of the space-borne APA.
It is experimentally demonstrated that this technique can dramatically save space-borne resources and can achieve a promising 7.5% EVM improvement. To the best of our knowledge, this paper is the first attempt to join the receiver and transmitter for DPD in order to obtain desirable results with less cost. With the future deployment of a large number of LEO satellite systems, the search for effective high-power-efficient satellite communication techniques becomes increasingly urgent.
The rest of the paper is organized as follows: Section 2 presents the main principles of the proposed technique. Section 3 explains the algorithm for calculating the DPD coefficients of the GMP model. Section 4 shows the experimental results and analysis. Conclusions are given in Section 5.

Principle Analysis of the Proposed JSG-DPD Technique
In this section, we describe the structure of the proposed JSG-DPD technique and present the adopted signal standard, the channel model, and the solution for operating at a low SNR.

The Proposed JSG-DPD Technique
The proposed system architecture depicted in Figure 2 is mainly based on the overall DPD structure of Figure 1b. However, the space-borne feedback loop is eliminated and replaced by a noise reduction and DPD coefficient extraction system at the ground station to save satellite resources. The ground station, in turn, faces the challenge of a low signal-to-noise ratio (SNR) resulting from high AWGN. To address this concern, the superimposed training sequences (STS) method is proposed. The nonlinearly compressed STS signals pass through the AWGN channel between the satellite and the ground and are captured by the receiving antenna of the ground station. Noise reduction is performed by exploiting the uncorrelated nature of the noise, as explained in Section 2.4. The DPD coefficients of the selected model are then calculated with a suitable adaptive algorithm and uploaded to the baseband front-end of the LEO satellite using the satellite-ground bi-directional link. Hence, this joint satellite-ground DPD (JSG-DPD) structure does not occupy additional space-borne resources. It can greatly improve the linearity and the power efficiency of the APA, enabling the application of DPD techniques to LEO satellite communications.

OFDM Technique
Conventional digital modulation schemes, such as QPSK and 16QAM, are performed on a single carrier. These single-carrier modulation methods are susceptible to inter-symbol interference (ISI), which deteriorates the BER. They are also subject to Rayleigh fading in a multipath propagation environment, which can cause burst errors. In contrast, orthogonal frequency division multiplexing (OFDM) divides the channel into several orthogonal subchannels in the frequency domain and modulates one subcarrier on each subchannel, with all subcarriers transmitted in parallel. In addition, a cyclic prefix is usually inserted between OFDM symbols as a guard interval. Therefore, the OFDM system offers excellent anti-fading and anti-ISI capabilities and high spectrum efficiency, and is applicable to high-speed digital communication. The mathematical expression for the continuous OFDM baseband signal is
$$x(t) = \sum_{u} \sum_{m=0}^{M-1} X_{u,m}\, g(t - uT_s)\, e^{j 2\pi m t / T_s},$$
where $M$ represents the number of subcarriers, $X_{u,m}$ denotes the information carried on the $m$-th subcarrier in the $u$-th OFDM symbol, $T_s$ is the duration of an OFDM symbol, and $g(t)$ is the window function. The discrete OFDM baseband signal is derived by sampling the continuous OFDM baseband signal at the Nyquist sampling rate, $t = (w + uM)T_s/M$, and is denoted, for a rectangular unit-amplitude window, as
$$x(w + uM) = \sum_{m=0}^{M-1} X_{u,m}\, e^{j 2\pi m w / M},$$
where $w = 0, 1, \ldots, M-1$. In order to enhance the LEO satellite-to-ground data transmission rate, this paper adopts such an OFDM signal with a high peak-to-average power ratio (PAPR) of 11.69 dB. Because an OFDM symbol is the sum of several independently modulated subcarriers, the subcarriers add constructively whenever their phases are the same or close, producing a large instantaneous power peak and thus a high PAPR. Because the linear operating range of a typical PA is limited, OFDM signals with a large PAPR are likely to enter the nonlinear region of the PA, causing serious degradation of the overall system performance.
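The link between subcarrier phase alignment and PAPR can be illustrated with a minimal sketch. It assumes the standard discrete OFDM form (an unnormalized inverse DFT over one symbol, rectangular window); the subcarrier count `M = 64` and the two test symbol patterns are hypothetical choices for illustration, not the signal used in this paper.

```python
import cmath
import math

def ofdm_symbol(X):
    """One discrete OFDM symbol: x(w) = sum_m X[m] * exp(j*2*pi*m*w/M), w = 0..M-1."""
    M = len(X)
    return [sum(X[m] * cmath.exp(2j * math.pi * m * w / M) for m in range(M))
            for w in range(M)]

def papr_db(x):
    """Peak-to-average power ratio of a sample sequence, in dB."""
    powers = [abs(v) ** 2 for v in x]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))

M = 64  # hypothetical number of subcarriers

# Worst case: every subcarrier carries the same phase, so all add up at w = 0.
aligned = ofdm_symbol([1 + 0j] * M)

# Quadratic-phase (chirp-like) symbols spread the subcarrier phases apart.
spread = ofdm_symbol([cmath.exp(1j * math.pi * m * m / M) for m in range(M)])

papr_aligned = papr_db(aligned)  # equals 10*log10(M) for fully aligned phases
papr_spread = papr_db(spread)    # far lower, since the peaks no longer coincide
```

With aligned phases the PAPR reaches its upper bound of 10 lg M, which is the mechanism the paragraph above describes; spreading the phases removes the coincident peak.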

LEO Satellite-Ground Channel
As for LEO satellite communications, it is commonly necessary to consider the Doppler effect [15] and multipath effect [16]. The multipath effect is the propagation phenomenon of radio frequency (RF) signals arriving at the receiving antenna through various paths from the transmitting antenna. The electromagnetic waves of each path arrive at the receiver at different times, and thus at different phases. The superposition of multiple signals with different phases at the receiver results in a change in the amplitude of the received signal, which produces the so-called multipath fading.
When the received signal consists of only multipath components, the probability density function (PDF) of the signal amplitude obeys the Rayleigh distribution
$$f(r) = \frac{r}{\sigma^2} \exp\left(-\frac{r^2}{2\sigma^2}\right), \quad r \ge 0,$$
where $r$ and $\sigma^2$ denote the received signal amplitude and the average multipath power, respectively. When the received signal is composed of a line-of-sight (LOS) signal and multipath signals, the PDF obeys the Rice distribution
$$f(r) = \frac{r}{\sigma^2} \exp\left(-\frac{r^2 + z^2}{2\sigma^2}\right) I_0\left(\frac{rz}{\sigma^2}\right), \quad r \ge 0,$$
where $r$ indicates the received signal amplitude, $z$ represents the LOS signal amplitude, $\sigma^2$ is the average multipath power [17], and $I_0(\cdot)$ is the zeroth-order modified Bessel function of the first kind. Generally, we assume that ground stations are built in non-urban open areas, so the effect of reflections caused by buildings is ignored. Due to different atmospheric densities, scattering is typically generated in the troposphere at an altitude of about 10-20 km above the ground. For LEO satellites at altitudes of several hundred km, such a link length is only 2-3% of the total link length, so the resulting multipath effect is insignificant. In addition, the inter-symbol interference (ISI) from multipath can be largely eliminated by inserting a guard interval between OFDM symbols, where the guard interval is longer than the maximum multipath delay spread of the satellite communication channel. If the guard interval is filled with the cyclic prefix (CP), the interchannel interference (ICI) from multipath can also be avoided. This is the commonly used means of OFDM signal transmission. Therefore, the impact of the multipath effect is ignored in this paper.
When there is relative motion between the satellite and the ground station, the carrier frequency received by the ground station is shifted, which is the Doppler shift phenomenon. It is harmful to digital communication using correlation demodulation and thus cannot be ignored. As depicted in Figure 3, the actual velocity of the satellite is $v$, and the relative speed of the satellite to the ground station is the component $v_r$ of $v$ along the line of sight between them, which varies dynamically.
The Doppler shift is defined as
$$\Delta f_c = \frac{v_r}{c} f_c,$$
where $f_c$ denotes the carrier frequency, $\Delta f_c$ represents the carrier frequency offset, i.e., the Doppler frequency offset, $v_r$ is the relative speed between the satellite and the ground station, and $c = 3 \times 10^8$ m/s is the speed of light. According to the standard 401.0-B of the Consultative Committee for Space Data Systems (CCSDS) [18], the maximum relative motion speed between a LEO satellite and a ground station is 10 km/s, and the maximum speed change rate is 380 m/s². The maximum Doppler frequency offset of the LEO satellite-ground data transmission system based on Ka-band (28 GHz) is therefore ±930 kHz, and the maximum Doppler frequency offset change rate is 35 kHz/s. The time-varying Doppler effect thus makes it difficult for the receiver to accurately track the carrier of the signal, which increases the design difficulty of the receiver. In addition, the Doppler effect causes a spreading/compression of the signal bandwidth. For the Ka-band 100 MHz signal applied in this paper, the maximum spreading of the bandwidth is 3.3 kHz. Since the time-varying Doppler effect has been solved by mature techniques [19–22], it is not the focus of our study. Consequently, the LEO satellite-ground transmission channel is modeled as an AWGN channel in this paper. The amplitude of white Gaussian noise follows a Gaussian distribution, while its power spectral density (PSD) is uniform across frequency.
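The Doppler figures quoted above can be checked with a few lines of arithmetic. This is a minimal sketch assuming the usual first-order relation (frequency offset equals frequency times relative speed divided by the speed of light); the function and variable names are hypothetical.

```python
C = 3e8  # speed of light in m/s, as used in the text

def doppler_offset_hz(frequency_hz, relative_speed_m_s):
    """First-order Doppler offset: frequency * v / c."""
    return frequency_hz * relative_speed_m_s / C

F_CARRIER = 28e9   # Ka-band carrier frequency, Hz
V_MAX = 10e3       # maximum relative speed (CCSDS 401.0-B), m/s
A_MAX = 380.0      # maximum rate of change of the relative speed, m/s^2
BANDWIDTH = 100e6  # signal bandwidth, Hz

max_offset_hz = doppler_offset_hz(F_CARRIER, V_MAX)      # about 930 kHz
max_rate_hz_s = doppler_offset_hz(F_CARRIER, A_MAX)      # about 35 kHz/s
max_bw_spread_hz = doppler_offset_hz(BANDWIDTH, V_MAX)   # about 3.3 kHz
```

The offset rate uses the same relation with the speed change rate in place of the speed, and the bandwidth spreading applies it to the 100 MHz signal bandwidth instead of the carrier.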

Noise Reduction Method
The transmission frame structure of the proposed superimposed training sequences method is illustrated in Figure 4, which mainly consists of three parts: frame header, data block, and pilot signal. Each received copy of the training sequence can be expressed as
$$x(n) = s(n) + u(n),$$
where $s(n)$ and $u(n)$ denote the discrete OFDM signal and the noise signal, respectively. After superimposing $H$ received copies $x_h(n) = s(n) + u_h(n)$, we obtain
$$\sum_{h=1}^{H} x_h(n) = H\,s(n) + \sum_{h=1}^{H} u_h(n).$$
Since AWGN samples at any two distinct moments are not only uncorrelated but also statistically independent, the noise terms add incoherently while the signal part is amplified $H$ times. Dividing both sides of the above equation by $H$ yields the average
$$\bar{x}(n) = s(n) + U, \quad U = \frac{1}{H}\sum_{h=1}^{H} u_h(n).$$
When the number $H$ of superpositions is large enough, theoretically $U \approx 0$ and $\bar{x}(n) \approx s(n)$. With this method, the useful signal is coherent and accumulates, while the noise is uncorrelated and does not. More precisely, the accumulated signal power grows as $H^2$ whereas the accumulated noise power grows only as $H$, so the SNR of the received signal improves by a factor of $H$ (that is, by $10 \lg H$ dB), which is equivalent to noise reduction. Moreover, the frame length and the propagation time occupied by the pilot signal consisting of the superimposed training sequences are acceptable within the several minutes of an LEO satellite data transmission.
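The noise-averaging effect can be reproduced numerically. The sketch below is a simulation under stated assumptions, not the authors' processing chain: a unit-power random-phase sequence stands in for the training sequence, the channel is ideal AWGN at 10 dB SNR, and the frames are assumed to be perfectly synchronized. All names and sizes are hypothetical.

```python
import cmath
import math
import random

def average_frames(frames):
    """Sample-wise average of H equally long received frames."""
    H = len(frames)
    return [sum(samples) / H for samples in zip(*frames)]

rng = random.Random(0)          # fixed seed so the run is repeatable
N, H, SNR_DB = 500, 100, 10.0   # hypothetical frame length, copies, input SNR

# Unit-power training sequence (random phases), repeated identically H times.
s = [cmath.exp(2j * math.pi * rng.random()) for _ in range(N)]

# Complex AWGN whose total power sets the per-frame SNR to SNR_DB.
noise_power = 10 ** (-SNR_DB / 10)
sigma = math.sqrt(noise_power / 2)  # standard deviation per real dimension
frames = [[v + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for v in s]
          for _ in range(H)]

x_bar = average_frames(frames)

# Residual noise power after averaging, and the resulting SNR in dB.
residual_power = sum(abs(a - b) ** 2 for a, b in zip(x_bar, s)) / N
snr_after_db = 10 * math.log10(1.0 / residual_power)
expected_gain_db = 10 * math.log10(H)
```

Under these assumptions the SNR after averaging should sit close to the input SNR plus 10 lg H, that is, roughly 20 dB of gain for 100 copies and 30 dB for 1000.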

DPD Coefficients Recognition Algorithm
In this section, we present an indirect learning structure based on the proposed JSG-DPD technique, focusing on the principle of computing the DPD coefficients of the GMP model using the least squares (LS) algorithm.

The JSG-DPD Indirect Learning Structure
In [28], the indirect learning architecture applied to PA linearization was proposed for the first time. As shown in Figure 5, in the field of satellite communications, the feedback loop for coefficient calculation is almost equivalent to a receiver that would have to be included in the satellite-transmitter. The required cost is therefore unacceptable for satellites, and it is for this reason that dynamic DPD techniques are not widely used in satellite communications. Ref. [29] provides a fifth-order static DPD, which means that only the static nonlinearity of the PA can be handled, while the memory effect is ignored. Such a static DPD method would not improve the performance much, especially for broadband signals. The indirect learning structure of the proposed JSG-DPD technique is illustrated in Figure 6, in which the pre-distorter does not operate until the DPD coefficients are identified. During this phase, the input signal of the APA is x(n) = z(n), and z(n) is taken as the expected output signal of the DPD coefficients estimation module. The signal ŷ(n), obtained from the received signal ỹ(n) after noise reduction and normalization by G, is taken as the input of the DPD coefficients estimation module, where G is the gain of the pre-PA and linear APA. The signal ŷ(n) is passed through the DPD coefficients estimation module (i.e., post-distortion) to obtain the output signal ẑ(n), which is subtracted from the desired output signal z(n) to obtain the error signal e(n). The adaptive identification algorithm obtains the coefficients w of the optimal DPD coefficients estimation module by iterating on the error signal. Finally, the coefficients w are directly copied to the pre-distorter to make it operate. It is worth pointing out that for the general Volterra system, the P-order post-inverse is the same as the P-order pre-inverse [30].
Thus, for our application, the post-distortion response is the same as the pre-distortion response, and that is why the coefficients can be copied directly to the pre-distorter before the APA. In addition, the Volterra series are generalized nonlinear models with memory, and DPD of Volterra models is usually implemented using P-order inverse techniques [31]. Notably, another communication link is not required because the pilot signal z(n) is only a fixed segment of the signal known to the ground station for the purpose of calculating the DPD coefficients. Therefore, the proposed JSG-DPD indirect learning structure does not occupy additional space-borne resources.

DPD Data
The calculation of the DPD coefficients is very fast after the ground station receives the pilot signal, and this calculation time is even negligible because the computing resources of the ground station are very powerful. Instead, the time to be concerned is the time to upload the DPD coefficients, which depends on the communication rate of the uplink. For this paper, the number of coefficients of the generalized memory polynomial model obtained is several tens, which can be uploaded in a short time (ms level).
Therefore, both the DPD coefficient calculation time and the DPD coefficient upload time are very short, so the satellite does not radiate large amounts of spurious emissions toward other satellites, which could affect their communication quality, for any significant period.
The only drawback is that, while waiting for the coefficients to be uploaded, the received signal quality at the ground station will degrade because the space-borne DPD coefficients have not been updated yet. After the coefficients are uploaded and the DPD coefficients are updated, the received signal quality at the ground station will be greatly improved. A dedicated mode can be set to avoid this temporary degradation of communication quality: when the received signal quality degrades, only the pilot signal used for updating the DPD coefficients is transmitted, and normal scientific data are transmitted after the DPD coefficients have been updated. Hence, although it takes time to upload the DPD coefficients, the upload is not required for each transit, and the number of uploaded coefficients is extremely small.

Generalized Memory Polynomial (GMP) Model
For narrowband signals, where the memory effect is not obvious, the APA can usually be modeled using polynomial models, Saleh models [32], etc. For wideband signals, the influence of the memory effect cannot be ignored: the output signal of the APA is related not only to the current input signal but also to past input signals, so it is necessary to use a nonlinear model with memory for wideband APA modeling. Memory models mainly include Volterra series models [33] and their simplified forms, such as the memory polynomial (MP) [28], generalized memory polynomial (GMP) [34], Wiener model [35], Hammerstein model [36], etc. Recently, neural network models [37] have also been investigated for modeling nonlinear systems.
Usually, the MP model and the Volterra series are used to model narrowband memory APAs and wideband memory APAs, respectively. However, the Volterra series model has too many coefficients, whereas the MP model is oversimplified. As a compromise, the GMP model achieves better modeling results than the MP model by considering the generalized memory effect, and its number of coefficients is acceptable for LEO satellite communication.
The significance of the APA behavior model is that it can be used to inverse-model the APA and thus compensate for the signal distortion caused by the APA nonlinearity. In the present application scenario, with broadband OFDM signals and array antennas for LEO satellite communication, it is necessary to trade off computational complexity against algorithm performance. The MP model is a simplified form of Volterra for narrowband signals as well as unit antenna systems with few model coefficients. The MP model with nonlinear order $K$ and memory depth $Q$ can be written as
$$y_{MP}(n) = \sum_{k=0}^{K-1} \sum_{q=0}^{Q-1} c_{kq}\, x(n-q)\, |x(n-q)|^{k},$$
where $x(n)$ and $y_{MP}(n)$ are the input and output signals of the model, and $c_{kq}$ represents the coefficients of the model. The total number of coefficients is $O_{MP} = K \cdot Q$. Its performance is limited because it contains only the diagonal terms of Volterra, whereas in some application scenarios non-diagonal terms can significantly impact the model accuracy. Although the neural network model is very flexible and adaptable, its large number of coefficients requires a large amount of space-borne computing resources, so it cannot be deployed on satellites so far. The GMP model performs better than the MP model for strongly nonlinear dynamical systems because it additionally accounts for generalized memory effects. Therefore, the GMP model is adopted as the model of the pre-distorter in this paper:
$$y_{GMP}(n) = \sum_{k=0}^{K_a-1} \sum_{l=0}^{L_a-1} a_{kl}\, x(n-l)\, |x(n-l)|^{k} + \sum_{k=1}^{K_b} \sum_{l=0}^{L_b-1} \sum_{m=1}^{M_b} b_{klm}\, x(n-l)\, |x(n-l-m)|^{k} + \sum_{k=1}^{K_c} \sum_{l=0}^{L_c-1} \sum_{m=1}^{M_c} c_{klm}\, x(n-l)\, |x(n-l+m)|^{k}, \quad (13)$$
where $x(n)$ and $y_{GMP}(n)$ are the input and output signals of the model, and $a_{kl}$, $b_{klm}$, and $c_{klm}$ represent the model coefficients of the MP sub-model, the lagging term, and the leading term, respectively. $K_a$, $K_b$, and $K_c$ stand for the nonlinear order of each branch term, $L_a$, $L_b$, and $L_c$ denote the memory depth of each branch term, and $M_b$, $M_c$ are the additional degrees of freedom, which determine the maximum selectable range of memory depths for the lagging and leading terms, respectively. The total number of coefficients is
$$O_{GMP} = K_a L_a + K_b L_b M_b + K_c L_c M_c.$$
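To make the model sizes concrete, the following sketch evaluates an MP model and counts MP and GMP coefficients. It assumes the commonly used forms in which the MP basis is x(n−q)|x(n−q)|^k with k starting at 0, and in which the GMP has K_a·L_a aligned terms, K_b·L_b·M_b lagging terms, and K_c·L_c·M_c leading terms; these indexing conventions and all function names are assumptions for illustration. The parameter values are the ones reported in the experimental section of this paper.

```python
def mp_output(x, c):
    """Memory polynomial output y(n) = sum_k sum_q c[k][q] * x(n-q) * |x(n-q)|**k.

    c is a K-by-Q nested list; samples before n = 0 are treated as zero.
    """
    K, Q = len(c), len(c[0])
    y = []
    for n in range(len(x)):
        acc = 0j
        for k in range(K):
            for q in range(Q):
                if n - q >= 0:
                    delayed = x[n - q]
                    acc += c[k][q] * delayed * abs(delayed) ** k
        y.append(acc)
    return y

def mp_coeff_count(K, Q):
    """Number of MP coefficients for nonlinear order K and memory depth Q."""
    return K * Q

def gmp_coeff_count(Ka, La, Kb, Lb, Mb, Kc, Lc, Mc):
    """Number of GMP coefficients: aligned + lagging + leading terms."""
    return Ka * La + Kb * Lb * Mb + Kc * Lc * Mc

# Parameters reported later in the paper: (Ka, La, Kb, Lb, Kc, Lc, Mb, Mc) = 3, 5, 2, 3, 2, 3, 2, 2.
n_gmp = gmp_coeff_count(Ka=3, La=5, Kb=2, Lb=3, Mb=2, Kc=2, Lc=3, Mc=2)
n_mp = mp_coeff_count(K=3, Q=5)

# A one-sample pure delay expressed as an MP model with K = 1, Q = 2.
delayed = mp_output([1 + 1j, 2 + 0j, 0 - 1j], [[0, 1]])
```

Under these assumptions the GMP configuration needs 39 complex coefficients against 15 for the MP comparison model, which is consistent with the "several tens" of coefficients mentioned for the upload.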

Coefficient Estimation Algorithm
Classical adaptive estimation algorithms include the LS algorithm, the least mean square (LMS) algorithm, and the recursive least squares (RLS) algorithm. These methods are also widely applied in adaptive equalization [38,39]. The LMS algorithm has the advantages of no matrix inversion, simple structure, and easy implementation. Still, it is unsuitable for DPD application scenarios since it is prone to poorly conditioned covariance matrices, resulting in extremely slow convergence. The RLS algorithm is more computationally intensive and may also have poorly conditioned covariance matrices leading to unstable convergence. Therefore, in this paper, we employ the batch processing LS algorithm to estimate model coefficients.
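The core idea of the LS algorithm is to find the function that best fits the data by minimizing the sum of the squared errors. By replacing $x(n-l)$ and $y_{GMP}(n)$ of Equation (13) with $\hat{y}(n-l)$ and $z(n)$, respectively, we can write the equation in matrix form as
$$\mathbf{z} = \hat{\mathbf{Y}}\mathbf{c}, \quad (14)$$
where $\mathbf{z} = [z(0), z(1), \ldots, z(N-1)]^T$ collects $N$ samples of the desired output, $\mathbf{c}$ is the coefficient vector, and each column of $\hat{\mathbf{Y}}$ contains one GMP basis term, $\hat{y}(n-l)|\hat{y}(n-l)|^{k}$, $\hat{y}(n-l)|\hat{y}(n-l-m)|^{k}$, or $\hat{y}(n-l)|\hat{y}(n-l+m)|^{k}$, evaluated over the same $N$ samples, where $K_a$, $L_a$, $K_b$, $L_b$, $M_b$, $K_c$, $L_c$, $M_c$ are the parameters of the GMP model. The LS solution of Equation (14) is
$$\hat{\mathbf{c}} = (\hat{\mathbf{Y}}^H \hat{\mathbf{Y}})^{-1} \hat{\mathbf{Y}}^H \mathbf{z},$$
where $(\cdot)^H$ denotes the complex conjugate transpose of the matrix, and $\hat{\mathbf{c}}$ is the vector of estimated GMP coefficients (including $\{a_{kl}, b_{klm}, c_{klm}\}$). With such an approach, the APA model coefficients and the DPD coefficients of the GMP model can be obtained from the corresponding input and output data.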
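The indirect learning step with a batch LS fit can be sketched end to end as follows. Several parts are assumptions made only for illustration: `toy_apa` is a hypothetical nonlinearity with memory standing in for the gain-normalized APA, the observation is taken as noise-free (as if after ideal STS averaging), delays are implemented circularly with `np.roll`, and the GMP basis follows the commonly used aligned/lagging/leading form. `np.linalg.lstsq` is used in place of the explicit normal equations because it is numerically better behaved.

```python
import numpy as np

def gmp_basis(x, Ka, La, Kb, Lb, Mb, Kc, Lc, Mc):
    """Regression matrix whose columns are the GMP basis terms of x (circular delays)."""
    cols = []
    for k in range(Ka):                      # aligned (MP) terms
        for l in range(La):
            xl = np.roll(x, l)               # x(n - l)
            cols.append(xl * np.abs(xl) ** k)
    for k in range(1, Kb + 1):               # lagging envelope terms
        for l in range(Lb):
            for m in range(1, Mb + 1):
                cols.append(np.roll(x, l) * np.abs(np.roll(x, l + m)) ** k)
    for k in range(1, Kc + 1):               # leading envelope terms
        for l in range(Lc):
            for m in range(1, Mc + 1):
                cols.append(np.roll(x, l) * np.abs(np.roll(x, l - m)) ** k)
    return np.stack(cols, axis=1)

def nmse_db(ref, est):
    """Normalized mean square error of est against ref, in dB."""
    return 10 * np.log10(np.sum(np.abs(ref - est) ** 2) / np.sum(np.abs(ref) ** 2))

def toy_apa(x):
    """Hypothetical compressive nonlinearity with one-tap memory (not the real APA)."""
    return x - 0.2 * x * np.abs(x) ** 2 + 0.05 * np.roll(x, 1)

rng = np.random.default_rng(0)
z = 0.2 * (rng.standard_normal(4000) + 1j * rng.standard_normal(4000))  # pilot
params = dict(Ka=3, La=5, Kb=2, Lb=3, Mb=2, Kc=2, Lc=3, Mc=2)

y_hat = toy_apa(z)                               # observed APA output
Y = gmp_basis(y_hat, **params)                   # basis built from the output
c_hat, *_ = np.linalg.lstsq(Y, z, rcond=None)    # LS post-inverse: Y c ~ z

x_dpd = gmp_basis(z, **params) @ c_hat           # coefficients copied to the pre-distorter
nmse_raw = nmse_db(z, toy_apa(z))                # distortion without DPD
nmse_dpd = nmse_db(z, toy_apa(x_dpd))            # distortion with DPD
```

Copying the post-inverse coefficients into the pre-distorter mirrors the structure described in this section; in this toy setting the cascade should be noticeably closer to linear than the uncompensated model.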
In addition, it is worth noting that the GMP model has as many as eight parameters, while the MP model has only two. Parameter selection is usually performed scan-by-scan to obtain a set of parameters to achieve a trade-off between the modeling error and model size. This approach is appropriate for the MP model, but for the GMP model with eight parameters, it takes a lot of combinations and time to obtain the proper set of parameters. There have been studies, such as the simulated annealing (SA) method, the particle swarm optimization with the Akaike information criterion (PSO-AIC) method [40], to develop fast and efficient strategies to determine the optimal parameters, allowing a suitable balance of performance and computing complexity.

Experimental Validation and Results Analysis
In this section, the experimental system is presented, followed by the determination of the training sequence length and the analysis of the JSG-DPD results for two power levels.

Experimental Setup
The block diagram of the over-the-air (OTA) measurement setup and a photo of the actual laboratory equipment are presented in Figures 7 and 8, respectively. The 5G new radio (NR) intermediate frequency (IF) signal with a bandwidth of 100 MHz centered at 3 GHz is generated by a vector signal generator (VSG) (R&S SMBV100B). The Agilent E3247C outputs an unmodulated signal at 12.5 GHz, which is multiplied to 25 GHz by a MITEQ-MAX2M200400 frequency multiplier and then sent to a power divider as the local oscillator (LO) signal for the up-converter (KTX321840) and down-converter (KRX321840). The 28 GHz bandpass-filtered signal is linearly amplified by a pre-amplifier (Ducommum APH-26063325), which operates away from its saturation region, to drive the APA into compression. The APA device is based on the Amotech AAiPK428GC-A0404, which consists of four Anokiwave AWMF-0158 transceivers and incorporates 16 branches of attenuators, phase shifters, and PAs in a 4-by-4 APA, as well as 16 patch antennas. The main beam signal is captured by an observation horn antenna. After down-conversion to IF, it is collected and converted to a baseband signal by an R&S FSW spectrum analyzer. Lastly, the baseband signal is processed in MATLAB.

Experimental Results and Analysis
Generally, to obtain higher drain efficiency, the PA is required to operate at 2-3 dB gain compression; thus, we choose two input power levels of −24 dBm and −25 dBm from the VSG for testing. The nonlinear characteristics at these operating points are pronounced and significantly degrade the quality of the transmitted signal when no DPD technique is applied. For the JSG-DPD technique proposed in this paper, it is necessary to trade off model performance against frame length occupation when selecting a suitable training sequence length to model and linearize the APA. Therefore, we first conduct experiments in the noiseless scenario to determine the length of the training sequences by observing the improvement of ACPR and EVM. The ACPR is expressed as
$$\mathrm{ACPR\,(dB)} = 10 \lg \frac{P_{adj\_n\_L}(w) + P_{adj\_n\_R}(w)}{P_{ch\_main}(w)},$$
where $P_{adj\_n\_L}(w)$ and $P_{adj\_n\_R}(w)$ are the signal powers of the left channel and the right channel adjacent to the main channel, respectively, and $P_{ch\_main}(w)$ is the power of the signal in the main channel (transmit channel). The EVM is expressed as
$$\mathrm{EVM\,(\%)} = \sqrt{\frac{\sum_{n=1}^{N} e_n}{\sum_{n=1}^{N} \left(I_n^2 + Q_n^2\right)}} \times 100\%,$$
where the vector error signal is $e_n = (\tilde{I}_n - I_n)^2 + (\tilde{Q}_n - Q_n)^2$, $N$ is the number of samples, $I_n$ and $Q_n$ are the quadrature components of the expected signal, and $\tilde{I}_n$ and $\tilde{Q}_n$ are the quadrature components of the actually measured signal. Since this part of the test is performed at the ground station (before the satellite launch), prior knowledge about the APA and an almost noise-free experimental scenario can be obtained. The experimental results are shown in Figure 9. For these two input power levels, the improvement of both ACPR and EVM is optimal for a training sequence length of 40,000. Although the EVM improvement may be better after the training length exceeds 100,000, an extra-long frame is needed to obtain only a slight improvement. Especially after hundreds or thousands of superpositions, the slight performance improvement is negligible in comparison to the cost in frame length. Hence, the training sequence length is chosen to be 40,000 after a trade-off. In order to verify the effect of AWGN on the transmitted signal without loss of generality, an SNR of 10 dB is assumed for each satellite transit to establish communication with the ground station.
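The two figures of merit can be computed as in the minimal sketch below. It assumes an RMS-normalized EVM (error power over reference power) and an ACPR defined as the summed adjacent-channel power over the main-channel power; both normalizations, like the function names, are assumptions made for illustration, and channel powers are taken as already integrated from the spectrum.

```python
import math

def evm_percent(reference, measured):
    """RMS EVM in percent between complex reference and measured samples."""
    error_power = sum(abs(m - r) ** 2 for r, m in zip(reference, measured))
    reference_power = sum(abs(r) ** 2 for r in reference)
    return 100 * math.sqrt(error_power / reference_power)

def acpr_db(p_adj_left, p_adj_right, p_main):
    """ACPR in dB from linear channel powers (watts)."""
    return 10 * math.log10((p_adj_left + p_adj_right) / p_main)

# Hypothetical QPSK reference and a measurement with 10% gain error.
ref = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
meas = [1.1 * r for r in ref]

evm = evm_percent(ref, meas)        # a pure 10% gain error gives 10% EVM
acpr = acpr_db(0.5e-3, 0.5e-3, 1.0) # adjacent power 30 dB below the main channel
```

An improvement quoted in percent for EVM or in dB for ACPR is then simply the difference of these quantities with and without DPD.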
In actual communication scenarios, as shown in Figure 10, when the angle between the satellite and the ground station is 5 degrees from the horizontal, the communication is established, and the communication distance is the longest and the SNR is the smallest. It is at this moment that our proposed method starts to operate. The relative position between transceivers will change, which will lead to fluctuations in SNR. The SNR variation throughout the communication window may exceed 10 dB [41], but our proposed method is only used when communication is just established. The SNR fluctuation range is approximately 10 dB to 10.5 dB, which is not a large enough fluctuation to affect the experimental results. Therefore, for the convenience of the calculation, we assume a constant SNR of 10 dB.

[Figure 10: orbit, satellite, and ground station geometry with 5° elevation limits, showing the communication window and the window occupied by the proposed method.]
The unprocessed received signal at this point is shown in Figure 11a, and the critical sidelobe information of the APA nonlinear system is completely drowned in the noise. After 10, 100, and 1000 superimposed training sequences, the sidelobe information is gradually revealed. As the time-domain results in Figure 11b show, the noise-reduced signal and the original signal almost overlap. Hence, the nonlinear characteristics of the APA system can be extracted.
According to the DPD structure introduced in the previous sections, the GMP model is applied to model the APA system and generate the predistorted signal. Figure 12 shows the AM-AM and AM-PM curves of this APA system. The predistorted signal forms an inverse relationship with the original APA output signal, which makes the cascade of the pre-distorter and the APA approximately linear. The time-domain signal comparison in Figure 13 reveals that the most effective strategy of the DPD technique is to pre-amplify the signal at high amplitudes so that, after the nonlinear APA gain compression, the same linear gain is achieved as at low amplitudes. The GMP parameters are selected empirically, through several attempts, to minimize the normalized mean square error (NMSE)
$$\mathrm{NMSE\,(dB)} = 10 \lg \frac{\sum_{n} |y(n) - \bar{y}(n)|^2}{\sum_{n} |y(n)|^2},$$
where $y(n)$ is the actual output of the APA, and $\bar{y}(n)$ is the output of the APA obtained by model fitting. The parameters thus selected for the GMP model are 3, 5, 2, 3, 2, 3, 2, 2, which represent the nonlinear order and memory depth of the MP sub-model, lagging envelope term, and leading envelope term, as well as the additional degrees of freedom of the lagging and leading terms. In this paper, experiments were also conducted on the MP model for comparison. For fairness, the nonlinear order and memory depth were chosen as three and five, respectively, which are consistent with the MP sub-model of the GMP model. As can be seen from the results in Figure 14 and Table 1, for the case of stronger nonlinearity (input power level: −24 dBm), the proposed JSG-DPD method with the GMP model improves the average of the upper and lower ACPR by nearly 4 dB for a training sequence superposition of 1000 or 100 times and, in particular, improves the EVM by more than 7.5%, which significantly enhances the quality of the transmitted signal. The ground station is more concerned about the improvement of EVM than that of ACPR.
When the number of training sequence superpositions is 10, there is also an improvement of nearly 6.5% in EVM, which is still promising. In contrast, without the STS method the EVM improves only slightly. In addition, for the MP model, the improvement in EVM is not as good as that of the GMP model, with a difference of about 0.2% to 0.4%. Nevertheless, it is noteworthy that the MP model without the STS method still improves the ACPR by 2.5 dB, while the ACPR with the GMP model without the STS method deteriorates to the extent that it seriously affects signal transmission in the adjacent channels. For the case of slightly weaker nonlinearity (input power level: −25 dBm), the proposed JSG-DPD method with the GMP model improves the EVM by 6.3% to 6.9% for training sequence superpositions of 1000, 100, and 10 times. For the MP model, the improvement of EVM is again not as good as that of the GMP model, with a difference of about 0.2-0.8%. Moreover, the MP model is still more robust than the GMP model in terms of ACPR improvement, which might be due to its small number of parameters and its insensitivity to low-SNR responses. In summary, the problem of low SNR due to the AWGN channel can be effectively solved after 1000 or 100 superpositions of the training sequences. For the 10 to 15 min of an LEO satellite transit, a pilot sequence of length 40,000 repeated 1000 or 100 times is acceptable, accounting for less than 1 min. More critically, it is not necessary to re-adjust the DPD coefficients for each transit, but only when the communication quality degrades significantly and the BER at the ground station crosses the preset threshold.
It is at this point that the DPD coefficients are re-adjusted based on the operational status of the spaceborne APA using the proposed JSG-DPD technique to restore the normal transmission state, allowing the linearity and power efficiency of the space-borne APA to be enhanced without taking any additional space-borne resources. Although the STS method used in this paper to improve SNR appears simple, it validates the feasibility of the proposed JSG-DPD technique. There is still a lot of room for further exploration and performance improvement, such as adopting deep learning approaches.
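The pilot overhead quoted above can be checked with simple arithmetic. The source does not state the baseband sample rate, so the value of 122.88 MS/s used below is purely a hypothetical assumption (a rate commonly paired with 100 MHz 5G NR carriers); only the sequence length and superposition counts come from the text.

```python
def pilot_duration_s(sequence_length, superpositions, sample_rate_hz):
    """Air time of the repeated training sequence, in seconds."""
    return sequence_length * superpositions / sample_rate_hz

SEQUENCE_LENGTH = 40_000     # training sequence length chosen in this paper
SAMPLE_RATE_HZ = 122.88e6    # hypothetical sample rate, not given in the source

duration_1000 = pilot_duration_s(SEQUENCE_LENGTH, 1000, SAMPLE_RATE_HZ)
duration_100 = pilot_duration_s(SEQUENCE_LENGTH, 100, SAMPLE_RATE_HZ)

# Lowest sample rate for which 1000 superpositions still fit within one minute.
min_rate_for_one_minute_hz = SEQUENCE_LENGTH * 1000 / 60.0
```

Under this assumed rate the 1000-fold pilot lasts a fraction of a second, and any sample rate above roughly 0.67 MS/s keeps it under the 1 min bound stated in the text.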

Conclusions
In this paper, a joint satellite-transmitter and ground-receiver (JSG) digital pre-distortion (DPD) technique (JSG-DPD) is proposed to linearize a low Earth orbit (LEO) space-borne active phased array (APA). The method extracts the DPD coefficients at the ground station, which saves a complete feedback loop at the satellite-transmitter side compared to conventional DPD techniques. The issue of sidelobe information being drowned in the additive white Gaussian noise (AWGN) is addressed by employing the superimposed training sequences (STS) method. The proposed method is experimentally verified to improve the adjacent channel power ratio (ACPR) by 4 dB and, in particular, the error vector magnitude (EVM) by 7.5%. In addition, the performance difference between the GMP model and the MP model is compared. The proposed technique provides a feasible means for the linearization of space-borne power amplifiers (PAs) or APAs. In the future, we will address the problem of accurate synchronization of training sequences and seek opportunities for in-orbit validation in actual LEO satellite communication projects.