Blind Turbo Equalization of Short CPM Bursts for UAV-Aided Internet of Things

With the surge of Internet of Things (IoT) applications using unmanned aerial vehicles (UAVs), there is a huge demand for an excellent complexity/power efficiency trade-off and channel fading resistance at the physical layer. In this paper, we consider the blind equalization of short-continuous-phase-modulated (CPM) burst for UAV-aided IoT. To solve the problems of the high complexity and poor convergence of short-burst CPM blind equalization, a novel turbo blind equalization algorithm is proposed based on establishing a new expectation–maximization Viterbi (EMV) algorithm and turbo scheme. Firstly, a low complexity blind equalization algorithm is obtained by applying the soft-output Lazy Viterbi algorithm within the EM algorithm iteration. Furthermore, a set of initializers that achieves a high global convergence probability is designed by the blind channel-acquisition (BCA) method. Meanwhile, a soft information iterative process is used to improve the system performance. Finally, the convergence, bit error rate, and real-time performance of iterative detection can be further improved effectively by using improved exchange methods of extrinsic information and the stopping criterion. The analysis and simulation results show that the proposed algorithm achieves a good blind equalization performance and low complexity.


Introduction
Driven by the explosive surge of Internet of Things (IoT) services for sixth-generation (6G) mobile communications systems, different new 6G use cases have been proposed and are under intensive research discussion recently, such as IoT industry automation, maritime machine-type communication networks, and other applications [1,2]. As one of the key technologies to achieve the vision of the Internet of Everything, UAVs have been widely used to perform diversified tasks [3][4][5] due to their low cost and flexible deployment.
There has been a recent surge of studies on the use of UAVs for IoT communication [6][7][8], such as data collection [9,10] and mobile edge computing [11]. However, with the dramatic increase in the number of connected machines, the number of IoT devices deployed worldwide is expected to grow to 75.4 billion by 2025 [12]. There is a growing demand for low complexity and high power efficiency in UAV-aided IoT communication due to the limited payload of the devices.
Continuous phase modulation (CPM) is suitable for power-and bandwidth-limited systems because of its good spectral efficiency and its higher power efficiency relative to linear modulations with comparable spectral efficiency. Moreover, the constant envelope property of CPM allows the nonlinear power amplifier (PA) to be operated at a high efficiency, which further increases the power efficiency of the system [13]. For batterypowered IoT nodes and UAVs, energy efficiency and cost are key factors because these devices are difficult to recharge or recycle once depleted. Therefore, CPM is one of the preferred modulation schemes in UAV-aided IoT communications systems due to its favorable low power consumption, which can greatly increase the life of terminal devices. However, CPM transmission over multipath fading channels is a challenging task due to the high computational complexity in the receiver. If the design of the waveform is poor, it will reduce the overall power of the communications systems, even offsetting the increased power efficiency achieved by the PA. Therefore, we focus our attention on the receiver design at the physical layer for CPM over frequency-selective channels employing low data rates and short bursty transmissions, which is a fundamental tool to implement UAV-aided IoT. In general, the main contributions of this paper include the following: • We combed the literature related to CPM and summarized it in the Table 1. • To meet the demands of low data rates and short-burst transmission scenarios of the UAV-aided IoT system, a short burst structure of CPM is designed in this paper, and a link-level simulation platform of the communications system is established on this basis. • A low complexity approach for soft-input soft-output (SISO) blind equalization is proposed to achieve a fast and accurate blind equalizer in the UAV-aided IoT system. The first step utilizes the soft-output Lazy Viterbi algorithm instead of the Viterbi algorithm to perform the expectation step and obtain a low complexity expectationmaximization Lazy Viterbi algorithm (EMLVA), while the second step applies the BCA method to establish a set of initializers, denoted as the BCA initializers, which achieves a high global convergence probability. • The blind turbo equalization for short-burst CPM is proposed based on the new SISO blind equalization with iterative detection, where the blind equalizer and decoder exchange extrinsic information in the form of log-likelihood ratios (LLRs). To further improve the convergence of iteration and reduce the average iteration number, the decision-aided (HDA) algorithm based on weighted extrinsic information exchange is proposed. • The blind turbo equalization based on EMLVA is proposed and evaluated on a linklevel simulation platform. Simulation results show that EMLVA can obtain a good trade-off between complexity and BER performance. When the HDA with weighted extrinsic information is applied, the convergence of iterative detection and real-time performance can be further improved.
The rest of this paper is organized as follows. Section 2 provides related work on the channel estimation and equalization of CPM transmission over frequency-selective channels. In Section 3, the burst structure, communications system, and channel model are designed and formulated for typical UAV-aided IoT communications scenarios. The new low-complexity blind equalization based on EMLVA, as well as the turbo scheme are introduced and described in Section 4. The performances of the proposed turbo blind equalizer are evaluated and analyzed based on the link-level simulation platform in Section 5. In Section 6, we discuss the application prospects of reflecting intelligent surfaces (RISs) in UAV-aided IoT communications systems. Finally, the paper is concluded in Section 7. Table 1. Summary of the related work.

Ref. Contribution/Methodology
Data-aided [14] A generalized pilot symbol-aided demodulation method is proposed in a flat fading channel. The optimal filters for channel estimation are also presented.
[15] The estimate of the channel is realized by local B-splines.
Statistics [16] The second-order statistics of the signal for channel estimation is extracted for CPM by TXK.
[17] A fourth-order cross-cumulant matrix is extracted by the eigenvector method. Adaptive equalization [19] A forward adaptive SISO that considers the channel correlation in only one direction is proposed for MSK.
[20] A variety of the reduced state FA SISO is proposed.
[21] The thresholds of the RS-A-SISO algorithms are obtained by the density evolution technique.
[23] Derivation of the generalized forward/backward adaptive algorithm.
HMM [24] The BBW algorithm, as well as two variants, are proposed for CPM.
[25] A stochastic ML blind channel estimation is developed, and an approximate Cramér-Rao bound for CPM is derived.
[26] The Viterbi algorithm is applied within the EM algorithm.
FDE [27] The single-carrier frequency-domain equalization is used in the CPM signal for the first time.
[28] Laurent decomposition is used to realize traditional equalization (linear and decision feedback) and turbo equalization in the frequency domain.

Related Work
In practical communications, the signals are transmitted over the fading channel and the channel response is unknown. In [14], a generalized pilot symbol-aided demodulation method based on the idea of inserting data-dependent symbols periodically was proposed for CPM in a flat Rayleigh fading channel. An optimal front-end filter was developed based on the mean-squared error (MSE) in the channel estimation process. Then, the channel estimates generated by the interpolation filter, together with the received signal, are input into a coherent CPM demodulator using the Viterbi algorithm. In frequency-flat fast-fading channels, Ref. [15] provided a data-aided channel estimation algorithm with local B-splines, and the results showed that there exists a minimum sampling interval proportional to the normalized fading rate for pilot insertion. However, when short bursts are considered, the data-aided channel estimation method can significantly increase the overhead-to-payload ratio. Similarly, low-complexity frequency-domain equalization for CPM [27][28][29] requires the addition of a cyclic prefix or unique words, which can also increase the overhead-to-payload ratio for short bursts.
As an alternative, blind channel equalization can recover the signal directly, without a training sequence. The author in [16] applied the Tong-Xu-Kailath algorithm to CPM by extracting the second-order statistics of the signal for channel estimation. The eigenvector method was used to identify the channel from a fourth-order cross-cumulant matrix under the GSM channel in [17], combined with turbo estimation. However, when applied to a low number of symbols, the statistical moments did not provide accurate channel estimation. In [18,30], the author developed a nonlinear signal model for GMSK rather than the conventional finite impulse response model. The information symbols were obtained by Bayesian inference based on Markov chain Monte Carlo (MCMC) with implicit channel estimation.
CPM and the multipath channel can construct a joint trellis, which can be represented by a finite state machine (FSM). Therefore, a forward adaptive SISO (FA-SISO) [19], which considers the channel correlation in only one direction, was proposed for MSK, which replaces the unknown channel by the least-mean-squared error for each hypothesis branch symbol. Then, due to the high complexity of FA-SISO, the author in [20] proposed various reduced-state A-SISO (RS-A-SISO) algorithms for complexity reduction at the same time. The thresholds of the RS-A-SISO algorithms were obtained by the density evolution technique in [21]. Another structure is the forward/backward adaptive algorithm. An exact expression for the soft metrics was derived when the unknown parameter was modeled as a Gauss-Markov process in [22], which can be estimated iteratively by the Kalman filter. The author in [23] employed the concept of bidirectional estimation in [22] and derived a generalized a posteriori probability of soft branch metrics.
The FSM can also be described by a hidden Markov model (HMM), and the Baum-Welch (BW)/EM algorithm allows for great likelihood estimation of the unknown parameters in the HMM. The batch-BW (BBW) algorithm, as well as two variants were proposed by Carles [24] for time-invariant channels. One is to split the received signal into several sub-blocks, producing different channel estimates in each, called the segmented batch-BW (SBBW) algorithm. However, the variant needs to avoid over-fragmentation because of a poor estimate from fewer data. An alternative algorithm called time-dependent BW (TDBW) was derived by introducing some linear constraints emerging from a linear FIR hypothesis on the channel. The author in [25] proposed an improved Baum-Welch algorithm to directly estimate the channel parameters, avoiding over-parameterization in the estimation problem. In [26], an algorithm for joint channel estimation and equalization by applying the Viterbi algorithm within an EM iteration was introduced, which was used to implement the E-step. However, the major drawbacks among the works cited above are relatively poor convergence with an inappropriate initializer and high complexity.
CPM serves as one of the preferred modulation schemes for the transmission of low data rates in the IoT uplink, suffering from the high complexity and poor convergence of the channel estimation at the receiver. Therefore, the paper proposes a low-complexity blind equalization algorithm for short-burst CPM signals based on the HMM. The proposed blind equalizer significantly outperforms the traditional one in complexity, while keeping a similar BER performance, which helps the device achieve online real-time detection. In general, as the spectrum resources are limited and the number of connected devices is increasing day by day, CPM is a promising modulation scheme, which is suitable for battery-powered devices and is expected to play an important role in the physical layer design of UAV-aided IoT communications.

Communications System Model
The communications system with blind turbo equalization is represented in Figure 1. It can be divided into three parts: transmitter, channel model, and blind turbo equalization: (1) Transmitter: A sequence of message bits m = (m 0 , m 1 , m 2 , ..., m K ) is encoded with the channel encoder into codeword b = (b 0 , b 1 , b 2 , ..., b V ) and interleaved as u = (u 0 , u 1 , u 2 , ..., u V ). Then, the block codewords are input to the M-ary CPM modulation module, which consists of the continuous-phase encoder (CPE) and memoryless modulator (MM). The baseband representation of CPM is written as where E s and T represent the symbol energy and symbol duration, respectively. α n is the CPM symbol sequence, containing n + 1 symbols. The information-bearing phase is defined as h is the modulation index, defined as the quotient of two relatively prime integers. q(t) is the phase pulse obtained by the integral of the phase pulse g(t). L is the length of the frequency pulse, where L = 1 is the full-response CPM and L > 1 is the partial response.
(2) Channel: The CPM transmitted over the channel is affected by the fading channel and Gaussian white noise; the continuous representation of the received signal is In digital processing, it is necessary to use a discrete-time representation. The sampling time is noted as T s = T/ε, where ε is the number of sampling numbers per symbol. Therefore, the discrete-time representation of CPM, the received signal, and the noise can be defined as: The CPM signal for the ith path is defined as where 0 i is a row vector containing i zero elements and discrete channel length where C n×l represents an nbyl complex matrix. The channel response is c = [c 0 , c 1 , ..., c N c ] T , and then, the discrete-time representation of the received signal in the form of a matrix can be written as r n = S n (α n )c + w n .
(3) Channel estimation and equalization: The CPM and multipath channel can build a joint trellis, which can be regarded as an FSM. According to Bayesian theory, the posterior probability is Since the receiver does not have a priori information about the channel, the probability density function p(c) can be regarded as a constant. The received signal is known at the receiver, and as a result, p(r) is also a constant. Therefore, we can write p(α, c|r) ∼ p(r|α, c)p(α), where With the perfect channel state information (CSI), maximum likelihood (ML) detection requires maximizing Equation (11). However, in practice, the channel response is unknown to the receiver and channel estimation is required.
The FSM can be described by a hidden Markov model because the input symbols are independent of each other. Therefore, the channel estimation and equalization for CPM can be transferred to the three classical problems of the hidden Markov model (HMM) to solve, where channel estimation can be regarded as the parameter learning problem of the HMM and channel equalization can be regarded as the decoding problem of the HMM.
(4) Turbo equalization: The receiver can be used in the form of turbo equalization because of the SISO Viterbi-like algorithm. The iterative exchange of information is repeated iteratively between the blind equalizer and the decoder for the same set of the received signal. The received signal r and the soft information L D (b t ) output from the decoder (initially set to 0) are input into the SISO blind equalizer, and then, the blind equalizer outputs the extrinsic LLR on the symbols L E (α k |r), which is fed to the decoder after interleaving. The decoder provides an estimation of the information bitsm using soft decisions and computes an extrinsic LLR on the coded bits L D (b t ).

Design of Burst Structure
The definition of the short in short bursts varies from a few tens to hundreds of bits in different literature [31]. The literature [32] designed a short convolutional code for machine-type communications, where the information word length was 64 bits and the codeword length was 128 bits. In [31], the authors considered a scenario with only a few tens of bits over the duration of a burst. The number of transmitted CPM symbols in a single burst is was symbols with a burst duration of 1 ms and a very high-frequency baud rate in [33]. The burst structure is shown in Figure 2, with the training sequence using only a small number of bits to quickly and reliably lockthe receiver, rather than for channel estimation in the conventional one. It is particularly important to note that the BER results in the simulation are a function of the average (information) bit-energy-to-noise-power density spectral ratio E b /N 0 . The following relationship holds where E s is the average symbol energy, M is the modulation base number of CPM, and R c is the code rate. A data-aided approach is used as a benchmark scheme in the subsequent simulations, with training bits for channel estimation and η tr = N tr /N all representing the power efficiency loss caused by the training bits.

Channel Model
For short bursts, a time-invariant channel was assumed throughout a burst when the UAV flies at medium and low speeds, and the channel in any two different bursts is independent. A typical communications scenario is shown in Figure 3. Considering the effects of topography, atmosphere, and other factors, the propagation shows the multipath fading problem. In a ground-to-air scenario for UAV-aided IoT communications, the UAV adopts the hover-and-fly method for data collection via the IoT uplink. In the communication between the UAV and the ground nodes, the multipath channel model can be described by a Ricean fading channel with the probability distribution of the Rician distribution: where r is the envelope of the fading signal, α LOS is the amplitude of the LOS component, σ 2 represents the energy of the NLOS component, and I 0 is a modified Bessel function of the first kind with order zero. Rician factor K Ricean is defined as the power ratio of the main signal to the multipath components, where K Ricean = α 2 LOS /σ 2 . In the IoT-aided UAV communications scenario, this paper refers to [34] for the air-ground (AG) channel at 5060 MHz (C baud), which corresponds to a Ricean factor of 29 dB. Then, the two-ray Rician channel model, including the LOS and NOLS components, can be specified as

EM Algorithm
The EM algorithm is divided into two steps, one of which is the expectation step (E-step) and the other is the maximization step (M step). For a blind equalization problem, the observations r and transmitted symbols α can be written as a new data vector z = (r, α). The EM algorithm then consists of the following iteration. E-step: Compute the mean log-likelihood function: Q(θ|θ (m) ) = E a [ln f r n (r n |α, θ)|r n , θ (m) ].
where f (·) represents a conditional probability density. M step: Compute new estimates of parameters:

VA and Its Variants
The Viterbi algorithm (VA) was proposed in 1967 for maximum likelihood (ML) detection of convolutional codes, searching for the ML paths by dynamic programming. At each symbol duration, the path metrics of all paths entering the current state are compared by the VA, and the path with the largest metric is selected for that state, called the survivor path. Finally, the VA outputs the information bits corresponding to the survivor path, which is named the ML path. Thus, the computational complexity of the VA is O (M L N), where M is the number of states of the finite-state machine, L is the memory length, and N is the length of the sequence. When the number of states in the trellis is too large, searching on the original full trellis for ML paths requires a large number of computational resources. It is not always necessary to traverse the full trellis when only the ML path is searched, especially in the case of a high signal-to-noise ratio.
Before the VA, sequential decoding algorithms are used in decoding the convolutional code, which are essentially greedy algorithms. They only search for branches that are possible to become the ML path, which has low complexity, but cannot guarantee finding the ML path. Therefore, a variant of the VA algorithm, the Lazy VA algorithm [35,36], was developed. A priority queue (PQ) data structure is introduced to ensure finding ML paths. Compared to the standard VA, the Lazy VA is far better than the Viterbi algorithm in complexity and no worse than the VA in the worst case. For the ease of writing, the VA and Lazy VA are denoted as Viterbi-like algorithms in this paper, and the Lazy Viterbi algorithm is summarized in Algorithm 1. [36]. 1: The trellis is set to empty, and the PQ contains the initial node n s with initialized cumulative metric acc(n s ) = 0. 2: Pop the top node n 1 of PQ. 3: if n 1 is the same as some node of the trellis then 4: Discard n 1 5: else if n 1 with the smallest metric in the PQ then 6: Output as the current node 7: else if n 1 is not the last node then 8: Insert its successors into PQ, and return to Step 2 9: else 10: Trace back the ML path and output hard/soft decision bits 11: end if

The EMVA/EMLVA Blind Equalizer
With these definitions in mind, the Lazy VA in Algorithm 1 is used to implement the E-step for the EM algorithm. The resulting algorithm is denoted as the expectationmaximization Lazy VA (EMLVA) soft blind equalizer, and it can be found in Algorithm 2, which can be implemented as follows: Calculate the probability of each path by (18) 7: Compute the expectation of the logarithmic likelihood function by (19) 8: Maximize the expectation and compute the estimated parameters by (21) and (22) where Λ (s) q is the path metric for the qth survivor path, η (s) is a constant to prevent these metrics from becoming too large, which is usually selected as the minimum path metric, and the normalizing factor z (s) satisfies ∑ q P (s) q = 1.
(3) Obtain the complete data expected log-likelihood function: where d is a constant independent of the estimated parameter θ.
(4) Maximize the likelihood function for θ = (c, N 0 ): where (5) With the values (c (S) , N 0 (S) ) obtained after the EMV algorithm, make soft decisions as to which sequence of channel symbols or output soft information in input into the decoder.
Similarly, the standard VA used to implement the E-step for the EM algorithm is referred to as EMVA. A block diagram of the EMLVA/EMVA equalizer is included in Figure 4, where the grey boxes represent the initializer.

BCA Method and Convergence Criterion
Since the EM algorithm is sensitive to the initializer and the channel is different in any two different bursts, a single fixed initializer has a poor convergence, which falls into an initial value trap, making it difficult to track the channel. Therefore, a set of initializers based on the BCA method [37] is used in this paper. For complex channel response, global convergence is ensured with a high probability if the initializer contains only one nonzero unit real-valued tap and one nonzero unit purely imaginary tap, located at the appropriate locations. Therefore, for a complex channel response of length l, there are [(2l + 1)] 2 initializers for a set, and the number is reduced to (2l + 1) for a real-valued channel. The set of initializers is denoted as the BCA initializer, and the other is called a single initializer.
When the set of initializers has been traversed, a set of estimated parameters is obtained, and then, the optimal estimated parameters need to be selected. For the kth initializer, the EMV equalizer converges toθ k = (ĉ,N 0 ) k , and the evaluation of the likelihood function can be expressed as where The best estimated parameters can be selected by maximizing L(θ k ), i.e., θ = θˆk,k = argmaxL(θ k ). (27) 4.5. The Turbo EMLVA Blind Equalizer and Positive Feedback Figure 4b shows that the detector can further improve performance through turbo equalization in Section 3.1, which can effectively improve the bit error rate performance and convergence of the algorithm. A turbo equalization EMV algorithm based on the BCA initializer is presented in Algorithm 3. To distinguish the iteration of turbo equalization from the iteration of the EM algorithm, the turbo equalization between CPM-SISO and Decoder-SISO is denoted as the outer iteration, and the iteration of the EM algorithm is called the inner iteration. The algorithm is as follows:
1: Input: the received signal r 2: Set the maximum outer iteration equalization times T 3: for t = 0 to T do 4: With the current initialization to θ (0) = (c (0) , N 0 (0) ), run Algorithm 2 5: Compute an estimate to the extrinsic LLRs L E (b t |(r)). Feed L E (b t |(r)) to the channel decoder 6: From the channel decoder per-bit soft-output, recompute LLRs for each symbol L D (α k ) 7: end for 8 (4) After traversing all the initializers, the optimal estimated parameter is selected by the convergence criterion in Equation (27) for the blind equalization, and the corresponding soft symbol information L E (α) is demapped and deinterleaved to obtain the soft information L E (b t ) of the information bits, which is input into the channel decoder as a priori information.
(5) When T < T max , an extrinsic LLR on the coded bits L D (b) is mapped and interleaved again and delivered back to the CPM blind equalization as the updated a priori probability. Steps 3 to 5 are repeated for a given maximum number of iterations T max .
For turbo equalization, the inner iteration using different Viterbi algorithms is denoted as T-EMVA and T-EMLVA, respectively. It should be noted that under the condition of a short burst, the coded CPM system exhibits positive feedback in the process of outer iteration and convergence to a suboptimal solution by directly exchanging extrinsic information. Therefore, extrinsic information exchange methods play an important role in the convergence of the outer iteration, and the performance of equalization can be further improved by superior extrinsic information exchange methods.

Complexity Analysis
The computational complexity of the EMV based on the BCA initializer mainly comes from the Viterbi-like algorithm of the E-step. Only considering this part, the complexity of the EMVA is O([2(D + 1) 2 SpM L+L c −1 N]), and the complexity after adding the outer iteration of turbo equalization is O([2(D + 1)] 2 SpM L+L c −1 N + TSM L+L c −1 N), where D is the channel memory length, L c is the lengthin terms of the symbol time, L is the memory length of the CPM frequency pulse, M is the base number of CPM, S is the maximum number of inner iterations, T is the maximum number of external iterations, and N is the sequence length.
To reduce the calculation amount, the EMLVA can be implemented by applying the Lazy VA in the E-step. In the best case, the complexity can be reduced to O([2(D + 1)] 2 SN), and T-EMLVA has complexity of O([2(D + 1)] 2 SN + TSN). After adding the iteration stopping criterion, the number of outer iterations can be further reduced and the decoding delay can be effectively reduced. The average number of iterations of turbo equalization after adding the stopping criterion can be denoted as T ave , and the computational amount is O([2(D + 1)] 2 SN + T ave SN). A detailed comparison of the complexity of the T-EMVA and the T-EMLVA is included in Table 2. Table 2. Complexity comparison of the algorithms.

Experimental Setup
For the evaluation of the proposed blind turbo equalization, the parameters are summarized in Table 3 and the burst structure is shown in Figure 2. GMSK with the modulation index h = 1/2 was taken as an example for the experiment and simulation. The channel was modeled as a two-ray Ricean channel model, which set the maximum delay of 7 µs; to better simulate it in the hilly/mountainous scenarios for UAV-aided IoT systems, the Ricean factor was chosen as 29 dB. The sample time with a baud rate of 150 kHz was 3.3 µs, and then, the symbol duration was T = 6.6 µs. For the maximum delay 7 µs, the channel length at the receiver was D + 1 = 7/3.3 + 1 = 3 taps, while the channel length in terms of symbol duration was L c = D/ε + 1 = 2; the sign represents rounding down. Outer iteration (T) 10 T max

System Parameter Optimization
A serially concatenated CPM (SCCPM) setup with iterative decoding was considered in the numerical simulations. In view of the insufficiency of interleaving for the short burst, the optimized parameters of the convolutional code (CC) and interleaver need to be selected.
The number of states and the free distance of a CC can affect the interleaving gain and the performance of the iterative detection. In general, a CC with a larger number of states has a larger free distance and can further provide more interleaving gain. However, this only occurs when the interleaving length is large enough and the signal-to-noise ratio is above the convergence threshold. When the frame length is short, a CC with a large number of states exhibits poor convergence, leading to performance degradation. Therefore, Figure 5a compares the BER performance of five types of CC, i.e., (5,7), (17,15), (23,35), (53,75), and (171,131) after 0 outer iterations and 10 outer iterations. The result shows that (5,7) CC is the optimal channel code for the SCCPM system, taking into account the burst length, signal-to-noise ratio, and implementation complexity.
The interleaver was used to randomize the order of the code bits before transmission. To select the appropriate interleaver to enhance the performance gain, Figure 5b compares the performance of six types of interleavers: matrix interleaver, random interleaver, S-random interleaver, general interleaver, QPP interleaver, and WCDMA interleaver. The analysis of the BER curves for 0 and 10 outer iterations showed that the S-random interleaver can provide the best coding gain and performance. Therefore, the S-random interleaver was used for subsequent simulations.  Figure 6a compares the BER performance of the data-aided coherent detection, EMVA, and EMLVA proposed in this paper with different inner and outer iterations, as well as the SOVA with perfect channel state information (CSI). The BER curves with perfect channel knowledge represent the maximum attainable performance, while the curves for the data-aided method (called Tran.Bits.) represent the baseline to beat, which extracts the channel by using maximum likelihood estimation with training sequence lengths of 4, 6, or 8 bits, respectively. The performance of the data-aided method was lower than that of the blind equalization EMLVA proposed in this paper after 10 outer iterations, indicating that the data-aided approach has difficulty obtaining an accurate channel with few training bits. Compared to the traditional EMVA, the performance loss of the EMLVA was small, which can be ignored and approximates the curve with perfect channel knowledge after 10 outer iterations. As seen in Figure 6b, the BCA-initializer-based EMLVA can ensure good convergence of the blind equalization with a fixed inner iteration of three. More outer iterations did not improve the performance significantly after about five iterations, which can be viewed as convergence. On the contrary, when a single initializer was used, the BER curves decreased slowly as E b /N 0 , and there was not a considerable performance improvement in the iterative process. The performance loss can be explained by the poor convergence to track changing channels in short bursts. Figure 7 analyses the normalized-mean-squared error (NMSE) of the EMVA and EMLVA based on the BCA initializer, as well as the NMSE of the EMLVA using a single initializer and the NMSE based on the training bits method. It can be seen that the proposed EMLVA equalizer with the BCA initializer significantly outperformed the EMVA in complexity while keeping a similar estimation accuracy as the EMVA. Moreover, the NMSE of the EMLVA based on a single initializer had poor convergence, and there was no decreasing trend at high E b /N 0 . At the same time, the NMSE curves of the data-aided method with few training bits exhibited poor convergence and estimation accuracy, although there was a decreasing trend.

Improved Exchange Methods of Extrinsic Information and Stopping Criterion
Direct exchange is an original method, which can be further improved by weighted exchange. In weighted exchange, the extrinsic information L E is sent to the weighted function module before being delivered to the other SISO module. The weighted function is expressed as: In order to evaluate the effect of different weighting coefficients, the simulation was implemented with values of α ∈ [0.8, 1.0] and β ∈ [0.001, 0.01]. The BER performance with various weighted coefficient combinations of α and β is shown in Figure 8a-c. Figure 8d shows the BER as a function of E b /N 0 for different extrinsic information, including weighted extrinsic information, average extrinsic information, and direct methods, with or without the HDA stopping criterion. The results show that there was little loss of performance of the algorithm with HAD criterion and the combination of α = 0.9 and β = 0.01 had the best BER performance.

The Effect of Channel Length Overestimation on the Performance
Finally, the effect of channel length overestimation on the performance was investigated. Up till this point, the channel length was assumed known. However, in practice, this is not the case. The BER performance of the EMLVA with different initializer schemes is analyzed in Figure 9a when the assumed channel length is correct (N c = 0) and when it is overestimated (N c = 1, 2, 3). The results illustrate the apparent robustness of the EMLVA with the BCA initializer compared to the EMLVA with a single initializer.  Figure 9b shows the number of expansions performed by the EMLVA and EMVA as a function of E b /N 0 , where the channel overestimation length is expressed in symbol time. It is easy to see that the number of states traversed by the EMVA increased exponentially with the overestimation length, while the number of states searched by the EMLVA remained constant for high E b /N 0 . Even in the event of severe channel length overestimation, the complexity of the EMLVA can be decoupled from the memory length above a certain E b /N 0 . Therefore, Figure 9b shows the superiority of the EMLVA proposed in this paper in terms of complexity, which can meet the computational requirements of low complexity for IoT nodes. Average trellis nodes per symbol (b) Figure 9. The performance of the EMLVA when the channel length is overestimated. (a) BER as a function of E b /N 0 for the blind equalizer in a Ricean fading channel when the channel length at the receiver is known (N c = 0) and when it is overestimated (N c = 1, 2, 3). BER as a function of E b /N 0 when the channel length is overestimated. (b) Average number of expanded trellis nodes per information symbol as a function of E b /N 0 when the channel length at the receiver is known (L c = 0) and when it is overestimated in terms of the symbol time (L c = 1, 2, 3). Average trellis nodes per symbol as a function of E b /N 0 when the channel length is overestimated.

Discussion
In this paper, we focused our attention on the physical layer design of a UAV-aided IoT communications system. In recent years, an emerging and revolutionizing technology, RIS, can significantly improve communication performance by smartly reconfiguring the wireless propagation environment [38].
In the non-LOS scenarios of UAV-aided IoT, the RIS can be applied to maximize the received power of the user to keep the connection. In the presence of eavesdroppers, the reflected signal by the RIS can be tuned to cancel out the signal from the sensors node at the eavesdropper by smartly adjusting the reflection coefficients [39]. Apart from this, when the RIS is applied in the UAV-aided IoT communications system, the structure of the transmitter and receiver can be simplified, further meeting the requirements of low power and cost [40]. Predictably, the application of the RIS to future UAV-aided IoT communications systems will fundamentally change their architecture and significantly improve their performance.

Conclusions
In this paper, the EMLVA based on the HMM was proposed as an efficient approach to blind equalization of short CPM bursts in a UAV-aided IoT communications scenario. The proposed method significantly outperformed the HMM blind equalization based on the standard Viterbi algorithm in complexity, while keeping a similar BER performance as the conventional one. An initializer was developed based on the BCA method, which ensured good convergence of channel estimation, called the BCA initializer. The EMLVA equalization can be further improved by using the decoder output to propose the turbo EMLVA equalization. Considering the convergence and relatively large decoding delay in outer turbo iterations, weighted extrinsic information with the HDA stopping criterion is proposed for iterative detection. The simulation results showed that the proposed blind turbo equalization achieved an excellent trade-off between complexity and performance, verifying its advantages and practical values against other conventional methods.
Author Contributions: Writing-original draft preparation, Z.P.; writing-review and editing, C.X., H.W., Y.W. and D.G.; visualization, Z.P.; Methodology, D.G. All authors have read and agreed to the published version of the manuscript.