Digital Self-Interference Canceler with Joint Channel Estimator for Simultaneous Transmit and Receive System

Simultaneous transmit and receive wireless communications have been highlighted for their potential to double the spectral efficiency. However, it is necessary to mitigate self-interference (SI). Considering both the SI channel and remote transmission (RT) channel need to be estimated before equalizing the received signal, we propose two adaptive algorithms for linear and nonlinear self-interference cancellation (SIC), based on a multi-layered joint channel estimator structure. The proposed algorithms estimate the RT channel while performing SIC, and the multi-layered structure ensures improved performance across various interference-to-signal ratios. The M-estimate function enhances the robustness of the algorithm, allowing it to converge even when affected by impulsive noise. For nonlinear SIC, this paper introduces an adaptive algorithm based on generalized Hammerstein polynomial basis functions. The simulation results indicate that this approach achieves a better convergence speed and normalized mean squared difference compared to existing SIC methods, leading to a lower system bit error rate.


Introduction
Full duplex (FD) relay technology enables a communication system to simultaneously transmit and receive signals on the same frequency through a relay node [1]. Inspired by this technology, simultaneous transmit and receive (STAR) technology, also referred to as in-band full-duplex (IBFD), has emerged as a novel innovation within the field. It aims to allow communication users to directly transmit and receive signals at the same carrier frequency, facilitating an even more efficient use of spectral resources. Moreover, STAR technology can be synergistically combined with other advanced technologies that enhance spectral utilization, such as reconfigurable intelligent surfaces [2], millimeter-wave technology [3], and multiple access techniques [4]. As a result, STAR technology has arisen as a promising approach to increase data throughput and solve the problem of scarce spectrum resources, which can be applied in the field of wireless communications, radar, satellites, and unmanned aerial vehicles [5,6].
However, the self-interference (SI) signal, which is generated from the near end, can be 100-120 dB stronger than the weak received signal, thus making the receiver inoperable [7]. Therefore, self-interference cancellation (SIC) technology is key to the implementation of STAR communications. At present, the SI signal can be eliminated in several stages and various domains, as shown in Figure 1. Typically, SIC is achieved through the cascaded antenna domain [8], analog domain [9], and digital domain [10-15]. The antenna domain SIC is passive in nature, and mainly uses antenna isolation or related beamforming algorithms to prevent RF reception path blockage. In the analog domain, directly coupled interference suppression and digital-assisted analog interference-suppression methods are commonly used at the receiver low-noise amplifier (LNA) input before the analog-to-digital conversion. The last stage is digital SIC, which aims to eliminate the remaining SI from the received signal by reconstructing the SI signal in the receiving link and reducing its power below the noise floor. These steps have been applied in some prototypes and have demonstrated the potential of STAR communications [7,16]. The remainder of this paper focuses on digital SIC algorithm design, and both the linear and nonlinear SIC algorithms are evaluated through extensive simulations, demonstrating their effectiveness in improving the performance of STAR communication systems.
Digital SIC has attracted significant attention in recent years as a way to address the SI problem in STAR communication systems. Nonetheless, the practical application of digital SIC still faces many challenges. First, the observed SI signal exhibits significant frequency and time selectivity due to reflections within the device and its surrounding environment, which vary over time. This can be regarded as a classic system identification problem, which is typically solved through adaptive finite impulse response (FIR) filters. Second, in practical communication environments, there are often additional sources of noise, such as electromagnetic interference, lightning noise, radar signals, and other human-made disturbances. These types of noise typically exhibit strong pulse-like characteristics within very short time durations, which can degrade the performance of filtering algorithms based on the assumption of Gaussian noise. Furthermore, the significantly higher power of the SI signal compared to the desired signal at the receiver implies that even minor distortions can significantly degrade the signal of interest. Such distortions arise from hardware impairments, including phase noise, in-phase/quadrature (I/Q) imbalance, and power amplifier (PA) and baseband nonlinearity, among others. PA nonlinearity is especially harmful to the digital canceler and can make digital SIC insufficient.
Digital cancellation algorithms accounting for impulsive noise have been proposed in earlier research [10,17]. The work in [11,12] focused on linear SIC by employing the least mean square (LMS) adaptive filter. In addition, different linear SIC methods were compared in [13]. To compensate for nonlinear PA distortion, the authors in [18] proposed novel predistorters and their parameter extraction algorithms. The proposed self-adaptive nonlinear digital cancelers in [14,15] both utilize a novel orthogonalization procedure for nonlinear basis functions, together with low-cost LMS-based parameter learning. Both PA nonlinearity and I/Q imbalances were considered in [19,20], and more impairments, including local oscillator leakage and baseband nonlinearity, were analyzed and measured on the Universal Software Radio Peripheral in [7]. Some studies have also proposed various frequency domain SIC methods [21,22]. In [21], a frequency domain SI canceler based on the parallel Hammerstein (PH) model for an IBFD orthogonal frequency division multiplexing system was proposed, while Ref. [22] utilized a successive cancellation cascaded structure as a replacement for the adaptive filter, effectively reducing the computational load of SIC processing and simplifying the procedure. Most published works are based on the scenario where only the own transmitter (TX) signal is known, without considering the remote transmission (RT) signal. When the RT sends a signal for channel estimation, it can be assumed to be known, and additional prior information will further enhance the performance of SIC [23].
In this paper, a robust multi-layered M-estimate total least mean square (m-MTLS) canceler is proposed to enhance the performance in scenarios where both the data matrix and the observation signal are affected by impulsive noise. The method is robust against impulsive noise and maintains a low convergence error in its presence. When both the PA nonlinearities of the local transmitter and the training signal from the RT are taken into account, we introduce a novel algorithm for nonlinear digital SIC based on a set of generalized Hammerstein polynomials (HPs) [15], which estimates the SI channel and the RT channel at the same time. Compared with existing SIC methods, this joint estimation can estimate the remote channel more accurately while reducing the residual SI signal. The simulation results show that this method exhibits a faster convergence rate and that the system performance improves as the signal-to-noise ratio (SNR) increases.
The rest of this paper is organized as follows. In Section 2, the basic STAR system and linear SI signal modeling are first described. Then, considering the nonlinearity introduced by the PA, the SI model is extended to a nonlinear model, followed by the introduction of a joint estimator that can simultaneously estimate both SI and RT channels. Section 3 proposes two new algorithms for linear and nonlinear SIC based on adaptive FIR, respectively. The linear SIC also enhances the robustness of the algorithm under impulsive noise interference. The simulation results are reported in Section 4. Finally, Section 5 concludes the paper.
In the following, continuous and discrete time signals are expressed in italic lowercase, while vectors are denoted by bold lowercase symbols. The acronyms used throughout the paper are also summarized in Table 1.

System Model
The structure of the analyzed STAR system is presented in Figure 2, with signals propagating at the different stages. We denote the baseband transmission at time slot n as x[n], which is perfectly known. In this analysis, the RF components are first assumed to be ideal; the nonlinear model is discussed later. The received signal, after being processed by the LNA and subjected to SIC in the analog domain, is converted into a baseband signal by the ADC at time slot n and can be written as

$$ y[n] = s[n] + r[n] + v[n], \tag{1} $$

where s[n] is the SI signal, and r[n] and v[n] denote the desired signal from the remote source and the noise at time instant n, respectively. The received signal is assumed to be generated by a function f(·) of the L previous transmitted signals. Furthermore, the SI signal from the local TX, s[n], can be represented by a linear model as

$$ s[n] = \mathbf{w}^T \mathbf{x}[n], \tag{2} $$

where $\mathbf{w} \in \mathbb{R}^{N \times 1}$ denotes the SI channel and $\mathbf{x}[n] = [x[n], x[n-1], \ldots, x[n-N+1]]^T$.

When the nonlinear hardware impairments are taken into account, one particularly important imperfection is PA nonlinearity; thus, we assume that the transmitted signal is only distorted by high-order harmonics of the PA, excluding the effects of I/Q imbalance or phase noise. PAs exhibit nonlinearity when power-efficient operation is sought, and this is commonly modeled with the well-known PH model [14,19,22,24]. With an input vector $\mathbf{x}_L[n]$, the output of a P-th order model can be expressed as

$$ x_{\mathrm{PA}}(t) = \sum_{p=1}^{P} h_{p,\mathrm{PA}}(t) * \phi_p(x(t)), \tag{3} $$

where P is the highest nonlinearity order of the model, $h_{p,\mathrm{PA}}(t)$ represents the response, and $\phi_p(\cdot)$ is the pth-order basis function. The symbol * represents the convolution operation. We assume that the PA memory length is $L_1$ and the length of the SI channel is $L_2$. Then, the observed SI signal passes through the SI channel and can be rewritten as

$$ s[n] = \sum_{p=1}^{P} \mathbf{w}_p^T \phi_p(\mathbf{x}_L[n]), \tag{4} $$

where $\mathbf{w}_p$ is the response weight of the entire channel, with a length of $L = L_1 + L_2 - 1$, and $\phi_p(\cdot)$ is applied element-wise to $\mathbf{x}_L[n]$.

Building on the analysis presented earlier, mitigating the residual SI in the digital domain requires the optimal estimate of the SI channel response, denoted as ŵ[n], corresponding to w[n]. The reconstructed SI signal is then subtracted from y[n]. Following the SIC process, the remainder of the signal is presumed to embody the RT signal. Subsequent stages may include RT channel estimation and equalization, facilitating the detection of data symbols embedded within the RT signal.
From the SI model in (2) and (4), it is evident that the performance of digital SIC is fundamentally contingent upon the accuracy of the channel response coefficient estimation. Moreover, the power of the residual SI signal remains high after RF cancellation. When the interference-to-signal ratio (ISR) is high, even minimal residuals can severely compromise the separation of the RT signal. Additionally, at higher SNRs, noise ceases to be the predominant factor impacting the estimation of the SI signal. Under such circumstances, the primary source of error in SIC emanates from the RT signal itself. Consequently, in scenarios where the RT signal is known, it becomes essential to employ a joint channel estimator to enhance the precision of both SI and RT signal separations, thereby mitigating the aforementioned challenges. We assume that the original RT signal, i[n], is known; then, following [23], (1) and (2) can be rewritten as

$$ y[n] = \mathbf{w}^T \mathbf{x}[n] + \mathbf{h}^T \mathbf{i}[n] + v[n], \tag{5} $$

where $\mathbf{h}$ denotes the RT channel coefficients, $\mathbf{i}[n]$ collects the most recent known RT samples, and the signal of interest, r[n], is represented as

$$ r[n] = \mathbf{h}^T \mathbf{i}[n]. $$

By merging the two channels into a joint channel, according to (5), we can obtain the following expression:

$$ y[n] = \mathbf{c}^T \left[ \mathbf{x}^T[n], \mathbf{i}^T[n] \right]^T + v[n], $$

where $\mathbf{c}^T = [\mathbf{w}^T, \mathbf{h}^T]$ combines the coefficients of both the SI and RT channels. When the SI channel is estimated alone, the RT signal acts as an additional disturbance. However, by combining the SI and RT channels as a new estimation parameter, only the noise v[n] impacts this process. The benefits of this joint channel estimation approach will be detailed in the simulations presented in Section 4.

Proposed Digital Canceler
When considering the SI and RT channels, it is important to acknowledge that they will vary over time, as the environmental reflection paths are also time-variant. Therefore, we opted for adaptive filtering algorithms and incorporated a multi-layered structure, as shown in Figure 3, to enhance the RT estimation performance of the joint estimator.

As depicted in Figure 3, the estimate of the joint channel coefficients in the first layer is given by $\hat{\mathbf{c}}_{(1)}^T[n] = [\hat{\mathbf{w}}_{(1)}^T[n], \hat{\mathbf{h}}_{(1)}^T[n]]$; then, the residual signal $y_{(2)}[n]$, obtained by subtracting the reconstructed SI signal from y[n], serves as the input for the second layer. Furthermore, we can obtain

$$ y_{(2)}[n] = y[n] - \hat{\mathbf{w}}_{(1)}^T[n]\,\mathbf{x}[n]. $$

This procedure is consistently implemented across successive layers of the filter. In the m-th layer, this process is articulated as

$$ y_{(m+1)}[n] = y_{(m)}[n] - \hat{\mathbf{w}}_{(m)}^T[n]\,\mathbf{x}[n], $$

where $y_{(m)}[n]$ is the input of the m-th layer and $\hat{\mathbf{w}}_{(m)}[n]$ is the estimate of the SI channel, and we have $\hat{\mathbf{c}}_{(m)}^T[n] = [\hat{\mathbf{w}}_{(m)}^T[n], \hat{\mathbf{h}}_{(m)}^T[n]]$ as the estimate of the joint channel. In each layer, an estimate of the RT channel $\hat{\mathbf{h}}_{(m)}[n]$ is obtained while the reconstructed SI is subtracted from the layer input, and the output of the last layer, $y_{(M+1)}[n]$, is regarded as the signal of interest r[n].
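The layered structure described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the paper's implementation: it uses a plain normalized LMS update in each layer (a swapped-in choice for numerical convenience), real-valued BPSK signals, and hypothetical names (`nlms_joint_layer`, `multilayer_sic`, `simulate`) and channel values chosen only for the demonstration.

```python
import random

def nlms_joint_layer(y, x, i_sig, n_si, n_rt, mu=0.5, eps=1e-6):
    """One layer: jointly estimate SI taps w and RT taps h, remove only the SI part."""
    w = [0.0] * n_si
    h = [0.0] * n_rt
    residual = []
    for n in range(len(y)):
        # Regressors: most recent local (x) and remote (i) samples, zero-padded.
        xv = [x[n - k] if n - k >= 0 else 0.0 for k in range(n_si)]
        iv = [i_sig[n - k] if n - k >= 0 else 0.0 for k in range(n_rt)]
        si_hat = sum(a * b for a, b in zip(w, xv))
        rt_hat = sum(a * b for a, b in zip(h, iv))
        e = y[n] - si_hat - rt_hat            # error of the joint model
        norm = eps + sum(v * v for v in xv) + sum(v * v for v in iv)
        w = [a + mu * e * b / norm for a, b in zip(w, xv)]
        h = [a + mu * e * b / norm for a, b in zip(h, iv)]
        residual.append(y[n] - si_hat)        # subtract the reconstructed SI only
    return residual, w, h

def multilayer_sic(y, x, i_sig, n_si, n_rt, layers=2):
    """Cascade the layers: the output of layer m is the input of layer m + 1."""
    y_m, h = y, [0.0] * n_rt
    for _ in range(layers):
        y_m, _, h = nlms_joint_layer(y_m, x, i_sig, n_si, n_rt)
    return y_m, h

def simulate(seed, n=3000, noise_std=0.05):
    """Hypothetical test scenario: strong SI channel, weak RT channel, BPSK data."""
    rng = random.Random(seed)
    w_true = [8.0, -5.0, 3.0, 1.0]
    h_true = [0.9, -0.4, 0.2]
    x = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    i_sig = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    y = []
    for k in range(n):
        s = sum(w_true[d] * x[k - d] for d in range(len(w_true)) if k - d >= 0)
        r = sum(h_true[d] * i_sig[k - d] for d in range(len(h_true)) if k - d >= 0)
        y.append(s + r + rng.gauss(0.0, noise_std))
    return y, x, i_sig, h_true

y, x, i_sig, h_true = simulate(seed=1)
cleaned, h_hat = multilayer_sic(y, x, i_sig, n_si=4, n_rt=3, layers=2)
print("estimated RT channel:", [round(v, 3) for v in h_hat])
```

Each layer re-estimates the joint vector on its own input, so later layers only see whatever SI the earlier layers failed to remove.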

Robust m-MTLS Adaptive Algorithm
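We consider a linear model where both the independent and dependent variables are subject to measurement errors, which can be illustrated by

$$ \tilde{y}_n = \mathbf{w}^T \mathbf{x}_n + v_n, \qquad \tilde{\mathbf{x}}_n = \mathbf{x}_n + \mathbf{u}_n, \tag{9} $$

where $\mathbf{w} \in \mathbb{R}^{L \times 1}$ denotes the system vector to be estimated, with $\mathbf{x}_n \in \mathbb{R}^{L \times 1}$ as the input vector and $y_n = \mathbf{w}^T \mathbf{x}_n \in \mathbb{R}$ as the output signal at time n. The noise vectors $\mathbf{u}_n \in \mathbb{R}^{L \times 1}$, distributed as $\mathcal{N}(\mathbf{0}, \sigma_i^2 \mathbf{I})$, represent the input noise, and $v_n \in \mathbb{R}$, distributed as $\mathcal{N}(0, \sigma_o^2)$, represents the output noise, where $\sigma_i^2$ and $\sigma_o^2$ are the variances of the input and output noise, respectively. Here, $\mathbf{0}$ indicates the zero vector and $\mathbf{I}$ the identity matrix.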
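Total least squares (TLS) considers errors in all variables, making it particularly useful for more accurate modeling and estimation when errors are present in both predictors and outcomes [25]. Its cost function can be formulated as

$$ J(\mathbf{w}) = \frac{1}{2}\, \frac{E\left[ \left( \tilde{y}_n - \mathbf{w}^T \tilde{\mathbf{x}}_n \right)^2 \right]}{\|\mathbf{w}\|^2 + \gamma}, $$

where $\gamma = \sigma_o^2 / \sigma_i^2$ is a parameter to normalize the noise variances, $\tilde{y}_n = y_n + v_n$ and $\tilde{\mathbf{x}}_n = \mathbf{x}_n + \mathbf{u}_n$.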
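To make the TLS cost concrete, the following sketch evaluates an instantaneous cost and its gradient and takes one stochastic-gradient step. It assumes the commonly used Rayleigh-quotient form, J = e² / (2(‖w‖² + γ)) with e = ỹ − wᵀx̃ and γ a noise-variance ratio; both the exact scaling and the function names (`tls_cost`, `tls_gradient`, `tls_step`) are assumptions made for illustration.

```python
def tls_cost(w, x_t, y_t, gamma):
    """Instantaneous TLS cost e^2 / (2 (||w||^2 + gamma)) (assumed form)."""
    e = y_t - sum(a * b for a, b in zip(w, x_t))
    d = sum(a * a for a in w) + gamma
    return 0.5 * e * e / d

def tls_gradient(w, x_t, y_t, gamma):
    """Gradient of tls_cost with respect to w, derived by the quotient rule."""
    e = y_t - sum(a * b for a, b in zip(w, x_t))
    d = sum(a * a for a in w) + gamma
    return [-(e * xk * d + e * e * wk) / (d * d) for xk, wk in zip(x_t, w)]

def tls_step(w, x_t, y_t, gamma, mu):
    """One stochastic-gradient update: w <- w - mu * gradient."""
    g = tls_gradient(w, x_t, y_t, gamma)
    return [wk - mu * gk for wk, gk in zip(w, g)]

# Example: a single update on one noisy observation pair.
w0 = [0.5, -0.2, 0.1]
w1 = tls_step(w0, x_t=[1.0, 2.0, -1.0], y_t=0.7, gamma=1.0, mu=0.1)
print([round(v, 4) for v in w1])
```

Compared with LMS, the extra e²·w term in the gradient accounts for the noise on the input samples.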
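Similar to other adaptive algorithms, the iterative formula for TLS based on stochastic gradient descent can be expressed as

$$ \hat{\mathbf{w}}_{n+1} = \hat{\mathbf{w}}_n - \mu\, \hat{\mathbf{g}}_n, $$

where µ is the step-size parameter, and $\hat{\mathbf{g}}_n$ is the instantaneous gradient of J(w). With the error $e_n = \tilde{y}_n - \hat{\mathbf{w}}_n^T \tilde{\mathbf{x}}_n$ and the noise-variance normalization parameter γ of the TLS cost, it can be represented as

$$ \hat{\mathbf{g}}_n = -\frac{e_n \tilde{\mathbf{x}}_n \left( \|\hat{\mathbf{w}}_n\|^2 + \gamma \right) + e_n^2\, \hat{\mathbf{w}}_n}{\left( \|\hat{\mathbf{w}}_n\|^2 + \gamma \right)^2}. $$

To enhance the robustness of the algorithm when the received signal is subjected to impulsive noise, an M-estimate function can be utilized to improve TLS. This improved algorithm, referred to as the M-estimate total least mean square (MTLS) adaptive algorithm, has its cost function defined as

$$ J_M(\mathbf{w}) = E\left[ \rho(\epsilon_n) \right], \qquad \epsilon_n = \frac{\tilde{y}_n - \mathbf{w}^T \tilde{\mathbf{x}}_n}{\sqrt{\|\mathbf{w}\|^2 + \gamma}}, $$

where ρ(·) is the M-estimate function, which is a real-valued even function, given by

$$ \rho(\epsilon) = \begin{cases} \epsilon^2 / 2, & |\epsilon| < \xi, \\ \xi^2 / 2, & |\epsilon| \geq \xi, \end{cases} $$

where $\xi = c_1 \hat{\sigma}_e$ represents the threshold parameter, with $c_1$ set to 2.576, and $\hat{\sigma}_e$ can be calculated by

$$ \hat{\sigma}_e^2[n] = \lambda_\sigma\, \hat{\sigma}_e^2[n-1] + c_2 (1 - \lambda_\sigma)\, \mathrm{med}(A_e[n]), $$

where $A_e[n] = \{\epsilon_n^2, \epsilon_{n-1}^2, \ldots, \epsilon_{n-N_w+1}^2\}$, med(·) denotes the median operation, and the parameter $\lambda_\sigma$, indicative of the weighting factor, is typically selected within the range of 0.98 to 0.99. The constant $c_2 = 1.483(1 + 5/(N_w - 1))$, and $N_w$ represents the length of the window over which the estimation is performed.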
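The thresholding behaviour described above can be sketched as follows. The sketch assumes the widely used modified Huber form for ρ(·) (quadratic below the threshold ξ, constant above it) and a median-based recursive variance estimate; only the constants 2.576, 1.483, and the 0.98 to 0.99 range come from the text, while the function names are hypothetical.

```python
import statistics

C1 = 2.576  # threshold multiplier stated in the text

def rho(e, xi):
    """Modified Huber M-estimate (assumed form): quadratic inside, flat outside."""
    return 0.5 * e * e if abs(e) < xi else 0.5 * xi * xi

def weight(e, xi):
    """Derivative of rho divided by e: 1 keeps the update, 0 skips an impulsive sample."""
    return 1.0 if abs(e) < xi else 0.0

def update_sigma2(prev_sigma2, window_sq_errors, lam=0.99):
    """Recursive robust variance estimate over a window of squared errors."""
    n_w = len(window_sq_errors)
    c2 = 1.483 * (1.0 + 5.0 / (n_w - 1))       # finite-sample correction
    return lam * prev_sigma2 + c2 * (1.0 - lam) * statistics.median(window_sq_errors)

# Example: an outlier in the window barely moves the median-based estimate.
window = [0.01, 0.02, 0.015, 25.0, 0.012, 0.018, 0.011, 0.02, 0.016, 0.013, 0.017]
sigma2 = update_sigma2(0.02, window)
xi = C1 * sigma2 ** 0.5
print(round(sigma2, 5), weight(5.0, xi), weight(0.1, xi))
```

Because the weight drops to zero for errors above ξ, a sample hit by an impulse leaves the filter coefficients unchanged instead of pulling them away from the converged solution.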
By integrating the MTLS adaptive algorithm with the multi-layered joint channel estimator discussed in Section 2, a robust linear SI canceler can be achieved [10].

Multi-Layered Generalized HP-Based Adaptive Algorithm
Nonlinear SIC can be regarded as a problem of nonlinear system identification. Before engaging in adaptive filtering, it is necessary to linearize the SI signal using the basis functions described in (4).
The nonlinear PA model first proposed in [15] employs a set of generalized HPs for its representation. The p-th order nonlinear basis function in Equation (3) is defined as

$$ \phi_p(x(t)) = \sum_{k=0}^{p-1} c_{p,k}\, |x(t)|^{2k} x(t), $$

which can be rewritten in the digital domain as

$$ \phi_p(x[n]) = \sum_{k=0}^{p-1} c_{p,k}\, |x[n]|^{2k} x[n], $$

where $\mathbf{c}_p = [c_{p,0}, \ldots, c_{p,p-1}]^T$ represents the polynomial coefficients of the p-th basis function.
When the polynomial coefficients satisfy $c_{p,p-1} = 1$ and $c_{p,k} = 0, \forall k \neq p-1$, the basis function simplifies to the standard HP basis function, i.e., $\phi_p(x[n]) = |x[n]|^{2(p-1)} x[n]$. The SI signal s[n] in (4) can then be represented with the aforementioned generalized HP basis functions.

To enhance the convergence speed of the adaptive filtering algorithm, it is also necessary to determine the coefficients $\mathbf{c}_p = [c_{p,0}, \ldots, c_{p,p-1}]^T$ of the generalized HP basis functions to ensure their orthogonality. The set of basis functions $\{\phi_p(x[n])\}_{p=1}^{P}$ is required to satisfy the following orthogonality condition:

$$ E\left[ \phi_p(x[n])\, \phi_q^{*}(x[n]) \right] = 0, \quad p \neq q. $$

For polynomial basis functions with the highest order term of 2p − 1 (p > 1), assuming orthogonality with any basis function of order less than 2p − 1, it is possible to derive p − 1 orthogonal equations, one for each lower-order basis function $\phi_k$, k = 1, ..., p − 1; these are referred to as (23). Assume $c_{p,p-1} = 1, \forall p$. When k = 1, we define $\mu_p = E[|x[n]|^p]$; then, (23) can be rewritten as

$$ \sum_{l=0}^{p-1} c_{p,l}\, \mu_{2l+2} = 0. $$

Let $\mu_{2a}^{2b} = \{\mu_{2a}, \mu_{2a+2}, \ldots, \mu_{2b}\}$, $1 \leq a \leq b$. Equation (23) can be written in matrix form in terms of a moment matrix $\mathbf{M}_p$ and an invertible lower triangular matrix $\mathbf{C}_p$, where $\bar{\mathbf{c}}_p$ is the subvector of $\mathbf{c}_p = [\bar{\mathbf{c}}_p^T, 1]^T$ that remains to be solved for.

The instantaneous basis function vector stacks the outputs of the P basis functions at time n. Integrating the nonlinear PA model with the multi-layered joint channel estimator employing the LMS algorithm yields a joint estimate of the basis-function weights and the RT channel in each layer. Furthermore, the iterative formula is the LMS recursion on this joint coefficient vector, where µ is the step size.
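As a worked illustration of the orthogonality requirement, the sketch below derives monic, mutually orthogonal odd polynomials for a real input that is uniformly distributed on [−a, a], the distribution used later in the simulations. It applies a Gram-Schmidt procedure to the signal moments, which is one standard way to meet the condition and is not claimed to be the matrix procedure of the paper; all function names are hypothetical.

```python
def uniform_even_moment(order, a):
    """E[x**order] for x uniform on [-a, a] and an even order."""
    return a ** order / (order + 1)

def inner(c_p, c_q, a):
    """E[phi_p(x) * phi_q(x)] where phi(x) = sum_k c[k] * x**(2k + 1)."""
    return sum(cp * cq * uniform_even_moment(2 * (k + l) + 2, a)
               for k, cp in enumerate(c_p)
               for l, cq in enumerate(c_q))

def orthogonal_hp_coeffs(num_basis, a):
    """Return coefficient lists c_p (with c_p[p-1] = 1) of orthogonal basis functions."""
    basis = []
    for p in range(1, num_basis + 1):
        c = [0.0] * (p - 1) + [1.0]            # start from the monomial x**(2p - 1)
        for q in basis:                        # remove the components along lower orders
            proj = inner(c, q, a) / inner(q, q, a)
            c = [ck - proj * (q[k] if k < len(q) else 0.0) for k, ck in enumerate(c)]
        basis.append(c)
    return basis

coeffs = orthogonal_hp_coeffs(3, a=2.5)
for p, c in enumerate(coeffs, start=1):
    print(p, [round(v, 4) for v in c])
```

For p = 2 this yields x³ − 3.75x, i.e., the cubic term with its correlation to the linear term removed, which is what decorrelates the inputs of the adaptive filter.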
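At the m-th layer, the received signal after digital SIC, $y_{(m+1)}[n]$, is obtained by subtracting the reconstructed nonlinear SI signal from the layer input $y_{(m)}[n]$. The M-layered adaptive nonlinear SIC algorithm is summarized in Algorithm 1 and Table 2.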
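A single layer of the nonlinear joint canceler can be sketched as below. The sketch is illustrative only: it assumes real-valued signals, two basis functions (x and x³ − 3.75x, orthogonal for an input uniform on [−2.5, 2.5]), and a normalized LMS update in place of plain LMS so that one step size suits regressors of different power. The names `basis_vector`, `joint_nonlinear_layer`, and `simulate_nonlinear`, as well as the channel values, are hypothetical.

```python
import random

def basis_vector(x, n, mem, coeffs):
    """Stack phi_p(x[n-d]) for every basis function p and delay d < mem."""
    out = []
    for c in coeffs:                            # c[k] multiplies x**(2k + 1)
        for d in range(mem):
            v = x[n - d] if n - d >= 0 else 0.0
            out.append(sum(ck * v ** (2 * k + 1) for k, ck in enumerate(c)))
    return out

def joint_nonlinear_layer(y, x, i_sig, mem, n_rt, coeffs, mu=0.5, eps=1e-9):
    """Jointly adapt the SI basis weights and RT taps; output y minus reconstructed SI."""
    n_si = len(coeffs) * mem
    c_hat = [0.0] * (n_si + n_rt)               # joint vector [w_1, ..., w_P, h]
    out = []
    for n in range(len(y)):
        u = basis_vector(x, n, mem, coeffs)
        u += [i_sig[n - d] if n - d >= 0 else 0.0 for d in range(n_rt)]
        si_hat = sum(a * b for a, b in zip(c_hat[:n_si], u[:n_si]))
        e = y[n] - sum(a * b for a, b in zip(c_hat, u))
        norm = eps + sum(v * v for v in u)
        c_hat = [a + mu * e * b / norm for a, b in zip(c_hat, u)]
        out.append(y[n] - si_hat)
    return out, c_hat[n_si:]

def simulate_nonlinear(seed, n=5000, noise_std=0.05):
    """Hypothetical scenario: SI with linear and cubic terms plus a BPSK remote signal."""
    rng = random.Random(seed)
    w1, w3, h_true = [1.0, 0.5], [0.1, 0.05], [0.9, -0.4]
    x = [rng.uniform(-2.5, 2.5) for _ in range(n)]
    i_sig = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    y = []
    for k in range(n):
        s = sum(w1[d] * x[k - d] + w3[d] * x[k - d] ** 3 for d in range(2) if k - d >= 0)
        r = sum(h_true[d] * i_sig[k - d] for d in range(2) if k - d >= 0)
        y.append(s + r + rng.gauss(0.0, noise_std))
    return y, x, i_sig, h_true

y, x, i_sig, h_true = simulate_nonlinear(seed=2)
cleaned, h_hat = joint_nonlinear_layer(y, x, i_sig, 2, 2, [[1.0], [-3.75, 1.0]])
print("estimated RT channel:", [round(v, 3) for v in h_hat])
```

Stacking several such layers, each fed with the previous layer's output, gives the M-layered structure.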

Linear SI Canceler
In the scenario described by Equation (9), the unknown vector w, with a dimensionality of L = 14, satisfies $\|\mathbf{w}\|_2 = 1$, and its elements are drawn from a zero-mean Gaussian distribution. The input signal is independently generated from a zero-mean Gaussian distribution with unit variance. The input and output noises are denoted as $v_{in} = v_b$ and $v_{out} = v_a + v_i$, respectively, where $v_a$ and $v_b$ are Gaussian-distributed with equal variances of $\sigma_a^2 = \sigma_b^2 = 0.1$. The Bernoulli-Gaussian (BG) process is utilized as an impulsive noise model [26]: the impulsive noise $v_i[n]$ is the product of a Bernoulli process b[n] and a zero-mean Gaussian process. The performance of the LMS, TLS, and MTLS algorithms is displayed in Figure 4, quantitatively evaluated via the normalized mean squared difference (NMSD), defined as $\mathrm{NMSD} = 10 \log_{10} \left( \|\hat{\mathbf{w}}_n - \mathbf{w}\|^2 / \|\mathbf{w}\|^2 \right)$. It can be observed that the LMS and TLS algorithms degrade due to the impulsive noise, whereas the MTLS algorithm remains largely unaffected by such disturbances.

We consider a baseband STAR system characterized by a 5 MHz signal bandwidth. The system operates at a sampling rate of 10 MHz, with both the local and remote transmitters employing binary phase shift keying modulation for signal transmission. The SI and RT channels have lengths specified as $N_1 = 4$ and $N_2 = 10$, respectively. Furthermore, their average energy follows the conditions $\mathrm{ISR} = E(\|\mathbf{w}\|^2 / \|\mathbf{h}\|^2)$ and $\mathrm{SNR} = E(\|s(n)\|^2 / \sigma_a^2)$. The received signal y(n) is subject to corruption from two sources: Gaussian white noise $v_a$ and random impulsive noise $v_i$. The occurrence probability of the impulsive noise is denoted as $P_i$. Additionally, the local reference signal i(n) is influenced by Gaussian noise $v_b$, where the condition $\sigma_a^2 = \sigma_b^2$ holds true. In this simulation, we employ the m-MTLS algorithm with a total of M = 2 layers, which is mainly a compromise considering the additional computational cost brought by multi-layer filters. As M increases, the effectiveness of SIC initially improves and then tends to stabilize or even decline. This is due to the propagation of estimation errors across different layers. To assess the efficacy of the proposed multi-layered joint estimator, a comparative analysis is conducted against the minimum mean square error (MMSE) estimator and the single-layer MTLS estimator, focusing on their performance in estimating the RT channel. The physical meaning of the parameters involved in this simulation is summarized in Table 3.
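The two building blocks of this simulation setup, the Bernoulli-Gaussian impulsive noise and the NMSD metric, can be sketched as follows. The impulse standard deviation `sigma_imp` and the function names are assumptions made for illustration; the text specifies only the occurrence probability and the NMSD definition.

```python
import math
import random

def bernoulli_gaussian(n, p_i, sigma_imp, rng):
    """Impulsive noise: with probability p_i a Gaussian impulse, otherwise zero."""
    return [rng.gauss(0.0, sigma_imp) if rng.random() < p_i else 0.0
            for _ in range(n)]

def nmsd_db(w_hat, w):
    """Normalized mean squared difference in dB: 10 log10(||w_hat - w||^2 / ||w||^2)."""
    num = sum((a - b) ** 2 for a, b in zip(w_hat, w))
    den = sum(b * b for b in w)
    return 10.0 * math.log10(num / den)

rng = random.Random(0)
v_i = bernoulli_gaussian(10000, p_i=0.01, sigma_imp=10.0, rng=rng)
hits = sum(1 for v in v_i if v != 0.0)
print("fraction of impulsive samples:", hits / len(v_i))
print("NMSD of a 10% tap error:", round(nmsd_db([0.9, 0.0], [1.0, 0.0]), 2), "dB")
```

Adding `v_i` to the Gaussian noise at the receiver reproduces the kind of sparse, high-amplitude disturbance that the M-estimate threshold is designed to reject.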

Figure 5 illustrates the performance curves of several algorithms in terms of NMSD as the SNR varies across different noise environments, with ISR = 20 dB. Compared to the MMSE estimator, the proposed method exhibits higher robustness. The MMSE estimator shows a noticeable decline in performance at certain sampling points when affected by impulsive noise. However, thanks to the M-estimate function, the m-MTLS estimator remains virtually unaffected. Furthermore, as the SNR increases, the NMSD of the m-MTLS estimator gradually decreases, whereas the MMSE estimator experiences limited gain due to convergence being disrupted by interference from the signal of interest.

Nonlinear SI Canceler
In this section, we use Saleh's PA model [27] to simulate the nonlinear distortion, and the SI signal received after the ADC is generated with this model, where γ = 3 and β = 0.03 in this simulation and the overall memory length of the response h[n] is set to L = 5. We assess the performance in terms of the mean squared error (MSE). The performance of the proposed algorithm and of the commonly used LMS algorithms based on HPs is evaluated, and the results are shown in Figure 6. In this simulation, we assume the transmitted data follow a uniform distribution in the range [−2.5, 2.5] and v[n] is zero-mean Gaussian noise with variance 1.
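A possible way to generate such a nonlinear SI signal is sketched below. It assumes that γ and β play the roles of the small-signal gain and the compression factor in the AM/AM characteristic of Saleh's model, γx / (1 + βx²), applied to a real-valued signal, with AM/PM conversion omitted; this mapping, the example channel `h`, and the function names are assumptions, since the exact expression used in the simulation is not given here.

```python
def saleh_amam(x, gamma=3.0, beta=0.03):
    """Saleh-type AM/AM compression for a real sample (assumed parameter roles)."""
    return gamma * x / (1.0 + beta * x * x)

def si_after_pa(x, h, gamma=3.0, beta=0.03):
    """Pass the PA output through an FIR response h (memory length len(h))."""
    pa = [saleh_amam(v, gamma, beta) for v in x]
    return [sum(h[k] * pa[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

x = [-2.5, -1.0, 0.0, 1.0, 2.5]
h = [1.0, 0.5, 0.25, 0.1, 0.05]            # hypothetical response with L = 5 taps
s = si_after_pa(x, h)
linear = si_after_pa(x, h, beta=0.0)       # same gain without compression
print("MSE between compressed and linear SI:", round(mse(s, linear), 4))
```

The gap between the compressed and the purely linear output is the component that a linear canceler cannot reconstruct, which is why the basis functions of Section 3 are needed.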
As shown in Figure 6, compared to existing algorithms based on standard HPs, the proposed algorithm achieves faster convergence while maintaining equivalent MSE performance. In the nonlinear SIC scenario, we simulate data transmission using a uniform distribution within the range of [−2.5, 2.5], and the remote signal i[n] ∈ {−1, 1}. The nonlinear SI canceler is the one described in Section 3.2. Considering both computational complexity and algorithm performance, the joint SIC algorithm has a two-layer filtering structure with M = 2. The basis functions for all algorithms are of order up to five.
Figures 7 and 8 compare the power spectra of different signals at the receiver for ISR = 20 dB and ISR = 40 dB, respectively. In both cases, the signal after SIC more closely approximates the remote signal when using the proposed SIC algorithm than when using the method based on HP-LMS.
Figure 9 shows the NMSD and bit error rate (BER) performance of the proposed algorithm and other SIC algorithms. For symbol detection at the receiver, a decision feedback equalizer (DFE) is employed, with both the feedforward and feedback filters having lengths of 15. The DFE employs a truncated version of the transmitted signal as the training signal, with tap weights being trained exclusively during the first block. In this result, as expected, the joint SIC algorithm with orthogonal generalized HP basis functions exhibits superior performance.

Conclusions
In this paper, we proposed two adaptive algorithms for linear and nonlinear SIC, based on a multi-layered joint channel estimator structure. In the case of linear SI, given that both the independent and dependent variables are subject to measurement errors, and considering the impact of impulsive noise present in wireless environments on the convergence process of adaptive algorithms, the m-MTLS algorithm is proposed to enhance robustness. For the nonlinear distortion generated by the PA, we combine the joint channel estimator and generalized HP basis function to propose a joint SIC method that outperforms current state-of-the-art SIC algorithms in scenarios of both high and low ISR, as well as high SNR. However, the complexity of this method is higher, mainly due to the increase in the number of basis function coefficients and the length of the filter during joint estimation, as well as the number of layers M, which also contributes to the increased complexity.

Figure 1. A block diagram illustrating the STAR system and various forms of SIC.

Figure 2. The signals propagating at the different stages of the analyzed STAR system.

Figure 3. The multi-layered structure of the joint channel estimator.

Algorithm 1. Adaptive algorithm for nonlinear SIC
Input: received signal y[n], known signals x[n] and i[n], basis order P, moment collect length N_m, and basis function generation interval N_int
Output: signal after SIC r[n] and RT channel estimation ĥ_(m)^T[n]
Initialization: n_s = 1 and h_joint(m)^T[n] = 0
1: for n = 1, 2, . . ., N do
2: ... ] = y_(m+1)[n]
20: end for

Figure 5. The performance of different estimators under different impulsive noise probabilities: (a) P_i = 0.01; (b) P_i = 0.05.

Figure 6. MSE performance of different SIC algorithms with the same step size, when P = 5.
BER performance of different algorithms.

Table 1. Table of acronyms.

Table 2. Table of notations.

Table 3. Table of parameters.