1. Introduction
Due to limited bandwidth, severe multipath effects, and Doppler shifts, the underwater acoustic (UWA) channel is one of the most challenging communication media [1,2,3]. To mitigate multipath interference and enhance data rates in the band-limited UWA channel, multiple-input multiple-output (MIMO) technology combined with orthogonal frequency-division multiplexing (OFDM) modulation has been adopted for UWA communication systems [4,5,6,7,8]. However, the orthogonality among subcarriers may be destroyed by the Doppler shift-induced carrier frequency offset (CFO), leading to inter-carrier interference (ICI) and degraded bit error rate (BER) performance [7,9]. Therefore, it is essential to investigate CFO estimation technologies for UWA MIMO-OFDM systems.
Although numerous CFO estimation techniques for MIMO-OFDM systems have been extensively studied in wireless communications [10,11,12,13], many of them cannot be directly applied to UWA communications due to fundamental differences in channel characteristics. In wireless communication systems, CFO primarily originates from transceiver oscillator mismatches and is generally assumed to be constant within a communication frame. Consequently, most CFO estimation algorithms for wireless MIMO-OFDM systems are derived under this assumption [10,11]. In contrast, in UWA communication systems, CFO is mainly caused by residual Doppler frequency shifts. Owing to the time-varying nature of the ocean environment and relative motion between transceivers, this CFO usually varies within a frame. Therefore, the CFO estimation algorithms developed for wireless communication systems under the constant-CFO assumption cannot be directly applied to UWA communication systems. Furthermore, wireless communication systems offer considerably larger bandwidth than UWA communication systems, so many wireless MIMO-OFDM CFO estimators can afford additional training sequences or known OFDM blocks [12,13]. In contrast, the much lower frequency of acoustic waves (compared to electromagnetic waves) constrains the available bandwidth of UWA communication systems, which makes such training overhead costly.
Given the unique characteristics of UWA channels, some CFO estimation methods have been developed specifically for UWA single-input single-output (SISO) OFDM communication systems [9,14,15]. For instance, a low-complexity CFO estimator based on minimizing null subcarrier energy is proposed in [14]. Nevertheless, this approach inherently sacrifices spectral efficiency due to the use of null subcarriers. In contrast, the cyclic prefix (CP) autocorrelation-based estimator in [15] preserves spectral efficiency, yet its accuracy deteriorates in severe multipath channels. In [9], an equally spaced, identical pilot pattern is employed to construct a time-locality property, from which a closed-form CFO estimator with low computational complexity and high accuracy is derived. However, the adopted pilot pattern induces a high peak-to-average power ratio (PAPR).
When it comes to MIMO-OFDM systems, directly applying most of these UWA SISO-OFDM CFO estimation algorithms to UWA MIMO-OFDM systems is challenging, primarily due to co-channel interference (CCI). The algorithm proposed in [14], based on the criterion of minimizing the energy on null subcarriers, is less susceptible to CCI in UWA MIMO-OFDM communication systems. Consequently, this algorithm is readily extended to UWA MIMO-OFDM systems in [7]. The extended algorithm retains advantages such as low computational complexity and high accuracy, but it also inherits a significant drawback from the original algorithm: a sacrifice in spectral efficiency due to the use of null subcarriers. In [8], a CFO estimation scheme based on pilot subcarriers is proposed. This scheme operates on the principle of minimizing the Euclidean distance between tentatively equalized pilot symbols and their transmitted counterparts. By eliminating the need for null subcarriers, this scheme achieves higher spectral efficiency than the approach in [7]. However, a critical limitation of this scheme is its reliance on the least squares (LS) algorithm for tentative channel estimation. In scenarios with relatively long channels or a relatively large number of transmitting transducers, the LS channel estimation may become ill-posed because the problem is underdetermined. This results in inaccurate tentative channel estimation and, consequently, severe degradation in CFO estimation performance.
To address this issue, this paper proposes a CFO estimation scheme that leverages Expectation-Maximization Sparse Bayesian Learning (EM-SBL) for tentative channel estimation. This scheme maintains high tentative channel estimation accuracy, thereby ensuring good CFO estimation performance even under ill-posed conditions, whose probability of occurrence increases with the channel length and the number of transducers. However, this scheme necessitates an exhaustive search over a large candidate CFO set, with each candidate point requiring a computationally intensive EM-SBL execution. Consequently, the overall complexity becomes high and grows rapidly with both the channel length and the number of transducers. To reduce the computational complexity, we introduce a lower-complexity Sparse Bayesian Learning (SBL) channel estimator using Variational Bayesian Inference (VBI), tailored for UWA MIMO-OFDM systems. By employing this VBI-based SBL algorithm as the tentative channel estimator, we finally propose a refined CFO estimation scheme with lower computational complexity.
The rest of the paper is organized as follows. Section 2 introduces the UWA MIMO-OFDM system model and formulates the problem. In Section 3, the proposed CFO estimation scheme based on tentative SBL channel estimation is developed first, followed by its refined version with lower computational complexity. A complexity analysis is also provided. Section 4 presents the simulation results to verify the advantages of the proposed schemes. Conclusions are drawn in Section 5.
2. System Model
Consider a co-located UWA MIMO-OFDM communication system [4], comprising 
 transmitting transducers and 
 receiving hydrophones, as shown in Figure 1. In such a co-located UWA MIMO-OFDM system, the transmitted data streams from all transmitting transducers can be considered quasi-time-synchronous [16], and the slight timing offset caused by minor timing asynchrony can be absorbed into the channel impulse response (CIR) [12]. Therefore, perfect timing synchronization is assumed. The cyclic prefix (CP) technique is employed to avoid inter-symbol interference (ISI), with the CP length denoted as 
. Each OFDM block contains 
N subcarriers. Among these, 
 subcarriers are allocated for pilot symbols, with position indices 
, and 
 subcarriers are allocated for data symbols, with position indices 
, satisfying 
. All transmitting transducers adopt the same subcarrier mapping scheme, employing an equi-spaced pilot pattern.
In UWA MIMO-OFDM communication systems, a block-by-block processing approach is commonly adopted for handling the received data. The processing method for each OFDM block remains identical [14]. Therefore, without loss of generality, we focus on a single OFDM block. The vector of symbols to be transmitted by the 
t-th transmitting transducer within an OFDM block can be represented as 
. Let 
 denote the normalized discrete Fourier transform (DFT) matrix of size 
, whose element at position 
 is 
. The corresponding time-domain signal vector of 
 is then given by 
. After adding the cyclic prefix (CP), the final time-domain signal vector ready for transmission is expressed as 
.
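The transmitter-side operations described above can be sketched in NumPy as follows. This is only an illustrative sketch: it assumes the common convention in which the time-domain block is obtained by applying the conjugate transpose of the normalized DFT matrix to the symbol vector, and the names `dft_matrix` and `cp_insertion_matrix` as well as the sizes `N = 8` and `N_CP = 2` are hypothetical choices rather than values from this paper.

```python
import numpy as np

def dft_matrix(n):
    """Normalized (unitary) DFT matrix with entries exp(-j*2*pi*p*q/n)/sqrt(n)."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

def cp_insertion_matrix(n, n_cp):
    """(n + n_cp) x n matrix that prepends the last n_cp samples of a block."""
    eye = np.eye(n)
    return np.vstack([eye[n - n_cp:], eye])

N, N_CP = 8, 2  # illustrative sizes, not the paper's parameters
rng = np.random.default_rng(0)
# Unit-energy QPSK symbols for one transmitting transducer
s = (rng.choice([-1, 1], N) + 1j * rng.choice([-1, 1], N)) / np.sqrt(2)

F = dft_matrix(N)
x = F.conj().T @ s                        # time-domain block (unitary IDFT)
x_cp = cp_insertion_matrix(N, N_CP) @ x   # block with cyclic prefix prepended
```

Because the DFT matrix is unitary, the transform preserves the energy of the symbol vector, and the first `N_CP` samples of `x_cp` are a copy of the tail of `x`.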
Due to relative motion between the transmitter and receiver, non-uniform Doppler shifts exist in UWA communication systems [14], which should be compensated via a resampling operation [17]. The resampling factor can be estimated using the preamble and postamble signals located in the header and footer of the communication frame. After performing non-uniform Doppler shift compensation, the baseband received signal at time 
k for the 
r-th receiving hydrophone can be expressed as [
8]
      where 
 denotes the normalized CFO for the transmission data stream from the 
t-th transmitting transducer to the 
r-th receiving hydrophone, defined as the ratio of the actual CFO to the subcarrier spacing 
. 
 represents the CIR between the 
t-th transmitting transducer and the 
r-th receiving hydrophone, where 
 is the channel length. 
 denotes the additive white Gaussian noise (AWGN) with variance 
.
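As a rough illustration of the received-signal model described above, the following sketch superimposes the multipath-filtered streams of all transmitting transducers at one hydrophone, applies a per-stream CFO phase ramp, and adds complex AWGN. The exact form of Equation (1) is not reproduced here, so the structure below (linear convolution with the CIR, then a phase rotation of `2*pi*eps*k/n_fft`) is an assumption based on the verbal description. The function name `received_block` and all sizes are hypothetical, and interference from the preceding block is ignored.

```python
import numpy as np

def received_block(tx_blocks, cirs, eps, n_fft, noise_std=0.0, rng=None):
    """Baseband samples at one hydrophone (illustrative model).

    tx_blocks: (n_tx, n_samples) CP-extended transmit blocks
    cirs:      (n_tx, chan_len) impulse responses towards this hydrophone
    eps:       (n_tx,) CFOs normalized to the subcarrier spacing
    """
    rng = np.random.default_rng() if rng is None else rng
    n_samples = tx_blocks.shape[1]
    k = np.arange(n_samples)
    r = np.zeros(n_samples, dtype=complex)
    for x_t, h_t, eps_t in zip(tx_blocks, cirs, eps):
        multipath = np.convolve(x_t, h_t)[:n_samples]            # CIR filtering
        r += np.exp(2j * np.pi * eps_t * k / n_fft) * multipath  # CFO phase ramp
    noise = rng.standard_normal(n_samples) + 1j * rng.standard_normal(n_samples)
    return r + noise_std * noise / np.sqrt(2)                    # complex AWGN

rng = np.random.default_rng(0)
tx = rng.standard_normal((2, 20)) + 1j * rng.standard_normal((2, 20))
cirs = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
clean = received_block(tx, cirs, eps=[0.0, 0.0], n_fft=16)
shifted = received_block(tx[:1], cirs[:1], eps=[0.2], n_fft=16)
```

With zero CFO and no noise the output reduces to the sum of the convolved streams, while a nonzero CFO on a single stream changes only the phase of each sample, not its magnitude.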
Equation (1) can be rewritten in compact matrix form. The time-domain received signal vector of the 
r-th receiving hydrophone 
 can be expressed as
      where 
 denotes the additive white Gaussian noise (AWGN) at the 
r-th receiving hydrophone. The covariance matrix of 
 satisfies 
. Here, 
 is the CP insertion matrix, 
 is the time-domain channel matrix, and 
 is the phase rotation matrix induced by the CFO 
:
	  where
        
In subsequent derivations, it is assumed that the entire MIMO-OFDM system experiences a common CFO [7,8]. Therefore, the subscripts 
t and 
r for the CFO matrix are omitted. For the time-domain signal received by the 
r-th receiving hydrophone, the time-domain signal obtained after tentative CFO compensation 
 can be expressed as
      where 
 is the tentatively compensated CFO value. Defining the CP removal matrix as 
, the time-domain received signal vector 
 after CP removal can be expressed as
      where 
 denotes the frequency-domain channel response, and 
 represents the phase rotation induced by residual CFO after tentative compensation:
The frequency-domain received signal vector 
 can be expressed as
      where 
 reflects the ICI caused by the CFO, and 
 represents the frequency-domain noise:
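If the residual CFO is nonzero (i.e., the tentative compensation value differs from the actual CFO), subcarrier orthogonality is destroyed, causing energy leakage between subcarriers. Conversely, if the residual CFO is zero (i.e., the tentative compensation value equals the actual CFO), orthogonality is preserved without energy leakage.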
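The leakage effect can be checked numerically. The sketch below builds an ICI matrix as the normalized DFT matrix times a diagonal phase-rotation matrix times the inverse DFT matrix, and measures the fraction of energy that falls off the main diagonal. The construction and the names `ici_matrix` and `leakage_ratio` are assumptions made for illustration; the paper's own definition of the ICI matrix is not reproduced here.

```python
import numpy as np

def ici_matrix(n, eps_residual):
    """F * diag(exp(j*2*pi*eps*k/n)) * F^H for a normalized DFT matrix F."""
    idx = np.arange(n)
    F = np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)
    D = np.diag(np.exp(2j * np.pi * eps_residual * idx / n))
    return F @ D @ F.conj().T

def leakage_ratio(n, eps_residual):
    """Fraction of the matrix energy lying outside the main diagonal."""
    c = ici_matrix(n, eps_residual)
    total = np.sum(np.abs(c) ** 2)
    on_diag = np.sum(np.abs(np.diag(c)) ** 2)
    return float((total - on_diag) / total)

no_cfo = leakage_ratio(64, 0.0)      # residual CFO fully removed
small_cfo = leakage_ratio(64, 0.01)  # small residual CFO
large_cfo = leakage_ratio(64, 0.1)   # larger residual CFO
```

With zero residual CFO the matrix reduces to the identity, so the leakage vanishes up to rounding error; the leakage then grows with the magnitude of the residual CFO.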
For derivation convenience, assume that the tentative CFO compensation is near-perfect. Thus, the frequency-domain received signal vector 
 can be simplified as
Define the pilot symbol extraction matrix 
, which extracts 
 pilot symbols from the 
N symbols of the frequency-domain received signal vector 
:
Then, the received frequency-domain signal vector at the pilot subcarrier positions, denoted 
, can be expressed as
      where 
, 
, 
. Based on Equation (
7), the channel can be estimated using the least squares method, i.e.,
Based on Equation (
6), the symbol at the position 
 of the frequency-domain received signal vector 
 can be expressed as
Stack the 
 at different receiving hydrophones into a column vector:
To emphasize that this system model is obtained after tentative CFO compensation using 
, we rewrite Equation (
10) as 
. Based on Equation (
10), the LS equalization result of the pilot symbols can be expressed as
Using the LS equalization results of the pilot symbols, we can construct a cost function for estimating the CFO:
If the CFO is compensated perfectly, i.e., 
, 
 will be relatively small. However, if the CFO is not perfectly compensated, i.e., 
, the residual CFO will degrade the accuracy of the channel estimation. This will lead to inaccurate equalization results, resulting in a relatively large 
. Therefore, the CFO estimation problem can be modeled as the following optimization problem:
The above optimization problem can be solved with relatively low computational complexity via a “coarse one-dimensional search followed by the bisection method” [8]. The accuracy of the channel estimation results directly affects the performance of LS equalization. Therefore, accurate channel estimation is crucial for the CFO estimation algorithm based on tentative CFO compensation. However, the LS channel estimation algorithm adopted in the CFO estimation scheme proposed in [8] may fail when the number of pilots is limited [5], as detailed below.
In UWA MIMO-OFDM communication systems with equally spaced non-orthogonal pilots, the accuracy of the LS channel estimation algorithm relies on two prerequisites: (1) the number of pilot subcarriers must be at least the number of transmitting transducers multiplied by the channel length, to prevent the problem from becoming underdetermined; (2) the number of pilot subcarriers must be at least the channel length, to avoid time-domain aliasing in channel estimation when using equally spaced pilot sampling in the frequency domain. In summary, the LS channel estimation algorithm requires the first, stricter condition to ensure estimation accuracy. Meeting this condition becomes increasingly difficult as the channel length and the number of transmitting transducers increase. In contrast, sparse reconstruction-based channel estimation algorithms, which exploit the inherent channel sparsity, depend on only one prerequisite: the number of pilot subcarriers must be at least the channel length, to avoid time-domain aliasing in channel estimation when using equally spaced pilot sampling in the frequency domain. Therefore, sparse reconstruction-based channel estimation algorithms require only a fraction of the pilots required by the LS algorithm, equal to the reciprocal of the number of transmitting transducers.
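Both prerequisites can be illustrated with a small numerical sketch. It stacks the pilot-weighted partial DFT matrices of all transmitting transducers into one measurement matrix, solves the LS problem for a short channel, and then shows that a longer channel yields more unknowns than pilot equations and that tap indices separated by the number of pilots become indistinguishable. The sizes (64 subcarriers, 16 pilots, 2 transducers), the random QPSK pilots, and the function names are assumptions for illustration only.

```python
import numpy as np

def pilot_measurement_matrix(pilots, pilot_idx, n_fft, chan_len):
    """pilots: (n_tx, n_p) pilot symbols; returns A with shape (n_p, n_tx*chan_len)."""
    taps = np.arange(chan_len)
    W_p = np.exp(-2j * np.pi * np.outer(pilot_idx, taps) / n_fft)  # partial DFT
    return np.hstack([p[:, None] * W_p for p in pilots])

def ls_channel_estimate(y_p, A):
    """Least squares estimate of the stacked channel taps."""
    return np.linalg.lstsq(A, y_p, rcond=None)[0]

N_FFT, N_P, N_TX = 64, 16, 2
pilot_idx = np.arange(0, N_FFT, N_FFT // N_P)       # equi-spaced pilots
rng = np.random.default_rng(0)
pilots = np.exp(1j * np.pi / 2 * rng.integers(0, 4, (N_TX, N_P)))

# Short channel: 2 * 4 = 8 unknowns, 16 pilot equations
A_short = pilot_measurement_matrix(pilots, pilot_idx, N_FFT, chan_len=4)
h_true = rng.standard_normal(8) + 1j * rng.standard_normal(8)
err_short = np.linalg.norm(ls_channel_estimate(A_short @ h_true, A_short) - h_true)

# Long channel: 2 * 12 = 24 unknowns, still only 16 pilot equations
A_long = pilot_measurement_matrix(pilots, pilot_idx, N_FFT, chan_len=12)
rank_long = int(np.linalg.matrix_rank(A_long))      # cannot exceed N_P

# Aliasing: with equi-spaced pilots, taps l and l + N_P give identical columns
W_alias = pilot_measurement_matrix(np.ones((1, N_P)), pilot_idx, N_FFT, N_P + 1)
aliased = bool(np.allclose(W_alias[:, 0], W_alias[:, N_P]))
```

In the long-channel case the rank of the measurement matrix is bounded by the number of pilots, so the LS solution is no longer unique; a sparse estimator can still succeed there if only a few taps are nonzero.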
4. Simulation Results
In this section, the effectiveness of our proposed schemes is demonstrated through numerical simulations. First, the schemes are verified using simulation channels generated by a statistical channel model. They are then further validated using channels generated by Bellhop based on a typical deep-sea sound profile. The numerical simulations were conducted using Python 3.9.
We employ a statistical channel model commonly used in underwater acoustic communications [4,8,19,24]. Channels between each transmitter–receiver pair are mutually independent, with each channel comprising 15 randomly generated multipath components. The inter-path delays follow an exponential distribution with a mean of 2 ms, while path amplitudes follow a Rayleigh distribution. For each signal-to-noise ratio (SNR), 100 Monte Carlo trials are conducted. In each Monte Carlo trial, a new set of MIMO statistical channels is generated. A typical channel realization is shown in Figure 2.
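A minimal sketch of how one such channel realization could be drawn is given below. The number of paths, the exponential inter-path delays with a 2 ms mean, and the Rayleigh amplitudes follow the description above; the sampling rate `fs_hz`, the unit Rayleigh scale, placing the first path at zero delay, and the name `random_uwa_cir` are assumptions, since those details are not stated here.

```python
import numpy as np

def random_uwa_cir(rng, n_paths=15, mean_gap_s=2e-3, fs_hz=8000.0):
    """Return (delays in seconds, amplitudes, sampled CIR) for one Tx-Rx pair."""
    gaps = rng.exponential(mean_gap_s, n_paths - 1)     # inter-path delays
    delays = np.concatenate(([0.0], np.cumsum(gaps)))   # first path at t = 0
    amps = rng.rayleigh(scale=1.0, size=n_paths)        # Rayleigh amplitudes
    taps = np.round(delays * fs_hz).astype(int)         # nearest sample index
    cir = np.zeros(taps[-1] + 1)
    np.add.at(cir, taps, amps)                          # paths may share a tap
    return delays, amps, cir

# Independent channels for every transmitter-receiver pair of a 2 x 2 system
rng = np.random.default_rng(0)
channels = [[random_uwa_cir(rng) for _ in range(2)] for _ in range(2)]
delays, amps, cir = channels[0][0]
```

In a Monte Carlo loop, a fresh set of channels would be drawn in every trial, matching the procedure described above.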
The system simulation parameters are presented in Table 2. We adopt a non-orthogonal pilot pattern, in which different transmit data streams share the same pilot subcarrier positions. Compared with orthogonal pilot patterns, the non-orthogonal pattern provides more available pilot subcarriers for each transmit data stream, which helps improve parameter estimation performance [4,8]. Specifically, the system employs the non-orthogonal pilot pattern proposed in [8]. This pattern is generated by creating 
 groups of m-sequences of length 511, truncating each m-sequence to a length of 
, and assigning the resulting 
 groups of truncated sequences to the 
 transmit data streams as their pilot symbols. In addition, channel coding is omitted from the simulations so that the performance differences among the CFO estimation schemes are shown more clearly.
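The pilot construction described above can be sketched as follows. The length-511 m-sequence is generated with a 9-stage linear feedback shift register; the feedback taps (9, 5), the BPSK mapping, the use of cyclic shifts of one base sequence to obtain a different sequence per stream, and the pilot count of 128 are all assumptions made for illustration, since the generator details and the truncation length are not given here.

```python
import numpy as np

def m_sequence(taps=(9, 5), seed=1):
    """One period (2**9 - 1 = 511 bits) of a Fibonacci LFSR with the given taps."""
    degree = max(taps)
    state = [(seed >> i) & 1 for i in range(degree)]   # state[0] is stage 1
    bits = []
    for _ in range(2 ** degree - 1):
        bits.append(state[-1])                         # output of the last stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]                   # XOR of the tapped stages
        state = [feedback] + state[:-1]                # shift by one stage
    return np.array(bits)

def pilot_sequences(n_streams, n_pilots):
    """One truncated, BPSK-mapped sequence per transmit data stream."""
    base = m_sequence()
    shift = len(base) // n_streams                     # hypothetical shift rule
    rows = [1.0 - 2.0 * np.roll(base, -t * shift)[:n_pilots]
            for t in range(n_streams)]
    return np.vstack(rows)

seq = m_sequence()
pilots = pilot_sequences(n_streams=2, n_pilots=128)
```

A maximal-length sequence of this degree contains 256 ones and 255 zeros, and its BPSK-mapped version has a periodic autocorrelation of -1 at every nonzero shift, which is why such sequences are attractive as pilots.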
Based on the aforementioned statistical channel model and system parameters, we simulate two CFO estimation schemes proposed in this paper: EMSBL-LS-CFOE and VBISBL-LS-CFOE. We also simulate the LS-LS-CFOE scheme proposed in [8]. Additionally, the OMP-LS-CFOE scheme is simulated, which is derived from the LS-LS-CFOE scheme in [8] by replacing the tentative LS channel estimation algorithm with the OMP channel estimation algorithm [18]. To finally calculate the BER performance, each scheme’s tentative channel estimation result corresponding to the final CFO estimation result is provided to the LS equalizer to estimate the transmitted data symbols. The simulation results are shown in Figure 3, Figure 4, Figure 5 and Figure 6.
As shown in 
Figure 3, when the channel length satisfies 
, i.e., when the channel is long or the number of transducers is large, the performance of the LS channel estimation algorithm degrades significantly, as shown in 
Figure 4, leading to deteriorated CFO estimation performance of the LS-LS-CFOE scheme. The OMP-LS-CFOE scheme introduces the classical sparse reconstruction algorithm OMP for tentative channel estimation. Although OMP exploits prior sparsity to overcome the limitations of LS, the mismatch between the prior sparsity and the actual sparsity results in unsatisfactory tentative channel estimation performance. In contrast, the tentative EM-SBL channel estimation algorithm in our proposed EMSBL-LS-CFOE scheme adapts to the actual sparsity of the channel, achieving desirable tentative channel estimation performance, which consequently leads to improved CFO estimation performance. With lower computational complexity, our proposed VBISBL-LS-CFOE scheme achieves comparable or even slightly better CFO estimation performance than the EMSBL-LS-CFOE scheme. This is because the VBI-SBL channel estimation technique decomposes the high-dimensional channel into 
 low-dimensional subchannels, which are then estimated separately. This approach not only reduces computational complexity but also better adapts to the distinct sparsity of each subchannel. As shown in 
Figure 5, the demodulation bit error rates of communication systems based on our proposed EMSBL-LS-CFOE and VBISBL-LS-CFOE schemes are, across the entire SNR range, close to that of the genie-aided system with perfect CFO estimation and perfect channel state information. Even without specialized optimization of other components in a communication system, our results demonstrate promising BER performance. In this 2 × 2 MIMO system configuration, a satisfactory BER on the order of 
 can be achieved at a relatively high SNR, indicating that the CFO has been estimated and compensated effectively. Furthermore, the constellation diagrams of the equalized symbols for different CFO estimation schemes, presented in 
Figure 6, provide intuitive visual confirmation of the conclusions drawn from the BER performance analysis in 
Figure 5. In addition, all four simulated CFO estimation schemes have the same data rate:
Additional simulations are conducted to demonstrate the performance of our proposed schemes under different numbers of pilot subcarriers and different modulation orders. As shown in Figure 7, Figure 8 and Figure 9, the CFO estimation performance of our proposed schemes improves as the number of pilot subcarriers increases. However, when the number of pilot subcarriers is too low, the CFO estimation performance degrades due to a lack of measurement data required for the SBL algorithms. As shown in Figure 10, Figure 11 and Figure 12, the proposed CFO estimation schemes remain robust under higher-order modulation. This inherent advantage stems from the fact that our CFO estimation relies solely on the tentatively equalized pilot symbols, which in turn depend on the tentative channel estimation. The high-accuracy tentative channel estimation techniques employed in our proposed schemes use only the pilot symbols on pilot subcarriers and are therefore independent of the modulation order of the data subcarriers. Consequently, the CFO estimation performance remains robust regardless of the modulation order. However, as observed in the BER curves of Figure 12, the 16-QAM case exhibits a slightly higher BER than the QPSK case at the same SNR. This is expected and attributable to the reduced Euclidean distance between constellation points in higher-order modulations, which increases the difficulty of equalization and symbol detection, rather than being a limitation of our CFO estimation schemes.
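The reduction in Euclidean distance can be quantified with a short calculation. The sketch below builds square QAM constellations normalized to unit average symbol energy and computes their minimum distance; the normalization and the helper names are assumptions for illustration, not details taken from the simulation setup.

```python
import numpy as np

def unit_energy_qam(m_side):
    """Square QAM with m_side levels per axis, scaled to unit average energy."""
    levels = np.arange(-(m_side - 1), m_side, 2)
    points = (levels[:, None] + 1j * levels[None, :]).ravel()
    return points / np.sqrt(np.mean(np.abs(points) ** 2))

def min_distance(points):
    """Smallest Euclidean distance between two distinct constellation points."""
    diff = np.abs(points[:, None] - points[None, :])
    return float(np.min(diff[diff > 0]))

d_qpsk = min_distance(unit_energy_qam(2))    # sqrt(2), about 1.414
d_16qam = min_distance(unit_energy_qam(4))   # 2/sqrt(10), about 0.632
```

At equal average symbol energy the 16-QAM points are therefore less than half as far apart as the QPSK points, so the same residual equalization error is more likely to cause a decision error.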
To demonstrate the performance of our proposed schemes under asymmetric MIMO configurations, we further consider a MIMO system where the number of transmitting transducers differs from the number of receiving hydrophones, i.e., 
, 
. The corresponding simulation results are shown in 
Figure 13, Figure 14 and Figure 15. As observed from Figure 13, the CFO estimation performance for the 2 × 4 MIMO configuration is superior to that of the 2 × 2 MIMO configuration at the same SNR. This improvement is attributed to the spatial diversity gain, which enhances the performance of the tentative equalization, thereby leading to more accurate CFO estimation. Furthermore, the improved CFO estimation accuracy results in better channel estimation performance for the 2 × 4 MIMO configuration than for the 2 × 2 MIMO configuration, as shown in Figure 14. Finally, Figure 15 demonstrates that the BER performance of the 2 × 4 MIMO configuration is significantly better than that of the 2 × 2 MIMO configuration under the same SNR. This overall gain stems from the combined benefits of improved CFO estimation, enhanced channel estimation, and the direct boost to equalization performance provided by the spatial diversity gain.
To demonstrate the performance of our proposed schemes in different MIMO system configurations, we further consider a MIMO system with more transmitting transducers and receiving hydrophones, i.e., 
. The corresponding simulation results are shown in Figure 16, Figure 17 and Figure 18. As can be observed from these figures, our proposed schemes maintain satisfactory CFO estimation performance as the number of transmitting transducers increases.
To validate our proposed schemes in different types of UWA channels, we further simulated a deep-sea channel using Bellhop [25,26]. The deep-sea sound speed profile (SSP) and the simulated UWA channels generated based on this SSP are shown in Figure 19 and Figure 20, respectively. The simulation results of our proposed schemes under these Bellhop-simulated channels are shown in Figure 21, Figure 22 and Figure 23. It can be seen that our proposed schemes can maintain satisfactory CFO estimation performance, demonstrating their robustness to different types of UWA channels.
Finally, to directly compare the computational complexity, the number of complex multiplication operations required by the proposed EMSBL-LS-CFOE and VBISBL-LS-CFOE schemes is calculated under varying numbers of transmitting transducers and channel lengths. As shown in Figure 24, the proposed VBISBL-LS-CFOE scheme consistently requires fewer operations than the EMSBL-LS-CFOE scheme. Moreover, this computational complexity advantage widens as the number of transmitting transducers and the channel length increase.