Toward Intelligent Underwater Acoustic Systems: Systematic Insights into Channel Estimation and Modulation Methods

Imran A. Tasadduq; Muhammad Rashid

doi:10.3390/electronics14152953

Computer and Network Engineering Department, Umm Al-Qura University, Makkah 21955, Saudi Arabia

* Authors to whom correspondence should be addressed.

Electronics 2025, 14(15), 2953; https://doi.org/10.3390/electronics14152953

This article belongs to the Section Artificial Intelligence

Abstract

Underwater acoustic (UWA) communication supports many critical applications but still faces several physical-layer signal processing challenges. In response, recent advances in machine learning (ML) and deep learning (DL) offer promising solutions to improve signal detection, modulation adaptability, and classification accuracy. These developments highlight the need for a systematic evaluation to compare various ML/DL models and assess their performance across diverse underwater conditions. However, most existing reviews on ML/DL-based UWA communication focus on isolated approaches rather than integrated system-level perspectives, which limits cross-domain insights and reduces their relevance to practical underwater deployments. Consequently, this systematic literature review (SLR) synthesizes 43 studies (2020–2025) on ML and DL approaches for UWA communication, covering channel estimation, adaptive modulation, and modulation recognition across both single- and multi-carrier systems. The findings reveal that models such as convolutional neural networks (CNNs), long short-term memory networks (LSTMs), and generative adversarial networks (GANs) enhance channel estimation performance, achieving error reductions and bit error rate (BER) gains ranging from $10^{-3}$ to $10^{-6}$. Adaptive modulation techniques incorporating support vector machines (SVMs), CNNs, and reinforcement learning (RL) attain classification accuracies exceeding 98% and throughput improvements of up to 25%. For modulation recognition, architectures like sequence CNNs, residual networks, and hybrid convolutional–recurrent models achieve up to 99.38% accuracy with latency below 10 ms. These performance metrics underscore the viability of ML/DL-based solutions in optimizing physical-layer tasks for real-world UWA deployments. Finally, the SLR identifies key challenges in UWA communication, including high complexity, limited data, fragmented performance metrics, deployment realities, energy constraints, and poor scalability. It also outlines future directions like lightweight models, physics-informed learning, advanced RL strategies, intelligent resource allocation, and robust feature fusion to build reliable and intelligent underwater systems.

Keywords:

channel estimation; adaptive modulation; modulation recognition; underwater acoustic communication; machine learning; deep learning

1. Introduction

Underwater acoustic (UWA) communication is used in several critical areas such as environmental monitoring, deep-sea exploration, military operations, rescue missions, commercial maritime applications, and underwater robotics [1]. Electromagnetic waves, commonly used for wireless communication on land, do not work efficiently underwater due to high absorption and scattering. Consequently, UWA communication relies on sound waves to transmit and receive information [2]. These sound waves move efficiently through water, which makes it possible to communicate over long distances, even deep under the sea [3].

Despite its importance, UWA communication faces persistent challenges due to the complex nature of underwater channels. These challenges include multipath propagation, channel time variations, and unpredictable channel conditions [4,5]. Multipath propagation occurs when sound waves reflect off underwater surfaces. It causes multiple delayed signal replicas that interfere at the receiver. These echoes distort signal timing and undermine the effectiveness of traditional synchronization techniques. Consequently, it often leads to symbol misalignment and degraded decoding accuracy. Similarly, environmental dynamics such as currents, temperature gradients, and object movement introduce time-varying fading and unpredictable Doppler shifts. Large Doppler spreads distort carrier frequencies and interfere with signal stability. This makes it difficult for conventional modulation schemes to adapt effectively in real time [6]. Bandwidth constraints and slow sound propagation (around 1500 m/s) further limit the achievable data rates and increase latency [7]. These physical-layer challenges necessitate intelligent signal processing approaches tailored for underwater conditions.
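To give a sense of scale for the latency and Doppler effects described above, the following minimal sketch computes the one-way propagation delay over a 1 km link and the first-order Doppler shift of a carrier. The 1500 m/s sound speed comes from the text; the 1 km range, the 10 kHz carrier, and the 1.5 m/s relative speed are illustrative assumptions, and the function names are hypothetical.

```python
SOUND_SPEED_M_S = 1500.0   # nominal speed of sound in seawater (from the text)
LIGHT_SPEED_M_S = 3.0e8    # approximate speed of radio waves, for comparison

def propagation_delay_s(distance_m, speed_m_s=SOUND_SPEED_M_S):
    """One-way propagation delay over a straight path."""
    return distance_m / speed_m_s

def doppler_shift_hz(carrier_hz, relative_speed_m_s, speed_m_s=SOUND_SPEED_M_S):
    """First-order Doppler shift f_c * v / c, valid when v is much smaller than c."""
    return carrier_hz * relative_speed_m_s / speed_m_s

# Assumed scenario: 1 km link, 10 kHz carrier, 1.5 m/s relative motion.
delay_acoustic = propagation_delay_s(1000.0)
delay_radio = propagation_delay_s(1000.0, LIGHT_SPEED_M_S)
shift = doppler_shift_hz(10_000.0, 1.5)

print(f"acoustic delay: {delay_acoustic:.3f} s, radio delay: {delay_radio:.2e} s")
print(f"Doppler shift at 10 kHz: {shift:.1f} Hz")
```

Under these assumptions the acoustic delay is about two thirds of a second, several orders of magnitude above the radio case, and a platform drifting at walking pace already shifts the carrier by a fraction of a percent of its frequency.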

To address these environmental challenges, physical-layer modules like channel estimation, adaptive modulation, and modulation recognition play an important role [1]. These components are essential for compensating multipath-induced distortion and ensuring robust synchronization. Channel estimation aims to extract channel impulse responses and delay profiles. Adaptive modulation dynamically selects optimal modulation schemes based on real-time feedback from the channel. Under severe Doppler shifts, RL-based modulation controllers offer enhanced environmental awareness and decision-making capabilities to maintain reliable throughput. Modulation recognition detects the transmitted modulation format and is essential for spectrum monitoring, interference management, and dynamic demodulation. Together, these modules form the backbone of underwater physical-layer reliability and efficiency.

In recent years, machine learning (ML) and deep learning (DL) tools have emerged to overcome these physical-layer limitations [4,5,8,9]. ML models, such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory networks (LSTMs), facilitate accurate channel estimation by learning temporal and spatial signal patterns. This pattern recognition enables adaptive synchronization under challenging multipath conditions. For adaptive modulation, reinforcement learning (RL) algorithms continuously assess channel states, including Doppler shifts and fading variations. Based on this assessment, they select modulation schemes that improve both transmission speed and signal robustness. In modulation recognition, DL models such as CNNs and Recurrent Neural Networks (RNNs) are able to extract discriminative features from noisy acoustic signals. These models outperform traditional rule-based classifiers in accurately identifying complex modulation schemes. Collectively, ML/DL-based solutions offer data-driven adaptability, improving decoding accuracy and operational reliability in complex underwater environments.

1.1. Motivation for a Systematic Literature Review

A review is needed for UWA communication because the current research on ML/DL-based UWA communication is scattered. Researchers have frequently applied ML/DL techniques for enhancing channel estimation, adaptive modulation, and modulation recognition. However, they often use different methods, datasets, and evaluation tools. Moreover, their results exhibit differences in training loss, modulation schemes, throughput gains, and BER (Bit Error Rate) improvements. This makes it hard to compare results or draw clear conclusions. A well-organized review can consolidate the findings from multiple studies, offering a comparative analysis of ML/DL-based approaches. Given the ongoing challenges of multipath interference, extreme Doppler shifts, limited training data, and resource constraints in UWA systems, a systematic literature review (SLR) becomes important in guiding researchers toward developing more robust, scalable, and efficient AI-driven solutions.

1.2. State-of-the-Art Review Articles and Their Limitations

Table 1 presents a summary of existing reviews on ML and DL techniques in UWA communication. As shown in the table, previous reviews explore the potential of ML and DL approaches to enhance system robustness, optimize communication protocols, and improve signal processing in complex underwater environments. While these contributions are valuable, they exhibit notable research gaps. First, they lack a holistic evaluation of ML and DL techniques for channel estimation, modulation recognition, and adaptive modulation. Second, there is no detailed comparative analysis of ML and DL algorithms, including their underlying system characteristics and performance metrics. Finally, the broader impact of ML and DL advancements on the efficiency, scalability, and reliability of UWA communication systems remains underexplored. This systematic literature review provides the first integrated synthesis of ML and DL techniques across three key physical-layer challenges in UWA communication: channel estimation, adaptive modulation, and modulation recognition. Unlike prior reviews, it offers structured comparisons of model architectures, performance metrics, and deployment feasibility.

Table 1. Summary of existing reviews on ML/DL techniques in UWA communication.

1.3. Research Questions

The limitations identified in Table 1 are addressed through this SLR, which systematically investigates four core research questions.

Research Question 1 (RQ1): How do ML and DL techniques improve channel estimation in UWA communication, and what are the key system characteristics and performance metrics of these methods?

Research Question 2 (RQ2): How do ML and DL techniques improve adaptive modulation in UWA communication, and what are the key system characteristics and performance metrics of these methods?

Research Question 3 (RQ3): How effective are ML/DL-driven modulation recognition approaches in identifying modulation schemes under complex underwater conditions, and what are their strengths and limitations?

Research Question 4 (RQ4): What innovative approaches and emerging trends in machine/deep learning can be employed to address unresolved challenges in underwater acoustic communication, and how can these advancements shape the future of intelligent, efficient, and scalable UWA systems?

1.4. SLR Framework

Figure 1 shows the framework of this SLR. We selected studies from four databases: IEEE, Springer, Elsevier, and Google Scholar. The inclusion and exclusion criteria are explained in Section 2. The selected research was grouped into three areas. These are channel estimation, adaptive modulation, and modulation recognition. Section 3 discusses the ML and DL methods for channel estimation. It shows how these models predict signal distortions and improve decoding accuracy. Section 4 focuses on adaptive modulation. It explains how models adjust transmission settings to match underwater conditions. Section 5 examines modulation recognition. It highlights how ML models classify signals in noisy environments with high accuracy. Section 6 outlines key challenges and suggests future directions. Section 7 gives answers to the research questions. Section 8 describes search and scope limitations. Section 9 presents the conclusion.

Figure 1. Overview of the SLR: from the article selection to comparative analysis.

2. Literature Review Methodology

To address the research questions in the introduction, we followed the established SLR guidelines [12], implemented in [13,14,15,16]. First, Section 2.1 presents the background and category definitions. Subsequently, Section 2.2 outlines the structured protocol used for selecting and reviewing relevant studies.

2.1. Detailed Exploration of Category Backgrounds

The three main areas of this SLR (channel estimation, adaptive modulation, and modulation recognition) are briefly introduced in the following discussion.

2.1.1. Channel Estimation

Figure 2 shows a UWA OFDM system that applies ML and DL models [17,18,19]. The figure represents a multi-carrier setup but can be adapted to single-carrier systems by removing certain blocks. It is a general diagram and does not include the internal structure of the learning models. Typically, ML or DL models are first trained offline. After that, they are deployed in the live system with online updates or tuning. Common modulation schemes include BPSK (Binary Phase Shift Keying), QPSK (Quadrature Phase Shift Keying), and OFDM (Orthogonal Frequency Division Multiplexing). Performance is typically evaluated using MSE (Mean Squared Error) and BER [20,21,22,23,24].

Figure 2. A typical UWA OFDM (a) transmitter and (b) receiver.

At the transmitter, input data is modulated using QPSK or 16-QAM (Quadrature Amplitude Modulation) [25,26,27,28]. The modulated symbols are split into parallel streams and passed through an inverse fast Fourier transform (IFFT) to produce OFDM signals. A cyclic prefix (CP) is added to reduce inter-symbol interference from multipath effects in UWA environments. The signal is amplified and transmitted through the underwater channel [21,29,30,31]. At the receiver, the signal first passes through a bandpass filter to remove noise. The CP is removed, and the signal undergoes fast Fourier transform (FFT) to return to the frequency domain. ML/DL models such as DNNs and RNNs perform channel estimation by analyzing distortions and adjusting for UWA variations. The estimated channel improves demodulation and supports accurate data recovery. This ML/DL-driven approach enhances signal reliability, reduces BER, and optimizes communication efficiency in challenging underwater environments [32,33,34].
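The transmit and receive chain described above can be sketched in a few lines of NumPy. This is a minimal, noiseless illustration and not the method of any reviewed study: the per-subcarrier least-squares (LS) estimator stands in for the ML/DL estimators as a classical baseline, the 64 subcarriers and CP of 16 are borrowed from the [22] row of Table 7, and the five-tap channel `h` and all function names are assumptions made for the example.

```python
import numpy as np

def qpsk_map(bits):
    """Map bit pairs to unit-energy QPSK symbols (0 -> +1, 1 -> -1 per axis)."""
    b = bits.reshape(-1, 2)
    return ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)

def ofdm_tx(symbols, cp_len):
    """IFFT to the time domain, then prepend the cyclic prefix."""
    x = np.fft.ifft(symbols)
    return np.concatenate([x[-cp_len:], x])

def ofdm_rx(rx, n_sub, cp_len):
    """Remove the cyclic prefix and return to the frequency domain."""
    return np.fft.fft(rx[cp_len:cp_len + n_sub])

def through_channel(tx, taps):
    """Noiseless multipath channel: linear convolution with the tap vector."""
    return np.convolve(tx, taps)[:len(tx)]

rng = np.random.default_rng(0)
n_sub, cp_len = 64, 16                      # as in the [22] row of Table 7
h = np.array([1.0, 0.0, 0.5j, 0.0, 0.2])    # hypothetical 5-tap channel (shorter than the CP)

# A known pilot block yields a per-subcarrier LS channel estimate.
pilot = qpsk_map(rng.integers(0, 2, 2 * n_sub))
y_pilot = ofdm_rx(through_channel(ofdm_tx(pilot, cp_len), h), n_sub, cp_len)
h_ls = y_pilot / pilot

# The estimate then equalizes a data block sent over the same channel.
bits = rng.integers(0, 2, 2 * n_sub)
y_data = ofdm_rx(through_channel(ofdm_tx(qpsk_map(bits), cp_len), h), n_sub, cp_len)
equalized = y_data / h_ls
bits_hat = np.column_stack([equalized.real < 0, equalized.imag < 0]).astype(int).ravel()

print("bit errors:", int(np.sum(bits_hat != bits)))
```

Because the CP is longer than the channel memory, the channel acts as a single complex gain on each subcarrier, which is what makes one-tap equalization possible. A learned estimator would replace the `h_ls` step, typically to cope with noise, sparse pilots, and time variation that this sketch leaves out.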

2.1.2. Adaptive Modulation

Figure 3 illustrates an ML/DL-based UWA transceiver for adaptive modulation. Data bits enter a channel encoder for error protection. A modulator maps the encoded bits into symbols. An ML model at the receiver estimates channel quality from the incoming signal. This estimate feeds back to a modulation selector that picks the best scheme for the next transmission. The signal is then amplified and sent through the underwater channel, where it faces attenuation, multipath, Doppler shifts, and noise. At the receiver, a filter removes out-of-band noise, and a demodulator recovers the symbols. A channel decoder restores the original bits. Meanwhile, the ML model continuously learns from channel observations to refine its future decisions [19,35,36,37,38,39,40,41,42].

Figure 3. A typical UWA transceiver using ML/DL for adaptive modulation.

Adaptive modulation in UWA systems adjusts transmission schemes based on real-time conditions to ensure stable data flow [38,41,42,43]. Changing factors like temperature, salinity, and noise make fixed schemes unreliable. ML/DL methods select robust options such as BPSK or Frequency Shift Keying (FSK) in poor conditions, and higher-rate schemes like 16-QAM or OFDM when channels are stable. RL models automate this choice using past transmission data. These systems typically use 5–10 kHz bandwidths and support short to long distances. ML/DL improves throughput, lowers BER, and enables energy-efficient signal processing for underwater devices.
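The idea of learning a modulation choice from past transmissions can be illustrated with an epsilon-greedy bandit, one of the simplest RL formulations. The sketch below is an assumption-laden toy, not the algorithm of any reviewed study: the two channel states, the packet success probabilities in `SUCCESS_PROB`, and the reward (bits per symbol when the packet is delivered, zero otherwise) are all hypothetical.

```python
import random

BITS_PER_SYMBOL = {"BPSK": 1, "QPSK": 2, "16-QAM": 4}

# Hypothetical probability that a packet is decoded correctly in each channel state.
SUCCESS_PROB = {
    "poor":   {"BPSK": 0.95, "QPSK": 0.30, "16-QAM": 0.02},
    "stable": {"BPSK": 1.00, "QPSK": 0.99, "16-QAM": 0.95},
}

def learn_policy(channel, steps=4000, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit: estimate the mean reward of each scheme from feedback."""
    rng = random.Random(seed)
    schemes = list(BITS_PER_SYMBOL)
    value = {s: 0.0 for s in schemes}   # running mean reward per scheme
    count = {s: 0 for s in schemes}
    for _ in range(steps):
        if rng.random() < epsilon:
            scheme = rng.choice(schemes)            # explore
        else:
            scheme = max(schemes, key=value.get)    # exploit the current best estimate
        delivered = rng.random() < SUCCESS_PROB[channel][scheme]
        reward = BITS_PER_SYMBOL[scheme] if delivered else 0.0
        count[scheme] += 1
        value[scheme] += (reward - value[scheme]) / count[scheme]
    return max(schemes, key=value.get), value

for state in ("poor", "stable"):
    best, _ = learn_policy(state)
    print(state, "->", best)
```

With these assumed success rates the learner settles on the robust scheme in the poor channel and on the higher-rate scheme in the stable one, mirroring the behavior described above. Practical controllers condition on measured channel features rather than on a known discrete state.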

2.1.3. Modulation Recognition

Figure 4 shows an AI-powered modulation recognition system for UWA communication. The transmitter encodes and modulates the data before sending it through an underwater channel affected by noise, Doppler shifts, and attenuation. At the receiver, ML/DL models perform channel estimation and recover the original signal. An RL-based recognition block identifies the modulation type by learning from previous transmissions, lowering BER, and ensuring reliable communication in both single and multi-carrier setups [9,44,45,46,47,48].

Figure 4. A typical UWA transceiver using ML/DL for modulation recognition.

Modulation recognition enables underwater receivers to identify signal types automatically [49,50,51,52]. Intelligent detection improves efficiency and reduces errors in dynamic underwater environments. DL models such as CNNs and RNNs help in classifying modulation types by analyzing received signals and extracting features. These systems match signals to known patterns like BPSK, QPSK, QAM, FSK, DSSS (Direct Sequence Spread Spectrum), and OFDM. They operate across 1–30 kHz bandwidths and support distances from hundreds of meters to several kilometers. Metrics like cross-entropy (CE) loss, accuracy, and precision assess model performance [53,54,55,56].
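The three metrics named above can be computed directly from a classifier's output probabilities. The following sketch uses a made-up batch of six signals over three classes; the class list, the probabilities, and the function names are illustrative assumptions. Per-class precision is reported separately for each class, which is one plausible reading of the ranges quoted in Table 18.

```python
import math

def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def accuracy(preds, labels):
    """Fraction of signals whose predicted class matches the true class."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def per_class_precision(preds, labels, n_classes):
    """For each class c: true positives / all signals predicted as c."""
    result = []
    for c in range(n_classes):
        true_of_predicted = [y for p, y in zip(preds, labels) if p == c]
        if true_of_predicted:
            result.append(sum(y == c for y in true_of_predicted) / len(true_of_predicted))
        else:
            result.append(float("nan"))   # class never predicted
    return result

CLASSES = ["BPSK", "QPSK", "FSK"]        # hypothetical label set
labels = [0, 0, 1, 1, 2, 2]              # true classes of six received signals
probs = [                                 # hypothetical softmax outputs
    [0.8, 0.1, 0.1],
    [0.3, 0.6, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.5, 0.2, 0.3],
]
preds = [max(range(len(p)), key=p.__getitem__) for p in probs]

print("CE loss:", round(cross_entropy(probs, labels), 4))
print("accuracy:", round(accuracy(preds, labels), 4))
for name, prec in zip(CLASSES, per_class_precision(preds, labels, len(CLASSES))):
    print(f"precision[{name}] = {prec:.3f}")
```

In this toy batch the overall accuracy is 4/6 while precision varies from 0.5 to 1.0 across classes, which shows why a single accuracy figure can hide weak classes.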

2.2. Development of the Review Protocol

This section defines the selection and rejection criteria (Section 2.2.1), outlines the search strategy (Section 2.2.2), and presents the synthesis of selected studies (Section 2.2.3).

2.2.1. Criteria for Selection and Rejection

Subject Relevance: Research must be directly relevant to the context of this study.
Publication Date (2020–2025): Only research published between 2020 and 2025 is included. Studies published before 2020 are excluded.
Publisher: Selected research must be published in one of the three renowned scientific databases (IEEE, Springer, or Elsevier). To ensure comprehensive coverage, the first 10 pages of Google Scholar were searched for each key term, allowing consideration of articles from other databases.
Impactful Contributions: Selected research must present key advancements in UWA communication, using ML/DL to enhance channel estimation, adaptive modulation or modulation recognition.
Results Oriented: Studies with proposals and findings supported by solid evidence, facts, and experimental validation are favored.
Repetition: Identical or redundant research within the same context is excluded.

2.2.2. Literature Search Process

This SLR emphasizes high-quality sources to examine ML/DL in UWA communication. IEEE, Elsevier, and Springer were chosen for their broad coverage and peer-reviewed rigor. Journal articles were prioritized over conferences for their depth and validation. To extend scope, Google Scholar was searched (first 10 pages per term) to capture impactful studies not indexed in standard databases. Table 2 summarizes the results for search terms like channel estimation, adaptive modulation, and modulation recognition, using a 2020–2025 filter across the three major databases. Figure 5 outlines the selection process: 41,261 initial entries reduced to 43 through title, abstract, and general screening based on strict inclusion criteria. This ensures coverage of core advances and challenges in the field.

Table 2. Search results for ML and DL techniques in UWA communication (2020–2025).

Figure 5. Stepwise selection and screening procedure for research article inclusion.

2.2.3. Systematic Approach Used in Extracting and Analyzing Studies

Table 3 provides a framework for extracting, analyzing, and classifying research studies. It categorizes studies into key domains: channel estimation, adaptive modulation, and modulation recognition. Consequently, it enables comparative assessments of methodologies and performance metrics. Additionally, Figure 6 presents statistical data on research articles, organized by publication year.

Table 3. Systematic process for extracting, analyzing, and classifying research studies in ML/DL-based UWA communication: outlining key data collection methods, evaluation criteria, and classification approaches.

Figure 6. Year-Wise Distribution of Selected Research Articles from WoS-Indexed Journals (2020–2025), Illustrating Publication Trends and the Evolving Focus on ML/DL Applications in UWA Communication.

Table 4. Overview of ML/DL-based channel estimation in SC-UWA communication.

Ref.	ML/DL Technique	Optimizer	Training Examples
[17]	LR	GD	Measured Data
[20]	LSTM, BiLSTM, SBULSTM	—	Simulated (BELLHOP) and Measured Data
[21]	ABiGRU	Adam	Measured and Simulated
[57]	AttLstmPreNet	—	Simulated Data (From [25])
[29]	UACC-GAN	Adam	Measured Data (From [26])

Table 5. Overview of ML/DL-based channel estimation in MC-UWA communication.

Ref.	ML/DL Technique	Optimizer	Training Examples
[18]	CsiPreNet	Adam	Measured Data
[22]	DNN	Adam	Measured Data (From [26])
[58]	DAE and DNN	—	Simulated Data (GBG)
[23]	CNN	Adam	Measured and Simulated
[24]	2D BiLSTM	Adam	Measured and Simulated
[28]	UDNet	Adam	Measured and Simulated
[59]	SC-CNN, AM-BiLSTM	RMSprop	Measured Data (From [26])
[60]	CWGAN-GP, CNN, CAD	—	Simulated Data (From [25])
[27]	BDPCNN	Pelican	Unclear
[29]	UACC-GAN	Adam	Measured Data (From [26])
[32]	LSTM and Transformer	Adam	Measured Data
[30]	DenseNet	Adam	Measured Data (From [26])
[31]	CNN and LSTM	Adam	Measured Data (From [26])
[33]	DNN	Adam	Simulated Data
[34]	S-CNN-ResNet	Adam	Simulated Data

Table 6. Characteristics of ML/DL-based channel estimation in SC-UWA communication.

Ref.	Bandwidth (kHz)	Tx-Rx Distance (km)	Modulation
[17]	2	0.205	BPSK
[20]	4	2 to 3	BPSK, DS-SS
[21]	14 to 18	1	QPSK
[57]	10 & 5	1	—
[29]	10 to 18	—	FH-SS

Table 7. Characteristics of ML/DL-based channel estimation in MC-UWA communication.

Ref.	Bandwidth (kHz)	Tx-Rx Distance (km)	System	Subcarr.	Modulation	CP
[18]	4	1 to 5	OFDMA	681	Various	25
[22]	—	0.54 to 3.16	OFDM	64	QPSK	16
[58]	—	—	OFDM	512	16QAM	64
[23]	—	0.75, 3	OFDM	—	QPSK	—
[24]	5	1	OFDM	462	—	22.6 ms
[28]	5	1.5, 3	OFDM	1024	Various	256
[59]	6	0.54, 0.75, 3.16	OFDM	512	BPSK	128
[60]	6–10	0.75, 1.08	OFDM	512	QPSK	128
[27]	1000	—	MIMO-OFDM	512	QPSK	128
[29]	10–18	—	OFDM	1024	QPSK	256
[32]	—	0.883, 0.967	OFDM	—	—	—
[30]	32.5–37.5	0.8, 1.08, 3.16	OFDM	1024	BPSK, QPSK	256
[31]	2	—	OFDM	64	QPSK	16
[33]	—	—	AFDM	32, 128	BPSK, QPSK	0
[34]	4	—	OTFS	—	BPSK	—

Table 8. Comparison of ML/DL-based channel estimation techniques in SC-UWA communication.

Ref.	Training Loss	Complexity	Channel Prediction	Gain (BER)
[17]	CE	—	66% Acc., 88% Prec.	—
[20]	CE	—	—	$1.2 \times 10^{- 3}$
[57]	$4 \times 10^{- 3}$ (MAE)	—	7% better	—
[29]	WGAN-GP	—	TVIR, CDF, JS Divergence, Entropy	—

Table 9. Comparison of ML/DL-based channel estimation techniques in MC-UWA communication.

Ref.	Training Loss	Complexity	Channel Prediction	Gain (BER)
[18]	0.025 (MAE)	Higher (Big-O, Runtime)	—	$0.5 \times 10^{- 6}$
[22]	0.00012 (MSE)	Medium $O (L K^{2})$	Near optimal	40%
[58]	0.1 (L2)	—	—	$1 \times 10^{- 2}$ at an SNR of 20 dB
[23]	Combines MSE and BER	47.8 ms	—	Improvement of over 0.17
[24]	MSE	—	—	$1 \times 10^{- 3}$ (ComNet)
[28]	Not Given (MSE)	Same (Time complexity)	$- 24$ dB (NMSE)	$4 \times 10^{- 4}$ (AMP)
[59]	$0.018$ (MSE)	$2.12$ MB (Memory), 561,526 (Para.), $1.04 \times 10^{6}$ (FLOPs)	—	$1 \times 10^{- 3}$ (ComNet)
[60]	—	—	$2 \times 10^{- 6}$ (MSE)	$0.5 \times 10^{- 3}$ (ChannelNet)
[27]	$0.01$ (MSE)	$0.2$ MB (Memory), 8956 (P), Lower by $0.5 \times 10^{5}$ (FLOPs)	$0.25 \times 10^{- 6}$ (MSE)	$1 \times 10^{- 3}$ (biLSTM)
[29]	WGAN-GP	—	TVIR, CDF, JS Divergence, Entropy	Close to measurements
[32]	$4.09 \times 10^{- 3}$ (MSE)	8,460,928 (Par.), $23.7$ s (T. Time), $83.3$ ms (P. Time)	$9 \times 10^{- 4}$ (MSE)	Not Done
[30]	$0.05$ (MSE)	Lower (CNN)	TVIR $\approx 0$	$7 \times 10^{- 4}$ (FC-NN)
[31]	MSE	Not Done	Not Done	$1 \times 10^{- 2}$ (ComNet)
[33]	MSE	Lower by $21.3$ ms (Runtime)	$1 \times 10^{- 4}$ (NMSE)	$1.7 \times 10^{- 2}$ (LMMSE)
[34]	0.02 (MSE)	20.70 MB, 4,996,384 P, $2.48 \times 10^{8}$ F	3 dB gain at $10^{- 3}$	$1 \times 10^{- 7}$ (CNN-ResNet)

Table 10. Overview of ML/DL-based adaptive modulation in SC-UWA communication.

Ref.	ML/DL Technique	Optimizer	Training Examples
[43]	SVM, KNN, LDA, BRT	—	Measured Data
[36]	MLR, MLP	—	Measured Data
[38]	CNN	Adam	Simulated Data
[41]	RL	Not Applicable	No Dataset Used
[42]	RL	Not Applicable	Measured Data

Table 11. Overview of ML/DL-based adaptive modulation in MC-UWA communication.

Ref.	ML/DL Technique	Optimizer	Training Examples
[35]	A-kNN	Not Applicable	Measured Data
[37]	RL	PPO	Simulated (Bellhop) & Measured Data
[38]	CNN	Adam	Simulated Data
[39]	RL	Adam	Simulated Data (Bellhop)
[40]	CNN	Adam	Measured Data
[41]	RL	Not Applicable	No Dataset Used
[42]	RL	Not Applicable	Measured Data

Table 12. Characteristics of adaptive modulation techniques in SC-UWA communication.

Ref.	Bandwidth	Tx-Rx Distance	Modulation
[43]	5 kHz, 10 kHz	1 km, 2 km, 3 km	PSK, QAM
[36]	4 kHz	1 km	PSK, QAM
[38]	—	—	CDMA, TDMA
[41]	—	5 to 355 cm	FH-BFSK, ASK, PSK
[42]	5 kHz	0.82 km	FSK, DS-SS

Table 13. Characteristics of adaptive modulation techniques in MC-UWA communication.

Ref.	Bandwidth	Tx-Rx Distance	System	Subcarr.	Modulation	CP
[35]	—	—	OFDM	—	FSK	—
[37]	6 kHz	5 km	OFDM	256	PSK, QAM	64
[38]	—	—	OFDM	—	—	—
[39]	8 kHz	5 km	OFDM	1024	PSK, QAM	400
[40]	6 kHz	0.3 km to 1.5 km	OTFS	32	PSK	—
[41]	—	5 to 355 cm	OFDM	—	—	—
[42]	5 kHz	0.82 km	OFDM	200	—	—

Table 14. Comparison of adaptive modulation techniques in SC-UWA communication.

Ref.	Training Loss	Complexity	Throughput	Gain in BER
[43]	MSE	—	>99% (Acc.)	—
[36]	—	—	25% higher	Comparable
[38]	—	—	>98% (Acc.)	Substantial
[41]	Not Applicable	—	3.648% (RSSI)	32%
[42]	TD	—	14.8% better	$4.5 \times 10^{- 3}$

Table 15. Comparison of adaptive modulation techniques in MC-UWA communication.

Ref.	Training Loss	Complexity	Throughput or Equivalent	Gain in BER
[35]	Not Applicable	Less	Near ideal	Near ideal
[37]	Actor and Critic Loss	—	4% higher	Better than others
[38]	—	—	>98% (Accuracy)	Substantial
[39]	Not Given (MSE)	High	Higher than others	Better
[40]	CE	Higher	4% higher	Kept at $0.001$ throughout
[41]	Not Applicable	—	3.648% (RSSI)	32%
[42]	Not Applicable	—	14.8% higher	Stable

Table 16. Overview of ML/DL-based modulation recognition.

Ref.	ML/DL Technique	Optimizer	Training Examples
[49]	SCNet	Adam	Simulated Data
[50]	OAE-EEKNN	GD & Adam	Measured Data
[51]	RNN & CNN	Adam	Measured Data
[52]	SCL	Adam	Simulated and Measured Data
[53]	ResNet	Adam	Simulated and Measured Data
[54]	CNN with Ensemble Learning	—	Simulated Data
[55]	DCN	—	Simulated and Measured Data
[56]	SVM	—	Simulated and Measured Data
[46]	SqueezeNet and SENet	—	Simulated and Measured Data
[47]	Hybrid	Adam	Measured Data
[48]	TSTR	Adam	Measured Data (from [26])
[44]	NAS	MSGD	Simulated and Measured Data
[9]	2D ResNet & CNN	—	Measured Data
[45]	Hybrid	—	Simulated and Measured Data

Table 17. Characteristics of ML/DL-based modulation recognition techniques.

Ref.	Bandwidth	Tx-Rx Distance	Modulations Considered
[49]	1000 symbols/s	1.5 km	PSK, QAM, SSB, FM, PAM, FSK
[50]	10 kHz	1 km	FSK, PSK, QAM, DSSS, OFDM
[51]	4 kHz	1 km	PSK, FSK, OFDM
[52]	—	3, 6, 12, 60 m	PSK, FSK
[53]	4 kHz	1 km	PSK, FSK, OFDM
[54]	—	45 m	PSK, FSK, QAM
[55]	—	3 km, 5 km	PSK, QAM
[56]	100 kHz	6 m	FSK
[46]	—	10, 500, 1000 m	PSK, FSK, DSSS, OFDM
[47]	—	7 m, 1 km	PSK, FSK, QAM, DSSS, OFDM
[48]	—	—	PSK, QAM, FSK
[44]	1 kHz	100 m to 3 km	LFM, FSK, PSK, DSSS, OFDM
[9]	3 kHz, 10 kHz	1.22 km	FSK, PSK, CW, DSSS, LFM, OFDM
[45]	31.25 kHz	5 km, 1.25 km	PSK, QAM

Table 18. Comparative analysis of modulation recognition techniques.

Ref.	Training Loss	Complexity	Accuracy	Precision
[49]	0.01 (CCE)	Lower (153,930 P)	95.3%	89% to 100%
[50]	MSE	Lower (3.48 ms)	99.25%	—
[51]	$1.946$ (CE)	Lower (7.164 ms)	99.38%	—
[52]	Contrastive Loss	—	98.6%	66% to 100%
[53]	$10^{- 8}$ (CE)	—	100%	40% to 100%
[54]	Negative Log Likelihood	Lower (4.5 ms, $0.5 \times 10^{6}$ P, $0.5 \times 10^{8}$ FLOPs)	93.4%	33.33% to 100%
[55]	CE	—	64% to 73%	58.8% to 100%
[56]	0.10 (Hinge Loss)	Low (SVM efficiency)	98.28% to 99.78%	99.90% to 99.94%
[46]	—	Lower by 9 times (No. of parameters)	98.5%	97% to 99%
[47]	Nearly zero (Not Given)	Lower (0.28 M parameters, 7.02 M FLOPs)	99%	—
[48]	CE	Medium (145.9 k parameters)	86% to 91.1%	61% to 100%
[44]	CE	High (3.10 M P, 0.58 M FLOPs)	92.2%	80% to 100%
[9]	0.2 (CE)	High (38.48 M P, 1753 s)	94.31%	91.02%
[45]	—	Medium (2.23 ms to 8720.94 ms)	64% to 83%	61% to 100%

3. Results on ML/DL-Based Channel Estimation

This section outlines ML/DL techniques for channel estimation in UWA communication. First, Section 3.1 introduces the models, optimizers, and training examples. Second, Section 3.2 details system features like bandwidth, Tx-Rx distance, modulation schemes, subcarriers, and CP. Subsequently, Section 3.3 covers metrics such as training loss, BER, and complexity. Finally, Section 3.4 and Section 3.5 discuss key findings in single- and multi-carrier systems.

3.1. Overview of Channel Estimation Approaches

Table 4 and Table 5 summarize ML/DL methods, optimizers, and training data for SC-UWA (Single-Carrier Underwater Acoustic) and MC-UWA (Multi-Carrier Underwater Acoustic) systems, respectively. SC-UWA transmits data using a single carrier and is characterized by simple, energy-efficient designs. In contrast, MC-UWA systems use multiple subcarriers to improve robustness in complex underwater environments. SC-UWA systems primarily use lightweight models such as LR (Linear Regression), LSTM, ABiGRU (Attention Bidirectional Gated Recurrent Unit), and UACC-GAN (Underwater Acoustic Communication Channel–Generative Adversarial Network). Common optimizers include Adam, Adagrad, Adadelta, and Nadam. Among these, Adam is widely preferred due to its ability to adapt the learning rate during training. MC-UWA systems utilize more complex models. These include CNNs, BiLSTMs (Bidirectional Long Short-Term Memory networks), Transformers, and GAN-based architectures. Optimizers like RMSprop (Root Mean Square Propagation) and Pelican are also used. While SC-UWA favors simplicity, MC-UWA applies DL models for greater accuracy and performance. Training examples are drawn from direct UWA measurements and synthetic datasets. These datasets may be generated using statistical models [25] or sourced from real measurements [26]. The diversity and quality of training data play a vital role in model effectiveness. To summarize, SC-UWA systems prioritize efficiency with lightweight models, while MC-UWA systems employ complex architectures and diverse data sources to enhance accuracy in challenging underwater conditions.
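Because Adam appears in most rows of Table 4 and Table 5, a compact view of its update rule may help. The sketch below applies the standard Adam update to a one-dimensional quadratic loss; the loss function, the starting point, and the hyperparameter values are assumptions chosen only for illustration.

```python
import math

def adam_minimize(grad, w0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a scalar function with Adam, given its gradient function."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)   # step scaled per parameter
    return w

# Toy loss (w - 3)^2, whose gradient is 2 (w - 3); the minimum is at w = 3.
w_star = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
print("estimated minimizer:", round(w_star, 4))
```

The division by the root of the squared-gradient average is what adapts the effective learning rate: parameters with consistently large gradients take proportionally smaller steps.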

3.2. Key Characteristics of Channel Estimation Techniques

Table 6 and Table 7 elaborate key characteristics for SC-UWA and MC-UWA systems. The discussed parameters include bandwidth, transmission–reception (Tx-Rx) distance, modulation schemes, system types, subcarriers, and CP. The tables reveal clear differences in system design and operation between SC-UWA and MC-UWA approaches. SC-UWA systems operate with bandwidths ranging from 4 kHz to 25 kHz, with transmission distances between 0.2 km and 3 km. The primary modulation schemes used are BPSK and QPSK, with some studies employing frequency hopping spread spectrum (FH-SS). In contrast, MC-UWA systems utilize higher bandwidths, ranging from 4 kHz to 1000 kHz, with transmission distances between 0.5 km and 5 km. OFDM is the dominant system, with some studies incorporating OFDMA (Orthogonal Frequency-Division Multiple Access), MIMO-OFDM (Multiple-Input Multiple-Output–Orthogonal Frequency-Division Multiplexing), AFDM, and OTFS (Orthogonal Time–Frequency Space). Modulation schemes include BPSK, QPSK, 8-QAM, 16-QAM, and 8PSK, with cyclic prefixes varying from 0 to 256. The number of subcarriers ranges from 32 to 1024, reflecting the diversity in system configurations. To summarize, SC-UWA systems offer simpler, low-bandwidth designs for short-range links, while MC-UWA systems adopt complex, high-bandwidth configurations to support robust and scalable underwater communication.

3.3. Comparative Analysis of Channel Estimation Techniques

This section evaluates ML/DL-based channel estimation techniques in UWA communication using key performance metrics. These metrics reflect efficiency, accuracy, and reliability under varying underwater conditions, offering insight into each model’s practical effectiveness. Table 8 and Table 9 present the performance evaluation of SC-UWA and MC-UWA systems, respectively.

One of the key metrics for performance evaluation is training loss, which measures the training effectiveness of an algorithm. It is assessed by plotting MSE or mean absolute error (MAE) against the number of epochs during the training, testing, and validation phases. MSE is the average squared difference between predicted and actual values, providing a measure of prediction errors. The mathematical formula for MSE is [61]

$$\mathrm{MSE} = \frac{1}{n} \sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2} \tag{1}$$

On the other hand, MAE quantifies the sum of absolute errors divided by the sample size. Its formula is [62]

\mathrm{MAE} = \frac{1}{n} \sum_{i = 1}^{n} |y_{i} - {\hat{y}}_{i}|

(2)
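As a minimal sketch of Equations (1) and (2), the two losses can be computed as follows. The function names and the sample values are hypothetical and serve only as illustration; taking the absolute value before squaring lets the same code handle complex-valued channel coefficients.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error, Equation (1); also valid for complex inputs."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(np.abs(y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    """Mean absolute error, Equation (2)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(np.abs(y_true - y_pred)))

# Hypothetical actual and predicted values, for illustration only
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.0, 4.5]
print(mse(y_true, y_pred), mae(y_true, y_pred))
```

Because MSE squares each error, the single error of 1.0 in this example dominates it more strongly than it dominates MAE.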

Another critical parameter is complexity, which evaluates an algorithm’s efficiency compared to existing methods. A lower complexity is represented by a positive value indicating the gain in efficiency, while a higher complexity is denoted by a negative value, highlighting the algorithm’s inferiority compared to existing techniques. The computational efficiency is evaluated in terms of theoretical time complexity (expressed using Big-O notation), observed runtime, FLOPs (the number of floating-point operations), the number of trainable network parameters (denoted as P in the table), and associated storage requirements. Similarly, channel prediction performance compares the predicted parameters with actual measurements using MSE, Normalized Mean Square Error (NMSE), Time-varying Impulse Response (TVIR), Cumulative Distribution Function (CDF), Jensen–Shannon (JS) divergence, and information entropy. Among these, TVIR is typically evaluated visually due to its graphical nature.
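Two of the channel prediction metrics named above can be sketched briefly. The reviewed studies do not share one definition, so the code assumes NMSE is the error energy normalized by the energy of the true channel (reported in dB) and JS divergence is computed in bits between two histograms. The function names are hypothetical.

```python
import numpy as np

def nmse_db(h_true, h_est):
    """NMSE in dB: error energy divided by the energy of the true channel."""
    h_true = np.asarray(h_true)
    h_est = np.asarray(h_est)
    err = np.sum(np.abs(h_true - h_est) ** 2)
    ref = np.sum(np.abs(h_true) ** 2)
    return float(10.0 * np.log10(err / ref))

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return float(0.5 * kl(p, m) + 0.5 * kl(q, m))

# Hypothetical two-tap channel and a slightly perturbed estimate
print(nmse_db([1.0, 1j], [1.1, 1j]))
print(js_divergence([0.5, 0.5], [0.9, 0.1]))
```

With base-2 logarithms the JS divergence lies between 0 (identical distributions) and 1 (disjoint support), which makes values from different studies easier to compare.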

Lastly, BER is a widely used metric for assessing communication system performance. It evaluates the accuracy of the receiver using channel estimates obtained through the proposed technique. The performance is presented through BER versus SNR plots, which compare the proposed method against existing techniques. Table 8 and Table 9 highlight the BER gain at a fixed SNR value relative to baseline techniques. When multiple techniques are compared, the BER gain is computed against the best-performing technique. If the authors provide multiple BER plots, the highest performance is taken, with the comparison technique’s name indicated in parentheses alongside the reported BER gain.
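The comparison rule described above can be sketched as follows. The text does not give a formula for the BER gain, so this example assumes a relative BER reduction in percent against the best-performing baseline at one fixed SNR. The function names and the BER values are hypothetical.

```python
import numpy as np

def bit_error_rate(tx_bits, rx_bits):
    """Fraction of received bits that differ from the transmitted bits."""
    tx = np.asarray(tx_bits)
    rx = np.asarray(rx_bits)
    return float(np.mean(tx != rx))

def ber_gain_vs_best(proposed_ber, baseline_bers):
    """Return (best baseline name, relative BER reduction in percent).

    Assumption: gain is measured against the baseline with the lowest BER,
    with all values read at the same SNR point.
    """
    best_name = min(baseline_bers, key=baseline_bers.get)
    best = baseline_bers[best_name]
    return best_name, 100.0 * (best - proposed_ber) / best

# Hypothetical BER values at a single SNR point
baselines = {"LS": 2e-2, "MMSE": 1e-2}
name, gain = ber_gain_vs_best(2.5e-3, baselines)
print(name, round(gain, 1))
```

Reporting the name of the baseline next to the gain, as the tables do, keeps the figure interpretable when studies use different reference techniques.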

It can be observed from Table 8 and Table 9 that the training loss metrics in SC-UWA systems vary, with some studies using MAE and others using MSE. Complexity is not explicitly provided for most studies. The BER improvements range from $10^{-3}$ to $10^{-5}$, indicating varying levels of performance enhancement. In MC-UWA systems, training loss values are mostly MSE, with some studies combining MSE and BER for optimization. Complexity varies, with some models requiring high memory usage (e.g., 20.7 MB) and millions of parameters, and others focusing on execution time (e.g., 47.8 ms per OFDM block). BER improvements range from $10^{-3}$ to $10^{-6}$, with some studies reporting over 40% improvement in performance. To summarize, the SC-UWA and MC-UWA systems exhibit diverse training loss functions and performance gains, with BER improvements ranging from $10^{-3}$ to $10^{-6}$, while computational complexity is inconsistently reported, spanning memory usage, parameter counts, and execution time.

3.4. Discussion on ML/DL-Based Channel Estimation in SC-UWA Communication Systems

Research on single-carrier UWA systems shows steady progress in tackling UWA communication challenges. Each study offers distinct approaches and insights, driving forward UWA communication advancements.

The work in [17] presents ML-ECQP (Machine Learning-based Environment-aware Communication Channel Quality Prediction), a channel quality prediction method using LR. It relies on environmental inputs like wind speed, water temperature, air humidity, and SNR. The system operates at 23 kHz and 25 kHz over a 205 m Tx-Rx distance using BPSK modulation. Training on real data from Furong Lake yields an MAE of $4 \times 10^{-3}$ and a BER near $10^{-3}$. The method boosts energy efficiency and network performance. However, it faces challenges under favorable channel conditions and suffers from limited training data. Building on the foundation of channel prediction, ref. [20] explores M-ary Spread Spectrum modulation with BPSK for single-carrier systems operating within a 4 kHz bandwidth and transmission distances of 2–3 km. The study employs advanced LSTM architectures, including BiLSTM and stacked bidirectional uni-directional LSTM (SBULSTM), trained with the Adam optimizer on simulated channel data generated via BELLHOP software. The system achieves robust channel prediction and significant BER improvements, particularly under low SNR conditions, with values close to $10^{-2}$ for SBULSTM models.

The work in [21] introduces ABiGRU for real-time channel prediction. The model combines Space-Time Block Coding with MMSE pre-equalization. This setup improves BER and prediction accuracy over MMSE and LSTM. The authors in [57] present AttLstmPreNet, which enhances LSTM using attention mechanisms. The model focuses on key input features and adapts well to fast time-varying UWA channels. On simulated data, it performs better than LMS and RLS. Both models strengthen robustness and prediction accuracy in dynamic underwater environments. Finally, the study in [29] introduces UACC-GAN, a GAN-based simulator for underwater acoustic channels. It generates realistic time-varying impulse responses using measured data and achieves BER performance comparable to real-world conditions via validation on the WATERMARK dataset. While it enhances simulation capabilities for system design, limitations include high data requirements and limited control over output properties.

These studies show a clear progression in ML-based channel estimation for SC-UWA systems. Early models use environmental data and logistic regression to enhance performance, while advanced LSTM variants and attention-based networks improve prediction under low-SNR and time-varying conditions. Simulation-driven approaches like UACC-GAN further support design and validation, establishing a technically robust and adaptive communication framework.

3.5. Discussion on ML/DL-Based Channel Estimation in MC-UWA Communication Systems

The multi-carrier UWA research studies in this section offer distinct innovations that collectively advance performance in dynamic environments.

The work in [18] presents CsiPreNet, a CNN-LSTM model for CSI (channel state information) prediction. It operates over a 4 kHz bandwidth and spans 1–5 km using 681 subcarriers. Supported modulation schemes include BPSK, QPSK, 8-QAM, and 16-QAM. The model achieves an MAE of $4 \times 10^{-3}$ and a BER near $10^{-3}$ under low SNR. The work in [22] proposes a DNN-based OFDM model using QPSK modulation. It employs 64 subcarriers and 32 pilot tones with a comb-type pilot scheme. The model is trained on the WATERMARK dataset and optimized using Adam. While evaluation in realistic underwater conditions shows improved performance and efficiency, both models rely on specific modulation formats and controlled datasets, which may limit generalization and adaptability.
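For context, learned estimators in this section are typically judged against conventional pilot-based estimators such as Least Squares (LS). The following is a minimal sketch of LS estimation with a comb-type pilot layout of 32 pilots over 64 subcarriers. It is not the DNN of [22]; the 3-tap channel, the noiseless link, the linear interpolation, and the function name `ls_comb_estimate` are assumptions made for illustration.

```python
import numpy as np

def ls_comb_estimate(rx_symbols, pilot_symbols, pilot_idx, n_subcarriers):
    """LS estimate at pilot subcarriers, linearly interpolated to all others."""
    h_pilot = rx_symbols[pilot_idx] / pilot_symbols  # per-pilot LS estimate
    k = np.arange(n_subcarriers)
    # np.interp works on real data, so interpolate real and imaginary parts
    h_real = np.interp(k, pilot_idx, h_pilot.real)
    h_imag = np.interp(k, pilot_idx, h_pilot.imag)
    return h_real + 1j * h_imag

rng = np.random.default_rng(0)
n_sc = 64
pilot_idx = np.arange(0, n_sc, 2)      # comb layout: every second subcarrier
taps = np.array([1.0, 0.5j, 0.2])      # hypothetical 3-tap channel
h_true = np.fft.fft(taps, n_sc)        # frequency response on 64 subcarriers

# Unit-power QPSK symbols on every subcarrier (pilot values known to receiver)
bits_i = rng.integers(0, 2, n_sc) * 2 - 1
bits_q = rng.integers(0, 2, n_sc) * 2 - 1
tx = (bits_i + 1j * bits_q) / np.sqrt(2)

rx = h_true * tx                       # noiseless link, for clarity
h_est = ls_comb_estimate(rx, tx[pilot_idx], pilot_idx, n_sc)
print(np.mean(np.abs(h_true - h_est) ** 2))
```

In this noiseless setting the estimate is exact at the pilot positions, and the residual error comes only from interpolation between pilots. That interpolation error, together with noise sensitivity, is the gap that learned estimators aim to close.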

In addition to the OFDM systems of [18,22], the study in [58] proposes a Denoising Autoencoder (DAE) with a Gated Bernoulli–Gaussian (GBG) model to remove impulsive noise in OFDM signals. The clean data is processed by a DNN trained over 10,000 epochs to predict CSI. The approach achieves a BER of $10^{-2}$ and an MSE of 0.0024 at 20 dB SNR. Similarly, the work in [23] presents a CNN-based OFDM receiver with skip connections and QPSK modulation. It delivers BER gains over 0.17 and an SNR improvement of 2–3 dB at a BER of $1 \times 10^{-3}$. The model runs efficiently with reduced storage and is validated using the WATERMARK dataset. Despite strong performance, both models depend on specific noise assumptions and channel conditions, limiting adaptability across highly variable underwater environments.

Further advancing channel estimation, a 2D BiLSTM-based estimator using water temperature data for SSP (Sound Speed Profile) estimation is presented in [24]. With an OFDM system with 462 subcarriers and a 22.6 ms cyclic prefix, it achieves low MSE (0.02–0.14) and a BER gain equivalent to 2–3 dB of SNR at a BER of $1 \times 10^{-3}$, with 396 bps throughput. The model-based approach in [28] presents the GM-LAMP network, which builds on approximate message passing (AMP) and Gaussian mixture priors. Another deep learning-based receiver for UWA-OFDM, known as SCABNet, is presented in [59]. It uses an attention-enhanced bi-directional LSTM (AM-BiLSTM) for signal detection and a skip connection CNN (SC-CNN) for channel estimation. The model is trained on real experimental UWA channel data from the WATERMARK dataset [26] and evaluated under various conditions. By achieving lower BER, it demonstrates resilience in time-varying and frequency-selective channels but requires substantial offline training, which may demand significant computational resources. Moreover, its applicability to other underwater environments beyond the WATERMARK dataset remains unexplored.

The CWGAN (Conditional Wasserstein Generative Adversarial Network) model for channel estimation is presented in [60] using channel attention denoising (CAD). Performance evaluation using MSE, BER, and channel prediction accuracy confirms enhanced signal robustness. Nevertheless, its reliance on specific parameter configurations, computational overhead from CAD and CWGAN-GP modules, and limited environmental diversity may restrict generalization. A Bi-directional Deep Pelican Convolutional Neural Network (BDPCNN) is presented in [27]. The system’s performance is evaluated under different water conditions. The findings indicate lower BER values (0.0086–0.021) at 20 dB SNR, improved MSE-based channel estimation, and better energy efficiency compared to traditional models. However, the model is trained on simulated data, which may not fully represent real-world underwater channel variations. A Multi-Task Learning framework is presented in [32]. The framework uses a shared feature learning layer to capture multipath correlations and a task-specific head layer for refining predictions. The study compares different shared feature learning configurations. It achieves lower prediction errors in both one-step and multi-step forecasting. While it reduces computational complexity compared to single-task learning, advanced models like transformers still require significant resources.

The work reported in [30] addresses large Doppler spread, low SNR, and complex multipath propagation in UWA OFDM communication with DenseNet-based channel estimation. Owing to dense connections and feature reuse, the proposed DenseNet estimator efficiently captures complicated channel characteristics. The model is trained on WATERMARK BCH, KAU1, and KAU2 channel data [26]. Even with fewer pilot symbols, DenseNet surpasses conventional estimators in BER, MSE, and channel estimation accuracy. The study also shows the model’s flexibility across BPSK and QPSK modulation and varying environmental conditions. DenseNet’s BER improvements (up to 96.3% over LS and 94.2% over MMSE) make it a reliable UWA channel estimation solution. The model reduces pilot overhead and errors in amplitude and phase estimation.

A model-driven DL approach in [31] employs WATERMARK datasets [26] to train and test the model. Simulations demonstrate the superiority of the model in terms of achieved BER over Least Squares, MMSE, CNN-MLP (Convolutional Neural Network-Multi-Layer Perceptron), DNN, and ComNet (Communication Network). However, scalability, Doppler resilience, and hardware deployment remain unexplored. An iteration-based design in [33] achieves channel estimation with a 0.001 BER within 0.5 dB of the ideal case. However, the model requires extensive offline training and involves high computational complexity in iterative processing. Finally, a stacked CNN ResNet (S-CNN-ResNet) receiver for OTFS communications is presented in [34]. It integrates a CNN for channel feature extraction from pilot data and an enhanced ResNet for symbol recovery to improve feature learning. While the model achieves better BER performance, it requires substantial training data and processing resources, which may limit its feasibility for real-time applications.

To summarize, studies discussed in this section offer useful solutions for MC-UWA systems. Deep learning models like CNN-LSTM, DAE, BiLSTM, DenseNet, and ResNet show strong improvements in BER and MSE. Many models perform well under low SNR and complex channel conditions. However, most rely on limited datasets like WATERMARK and need heavy offline training. This affects real-time use and adaptability to new environments. Some approaches also have high computational costs. To move forward, future research should focus on simpler, scalable models. Testing under diverse underwater conditions is essential to build reliable systems.

4. Results on ML/DL-Based Adaptive Modulation

This section analyzes ML/DL techniques applied to adaptive modulation in UWA communication systems. The analysis includes an overview (Section 4.1), system characteristics (Section 4.2), and performance comparison (Section 4.3) for SC-UWA (Section 4.4) as well as MC-UWA systems (Section 4.5).

4.1. Overview of Adaptive Modulation Strategies

An overview of ML/DL techniques, optimizers, and training examples for single-carrier and multi-carrier UWA communication systems, implementing adaptive modulation, is presented in Table 10 and Table 11. The detailed descriptions of the various parameters listed in these tables were provided earlier in Section 3.1. Notably, RL methods like table lookup and Q-learning do not use typical optimizers. Instead, they follow rules from the Bellman equation to adjust learning step by step.
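The Bellman-based update that tabular Q-learning uses in place of a gradient optimizer can be sketched as follows. The state and action encoding (quantized SNR levels as states, modulation schemes as actions), the learning rate, the discount factor, and the function names are hypothetical choices for illustration and are not taken from any reviewed study.

```python
import numpy as np

def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step derived from the Bellman equation."""
    td_target = reward + gamma * np.max(q_table[next_state])
    q_table[state, action] += alpha * (td_target - q_table[state, action])
    return q_table

def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return int(rng.integers(q_table.shape[1]))
    return int(np.argmax(q_table[state]))

# Hypothetical setup: 3 quantized SNR levels, 2 modulation schemes
rng = np.random.default_rng(0)
q = np.zeros((3, 2))
q = q_update(q, state=0, action=1, reward=2.0, next_state=1)
print(q[0, 1], choose_action(q, 0, epsilon=0.0, rng=rng))
```

The table is adjusted one observed transition at a time, which is why such methods can run without a stored training dataset.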

It can be observed from Table 10 and Table 11 that SC-UWA systems utilize ML models such as SVM, KNN, LDA, BRT, MLR, MLP, CNN, and RL for measured as well as simulated datasets. In some cases, no dataset is used, particularly in RL approaches. In MC-UWA systems, adaptive modulation techniques involve models like A-kNN, CNN, and RL-based approaches. Training examples range from measured data to simulated datasets, with Bellhop being a common simulation tool. Some RL studies do not use a dataset, focusing instead on theoretical evaluations. To summarize, SC-UWA techniques employ a mix of traditional ML models and deep learning approaches and often rely on measured data, while MC-UWA techniques integrate more RL methods and simulation-based training. The diversity in optimization strategies and training datasets highlights the evolving nature of adaptive modulation in UWA communication.

4.2. Key Characteristics of Adaptive Modulation Techniques

Table 12 and Table 13 provide a comparison of the key characteristics for selected adaptive modulation techniques in SC-UWA and MC-UWA communication systems, respectively. Similar to Table 6 and Table 7, the comparisons include bandwidth, transmitter–receiver (Tx-Rx) distance, modulation schemes, system types, subcarriers, and CP attributes.

To summarize, adaptive modulation techniques in SC-UWA and MC-UWA systems offer diverse configurations suited for varying underwater conditions. SC-UWA studies explore PSK, QAM, CDMA, and FH-BFSK across 0.82–3 km distances and 4–10 kHz bandwidths, though some lack full specifications. MC-UWA systems, mainly OFDM and OTFS, utilize 32–1024 subcarriers, 5–8 kHz bandwidths, and cyclic prefixes from 64 to 400 samples. Despite robust modulation strategies, many studies omit key parameters, highlighting the need for standardized benchmarks across realistic underwater scenarios.

4.3. Comparative Analysis of Adaptive Modulation Techniques

Table 14 and Table 15 present the evaluation of systems implementing adaptive modulation, categorized into SC-UWA and MC-UWA systems, respectively. The corresponding attributes have already been explained in Section 3.3. In UWA communication, throughput reflects the average number of successfully received bits per unit time, factoring in protocol overhead and errors. Accurate channel classification ensures optimal modulation decisions, boosting data rates and reducing retransmissions. Low precision, however, leads to errors and underutilized bandwidth. Thus, the throughput values in Table 14 and Table 15 denote the accuracy of selecting the most suitable modulation scheme.

In single-carrier systems, training loss values are often not provided, with some studies using MSE or threshold-based approaches. Complexity is generally unspecified, but throughput improvements range from 3.6% to over 99% accuracy. The BER gains vary, with some studies reporting substantial improvements while others provide specific values. In multi-carrier systems, training loss values include actor and critic loss, CE, and MSE, though some studies do not specify a loss function. Complexity varies, with some models having higher processing and memory demands. Throughput improvements range from near-ideal performance to specific percentage increases, such as 4% higher accuracy. The BER gains are generally better than threshold values, with some studies maintaining a stable BER of 0.001 or reporting improvements of up to 32%.

Overall, SC-UWA systems use ML techniques like SVM, KNN, LDA, MLP, CNN, and RL. They rely on measured data, simulated data, or no dataset in the case of RL. MC-UWA systems use adaptive modulation with A-kNN, CNN, and RL. Training sources vary, and Bellhop is commonly used. RL-based approaches often focus on theoretical learning. SC-UWA focuses on real data and classical models. MC-UWA emphasizes reinforcement learning and simulation. This shows a shift toward smarter, flexible methods in UWA communication.

4.4. Discussion on Adaptive Modulation Techniques in SC-UWA Communication Systems

This section discusses the research studies targeting adaptive modulation techniques in SC-UWA communication systems. The overview, key characteristics, and performance comparison of these studies are summarized in Table 10, Table 12, and Table 14, respectively.

An ML-driven link adaptation strategy is proposed in [43]. Different ML algorithms, including SVM, KNN, pseudo-linear discriminant analysis, and boosted regression trees, are used to classify the modulation and coding scheme (MCS) by analyzing measured sea trial datasets. The boosted regression tree attains exceptional accuracy (99.97%), surpassing alternative techniques. Nonetheless, the methodology is dependent on substantial training data, and its real-time application is limited. Building on the advantages of ML-based link adaptation, the research in [36] presents an iterative learning framework for dependable link adaptation in the Internet of Underwater Things (IoUT). It utilizes multilayer regression (MLR) and multilayer perceptron (MLP) to forecast MCS and BER from various channel parameters (e.g., SNR, Delay Spread, and Frequency Spread) derived from authentic underwater datasets. The framework achieves up to 25% higher throughput than traditional SNR-based adaptive modulation. Nonetheless, there is a need for additional optimization to minimize latency in real-time IoUT implementations.
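The traditional SNR-based adaptive modulation that serves as the reference point above can be sketched as a fixed threshold table. The thresholds, the scheme list, and the function name are hypothetical and are not taken from [36] or [43]; in practice, thresholds would be derived from a target BER.

```python
# Each entry: (minimum SNR in dB, scheme name, bits per symbol).
# The threshold values are illustrative assumptions only.
MCS_TABLE = [
    (0.0, "BPSK", 1),
    (8.0, "QPSK", 2),
    (14.0, "8PSK", 3),
    (18.0, "16QAM", 4),
]

def select_modulation(snr_db):
    """Pick the highest-order scheme whose SNR threshold is satisfied."""
    chosen = None  # None means the link is too poor to transmit
    for min_snr, name, bits in MCS_TABLE:
        if snr_db >= min_snr:
            chosen = (name, bits)
    return chosen

print(select_modulation(10.0))
```

A rule of this kind uses SNR alone, whereas the ML-based schemes above also draw on features such as delay spread and frequency spread.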

Building upon the iterative ML frameworks discussed previously, the authors in [38] present a hybrid DL model that integrates CNN with Boosted Single Feedforward Layers (BSFLs) to dynamically choose among CDMA, TDMA, and OFDM modulation methods in underwater acoustic networks. The CNN collects channel features, whereas the BSFL forecasts the ideal modulation scheme, with a high accuracy of 98.6% and a 30% enhancement in BER performance relative to traditional approaches. Significant achievements encompass illustrating the efficacy of hybrid learning in dynamic underwater settings and surpassing established models such as CNN+RF and DCNN in modulation selection. The method necessitates substantial processing resources and enormous datasets for training, while its real-time applicability is constrained by significant complexity.

While the prior study employs hybrid DL, the research in [41] shifts focus to RL and presents an RL-based automatic modulation switching system designed to improve UWA communication by dynamically selecting among ASK, PSK, OFDM, and BFSK schemes according to real-time channel conditions. The reported performance figures are a 3.648% enhancement in RSSI, a 32% decrease in BER at 7 dB SNR, and a 5% increase in utility at 10 dB SNR relative to fixed FH-BFSK. Nonetheless, the system’s efficacy is constrained by the confined experimental framework (0.05–0.1 m water range), and its practical scalability has yet to be validated. Expanding on RL-based adaptive strategies, a Q-learning-based adaptive modulation scheme for shallow sea UWA communication is presented in [42]. Field experiments demonstrate that the RL approach outperforms fixed threshold and random selection methods, achieving higher throughput (14,645.3 bits) and lower BER in time-varying channels. Key contributions include the practical validation of RL in real-world UWA conditions and the demonstration of its superiority over conventional adaptive modulation strategies. However, shortcomings include limited scalability to diverse environments due to site-specific training data, potential latency in real-time decision-making, and the lack of comparison with more advanced RL algorithms beyond Q-learning.

The reviewed SC-UWA studies highlight multiple strategies for adaptive modulation using ML, DL, and RL techniques. Traditional models like SVM and boosted trees show high accuracy but depend on extensive datasets. Iterative learning with MLP and MLR improves throughput but needs faster decision-making for real-time use. Hybrid deep learning models offer strong performance in BER but involve heavy computational complexity. RL-based methods adapt to changing conditions and show promising results in shallow water trials. However, these methods often face scalability issues and require careful validation in varied underwater environments. Together, these approaches reflect an ongoing effort to balance performance, adaptability, and efficiency in underwater communication.

4.5. Discussion on Adaptive Modulation Techniques in MC-UWA Communication Systems

This section discusses the research studies targeting adaptive modulation techniques in MC-UWA communication systems. The overview, key characteristics, and performance comparison of these studies are summarized in Table 11, Table 13, and Table 15, respectively.

The A-kNN classifier in [35] employs attention mechanisms to improve selection accuracy. It enhances efficiency through principal component analysis-based dimensionality reduction and k-means clustering. The framework includes online learning for adaptability to new environments. Simulations employing real-world data from three lake experiments show that the approach outperforms model-based methods in throughput and dependability. However, the reliance on manually extracted features may limit performance, and the computational overhead of kNN for large datasets remains a challenge. Expanding on the attention-based classification approach, a proximal policy optimization (PPO)-based adaptive modulation system for OFDM is described in [37]. The approach attains near-optimal throughput, surpassing DQN and Double DQN (DDQN) in both convergence speed and stability. Drawbacks include dependence on precise channel feedback and elevated computational complexity stemming from the PPO recurrent updates.

Building on the use of PPO, the deep reinforcement learning (DRL) framework proposed in [39] employs DQN to adaptively pick modulation schemes in OFDM systems. It maximizes system throughput while maintaining acceptable BER. However, the simulation results may not fully replicate the complexity of real-world underwater environments, limiting the generalization of the findings. Extending beyond conventional OFDM-based approaches, an OTFS modulation with DL and meta-learning techniques is described in [40]. The approach addresses the challenges of rapidly varying UWA channels by using the robustness of OTFS modulation. To enhance adaptability in data-scarce environments, it integrates model-agnostic meta-learning techniques, enabling faster generalization to new and unseen underwater scenarios. Although the strategy surpasses conventional ML-based adaptive modulation, the study does not discuss computational complexity or real-time implementation problems, which may be essential for actual deployment.

The reviewed MC-UWA adaptive modulation methods show a shift toward intelligent and data-driven solutions that improve throughput and reliability in underwater settings. Attention-based classifiers like A-kNN improve adaptability and feature selection, but face challenges due to manual input and processing demands. RL frameworks such as PPO, DQN, and DDQN offer better convergence and throughput, yet require accurate feedback and high computational power. Techniques combining OTFS with meta-learning tackle channel variability and limited data but often overlook practical concerns. Many studies lack analysis of real-time performance, resource efficiency, and scalability. Future research should prioritize lightweight designs, flexible learning strategies, and broad testing for real-world deployment.

5. Results on ML/DL-Based Modulation Recognition

This section discusses the results of ML and DL techniques for modulation recognition in UWA communication systems. First, Section 5.1 outlines the methods used across selected studies. Second, Section 5.2 elaborates system characteristics for each approach. Subsequently, Section 5.3 compares their performance to show strengths and limitations. Finally, Section 5.4 discusses the contributions and impacts of these studies on modulation recognition.

5.1. Overview of Modulation Recognition Techniques

An overview of the ML/DL techniques, optimizers, and training examples for UWA communication systems is presented in Table 16. Since the same UWA modulation recognition system can recognize both single-carrier and multi-carrier modulations, we did not make separate tables for single- and multi-carrier systems. The descriptions of various parameters listed in this table were provided earlier in Section 3.1.

It can be observed from Table 16 that various models such as Sequence Convolutional Network (SCNet), RNN, CNN, ResNet, SVM, and RL are applied across different studies. Optimizers include Adam, GD, and Momentum Stochastic Gradient Descent (MSGD), while some studies do not specify an optimizer. Training examples vary, with some studies using simulated data, others relying on measured data, and a few combining both approaches. Bellhop is a common simulation tool used in some cases. The diversity in models, optimization strategies, and training datasets highlights the evolving nature of modulation recognition techniques in underwater acoustic communication. To summarize, the range of models, optimizers, and training approaches in Table 16 reflects a shift toward data-driven modulation recognition methods, emphasizing both classification accuracy and adaptability across diverse underwater scenarios.

5.2. Key Characteristics of Modulation Recognition Techniques

Table 17 provides a comparison of the key characteristics of modulation recognition (MR) techniques in a typical UWA communication system. These characteristics were already explained in Section 3.2. Table 17 shows that the bandwidths vary across studies, ranging from 1 kHz to 100 kHz, with some studies not specifying bandwidth values. Transmission distances also differ, spanning from a few meters to several kilometers, depending on the experimental setup. The modulation schemes considered include PSK, QAM, FSK, DSSS, OFDM, LFM (Linear Frequency Modulation), CW, and SSB (Single Sideband), showing a wide range of modulation types used in underwater acoustic communication. Some studies focus on a limited set of modulation schemes, while others evaluate multiple types. In summary, these MR methods span 1 kHz–100 kHz bandwidths, cover ranges from meters to over 5 km, and support diverse modulation formats, demonstrating their versatility in UWA systems.

5.3. Performance Comparison of Modulation Recognition Techniques

Table 18 presents how well modulation recognition techniques perform in UWA communication. Key metrics include training loss, complexity, accuracy, and precision. The training loss and complexity were explained earlier. Average accuracy shows the overall rate of correct predictions across all modulation types. It is calculated as the ratio of correctly predicted signals to the total number of signals tested:

\text{Accuracy} = \frac{\text{Correct Predictions}}{\text{Total Predictions}}

(3)

Precision reflects how reliable the system is when identifying a specific modulation type. It is given by

\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}

(4)

where TP is true positives and FP is false positives. A confusion matrix is used to compute these values. It shows how many predictions are correct or incorrect for each class and evaluates classifier performance in modulation recognition and other tasks.
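Equations (3) and (4) can be read directly off a confusion matrix, as the following minimal sketch shows. The three-class labeling (0 = BPSK, 1 = QPSK, 2 = OFDM), the sample predictions, and the function names are hypothetical.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns index the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def accuracy(cm):
    """Equation (3): correct predictions (diagonal) over all predictions."""
    return float(np.trace(cm) / cm.sum())

def precision_per_class(cm):
    """Equation (4): TP / (TP + FP), one value per predicted class."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0).astype(float)  # column sum equals TP + FP
    return np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)

# Hypothetical labels: 0 = BPSK, 1 = QPSK, 2 = OFDM
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
print(accuracy(cm), precision_per_class(cm))
```

Accuracy summarizes all classes in one number, while the per-class precision exposes which modulation types the classifier tends to over-predict.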

The models in Table 18 demonstrate a clear trade-off between accuracy and complexity. Lightweight classifiers (such as SVM and compact CNNs) run in under 5 ms and use fewer than 200 K parameters, yet still achieve over 98% accuracy and near-perfect precision. Mid-range hybrids and contrastive-learning networks push accuracy above 99% (with some even reaching 100%), but they incur higher computational costs. At the extreme, NAS-driven and transformer-style architectures require millions of parameters and seconds of processing time for only modest accuracy gains. Models based on deep complex networks or pure contrastive loss show lower and more variable accuracy (64–73%), reflecting sensitivity to signal conditions and feature design. Overall, these modulation recognition methods achieve high accuracy (often >90%) and precision (up to 100%). Low-complexity models enable efficient real-time use, while deep hybrid architectures deliver marginal improvements at higher computational cost, which highlights the need to balance model size, inference speed, and classification performance for real-time modulation recognition in underwater acoustic systems.

5.4. Discussion on ML/DL-Based MR Techniques in UWA Communication Systems

The following paragraphs explore the recent advancements in ML/DL techniques applied to UWA modulation recognition.

The SCNet-based scheme is presented in [49]. Its main feature is the use of 1D sequence convolutions with adaptive kernel sizes. It achieves higher recognition accuracy, faster training, and fewer parameters compared to traditional CNN and RNN models. Tested on real-world underwater data, SCNet outperforms models like LSTM, ResNet, and DenseNet. While primarily validated on simulated data, the results suggest strong potential for practical deployment in challenging underwater environments. Extending beyond convolutional models, the work in [50] combines an optimizing autoencoder (OAE) with an evaluation-enhanced K-nearest neighbors (EEKNN) algorithm. The OAE refines noisy signal features by learning their relationship to ideal ones. Meanwhile, EEKNN improves the classification process. Together, these techniques achieve up to 99.25% accuracy and very fast recognition times (3.48 ms) on real-world data from the South China Sea. While the method shows great promise, especially for practical applications, it would benefit from further testing in more diverse and real-time underwater conditions.

While the work in [50] focuses on enhancing denoising and classification, the authors in [51] propose R&CNN, which combines an RNN and a CNN to capture both the time-dependent and spatial features of acoustic signals. Tested on two real-world datasets (Trestle and South China Sea), R&CNN delivers high accuracy (up to 99.38%) and fast recognition times (7.164 ms). While the results are promising, further validation under more diverse underwater conditions is needed before confirming its practical deployment. Building further on hybrid architectures, the work in [52] introduces UWA communication modulation classifier-supervised contrastive learning (UMC-SCL). The method starts by using a lightweight CNN to filter out ocean noise. It then employs ResNet50 as a feature extractor. Tested on a mix of simulated, pool, and real ocean data, the model achieves strong performance, reaching 98.6% accuracy at 0 dB, and shows clear advantages over traditional methods. Although highly promising, further testing in more diverse underwater conditions would help confirm its robustness and generalization.

Departing from deep neural networks, in [53], the authors present a sixth-order cumulant ($C_{63}$) to effectively separate OFDM signals from PSK and FSK variants, while an enhanced bispectrum approach helps differentiate between specific modulation types like BPSK, QPSK, 2FSK, and 4FSK. The system shows impressive performance in challenging non-cooperative environments, achieving perfect recognition at 0 dB SNR in simulations and nearly 99% accuracy in real-world lake tests using a ResNet-based classifier. While effective, the method struggles in extremely noisy conditions (below −8 dB SNR), relies heavily on simulated training data, and faces computational challenges for real-time applications.
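To illustrate why a sixth-order cumulant can separate OFDM from single-carrier PSK signals, the following minimal Python sketch estimates a power-normalized $C_{63}$ from complex baseband samples. It is not the feature pipeline of [53]; the function names, the normalization by the cubed signal power, and the assumption of vanishing third-order moments (valid for symmetric constellations) are choices made here for illustration. In theory, the normalized value is 16 for BPSK, 4 for QPSK, and 0 for a complex Gaussian process, which an OFDM signal with many subcarriers approximates.

```python
import math
import random

def moment(x, p, q):
    """Sample mixed moment M_pq = E[x^(p-q) * conj(x)^q]."""
    return sum(v ** (p - q) * v.conjugate() ** q for v in x) / len(x)

def c63(samples):
    """Estimate the normalized sixth-order cumulant cum(x, x, x, x*, x*, x*).

    Assumption: third-order moments are negligible (symmetric constellation),
    so only the second-, fourth-, and sixth-order moments are used.
    """
    mean = sum(samples) / len(samples)
    x = [complex(v) - mean for v in samples]          # remove any DC offset
    m20, m21, m22 = moment(x, 2, 0), moment(x, 2, 1), moment(x, 2, 2)
    m41, m42, m43 = moment(x, 4, 1), moment(x, 4, 2), moment(x, 4, 3)
    m63 = moment(x, 6, 3)
    c = (m63 - 9 * m21 * m42 - 3 * m20 * m43 - 3 * m22 * m41
         + 12 * m21 ** 3 + 18 * abs(m20) ** 2 * m21)
    return (c / m21 ** 3).real                        # scale-invariant value

# Ideal constellations and a Gaussian stand-in for an OFDM-like signal.
qpsk = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]
bpsk = [1.0, -1.0]
rng = random.Random(0)
gaussian = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50000)]
print(c63(bpsk), c63(qpsk), c63(gaussian))
```

A simple threshold on this value would already distinguish a Gaussian-like multi-carrier signal from PSK, although noise and multipath shift the estimates in practice.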

While cumulant and bispectrum-based methods offer mathematical precision, [54] presents an edge-enabled adaptive modulation framework for the Internet of Underwater Things (IoUT) that leverages network pruning and ensemble learning (EL) to balance computational efficiency and accuracy. The key contributions include (1) a novel CSI dataset for six modulation schemes, enhancing feature representation under noise; (2) a Taylor expansion-based pruning criterion to reduce redundant CNN parameters while maintaining performance; (3) an EL strategy to compensate for accuracy loss post-pruning, achieving 93.4% accuracy at 5 dB SNR; and (4) successful deployment on edge devices like NVIDIA Jetson TX2, demonstrating practical feasibility. However, the framework struggles with SNR below 0 dB, where feature extraction becomes unreliable, and its reliance on simulated data may limit real-world adaptability in highly dynamic underwater environments.
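The idea behind a Taylor expansion-based pruning criterion can be sketched as follows. The exact criterion used in [54] is not detailed in this review, so the code assumes the first-order form commonly used in the pruning literature, in which a channel's importance is the magnitude of the batch-averaged product of its activation and the loss gradient with respect to that activation. The function names and the toy numbers are hypothetical.

```python
def taylor_importance(activations, gradients):
    """First-order Taylor score per channel: |mean over samples of a * dL/da|.

    activations, gradients: one list per sample, one value per channel
    (e.g., spatially averaged feature maps and their loss gradients).
    """
    n_samples = len(activations)
    n_channels = len(activations[0])
    scores = []
    for c in range(n_channels):
        total = sum(activations[s][c] * gradients[s][c] for s in range(n_samples))
        scores.append(abs(total / n_samples))
    return scores

def channels_to_prune(scores, fraction):
    """Return the indices of the lowest-scoring channels (given fraction)."""
    k = int(len(scores) * fraction)
    order = sorted(range(len(scores)), key=lambda c: scores[c])
    return sorted(order[:k])

# Toy example: 2 samples, 4 channels (values are illustrative only).
acts = [[1.0, 0.1, 2.0, 0.0], [1.0, -0.1, 2.0, 0.0]]
grads = [[0.5, 0.2, -1.0, 3.0], [0.5, 0.2, -1.0, 3.0]]
scores = taylor_importance(acts, grads)
print(channels_to_prune(scores, 0.5))
```

Channels whose removal is predicted to change the loss the least are pruned first; in a framework such as the one described, the accuracy lost at this step is what the ensemble stage is meant to recover.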

In contrast to real-valued pruning and ensemble techniques, the paper [55] proposes a novel adaptive modulation method for UWA communication signals utilizing deep complex networks (DCNs). Their key contribution lies in developing a DCN architecture that directly processes complex-valued UWA signals, effectively capturing amplitude and phase information, which is crucial given the complex channel impairments in UWA environments. This approach aims to improve classification accuracy and robustness compared to traditional real-valued neural networks that often discard or separately process the complex nature of the signals. A potential shortcoming, though not explicitly detailed in the abstract, could be the computational complexity associated with DCNs, especially when deployed in resource-constrained underwater environments, and the need for extensive training data specific to diverse UWA channel conditions.

Exploring hardware-integrated solutions, in [56], the authors present a support vector machine (SVM)-powered underwater acoustic modem that integrates continuous wavelet transform (CWT)-based feature extraction and FPGA-based signal processing to enhance UWA communication. Key contributions include (1) a novel system architecture combining SVM for signal classification and CWT for robust feature extraction, achieving 98.28% accuracy at 5 dB SNR; (2) the introduction of a transitional “C” symbol to mitigate spectrum leakage and improve demodulation reliability; and (3) FPGA implementation for real-time processing, demonstrating a stable 10,000 baud rate with zero BER under controlled conditions. However, the computational complexity of CWT and SVM could pose challenges for low-power edge devices.

Returning to hybrid deep learning models, the paper in [46] proposes a hybrid neural network model, S&SEFM, that combines SqueezeNet and SENet for modulation recognition in UWA communication. It introduces multi-attribute features, namely the wavelet time–frequency (WTF) spectrum, the square power spectrum, and cyclic spectrum contour maps, to mitigate the limitations of single-feature methods, and it employs multi-scale feature fusion to enhance recognition accuracy. The model demonstrates strong generalization across different UWA channels and robustness against Doppler shift, achieving high recognition rates in both simulated and sea trial data. However, the paper does not address computational complexity in real-time applications or the model’s performance in extremely low SNR conditions.

Continuing the pursuit of multi-scale feature extraction, Wang et al. [47] make several contributions: a data augmentation method that increases data sevenfold to address small sample size issues, a novel “microscale” concept for rationalizing UWA signals into time series, and the “One2Three block” temporal feature extractor designed to extract features from three microscales. Additionally, they propose a “Dual-Stream SE block” as a spatial feature extractor to synthesize advanced spatial features. The method’s effectiveness is validated on real-world datasets from the South China Sea and the Yellow Sea, demonstrating promising recognition accuracy for eight common UAC modulation modes. The computational complexity introduced by the multi-microscale feature extraction and the dual-stream architecture might be a concern for real-time deployments.

Taking a step further into attention mechanisms, a two-stream transformer (TSTR)-based network is proposed in [48], aiming to overcome the challenges of complex UWA channels and severe ocean noise. Their key contributions include an input preprocessing layer that extracts I/Q and time–frequency features, a feature capture layer (FCL) for extracting high-dimensional signal features across time, frequency, and time–frequency domains, and a classification layer for modulation estimation. A notable innovation is the use of a multihead self-attention module with adaptive soft thresholding within the FCL to handle noise and varying feature characteristics. The computational intensity of the transformer architecture, particularly the multihead self-attention, might pose challenges for real-time implementation on resource-constrained UWA platforms.

To automate and optimize model design, the authors in [44] apply neural architecture search (NAS) and introduce a feature fusion method that combines time–frequency and cyclic spectrum features with an attention mechanism to enhance phase-modulated signal recognition. Focusing on enhancing representational diversity through feature fusion, the work in [9] proposes a multi-scale feature fusion hybrid model (HM) for UWA communication signal modulation recognition. Its primary contribution is the integration of the Gramian angular field (GAF), Markov transition field (MTF), and recurrence plot to fuse time and frequency domain features into low-dimensional representations. This approach demonstrates superior recognition accuracy (94.31% in lake trials) as validated by comparative experiments. A potential shortcoming could be the computational overhead associated with generating and fusing these multiple feature representations, which might impact real-time deployment in resource-constrained UWA systems.
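To make the image-style encodings mentioned above more concrete, the sketch below builds two of them for a short real-valued series: a Gramian angular (summation) field and a binary recurrence plot. It follows the standard textbook definitions rather than the specific preprocessing of [9]; the Markov transition field is omitted for brevity, and the function names and the threshold `eps` are assumptions made for illustration.

```python
import math

def rescale(series):
    """Min-max rescale a real-valued series to [-1, 1] (requires max > min)."""
    lo, hi = min(series), max(series)
    return [2 * (v - lo) / (hi - lo) - 1 for v in series]

def gramian_angular_field(series):
    """Gramian angular summation field: G[i][j] = cos(phi_i + phi_j),
    where phi_i = arccos of the rescaled sample i."""
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in rescale(series)]
    return [[math.cos(a + b) for b in phi] for a in phi]

def recurrence_plot(series, eps):
    """Binary recurrence plot: R[i][j] = 1 if |x_i - x_j| <= eps, else 0."""
    return [[1 if abs(a - b) <= eps else 0 for b in series] for a in series]

x = [0.0, 1.0, 2.0, 1.0]
gaf = gramian_angular_field(x)
rp = recurrence_plot(x, eps=0.5)
```

Both outputs are N×N matrices for an N-sample input, which explains the overhead noted above: the cost of generating each representation grows quadratically with the segment length before any network is applied.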

Finally, complementing deep learning with signal processing and channel estimation, Yang et al. (2024) [45] propose a modulation classification method for non-cooperative UWA communication by integrating channel estimation to mitigate signal distortion caused by multipath fading and environmental noise. It leverages higher-order cumulants as features and employs various classifiers (e.g., SVM, GBDT, and XGBoost) to validate the approach, demonstrating improved recognition accuracy after signal restoration. The method is tested on both simulated and real-world datasets, showing robustness across different SNR conditions. The method’s performance depends on the accuracy of channel estimation, which can be compromised by measurement errors.

In the reviewed studies, DL models such as SCNet, R&CNN, and UMC-SCL show high accuracy and low latency but often require large datasets and raise complexity concerns. Signal-processing techniques using cumulants and bispectrum offer precision, though they may struggle under noisy conditions. Edge-focused approaches balance speed and efficiency but face limitations at low SNR. Advanced architectures like DCNs, transformer-based models, and NAS improve feature handling but add computational overhead. Hybrid models combining feature fusion and attention mechanisms perform well on real data. Overall, these methods reflect progress toward accurate adaptive modulation recognition, though real-time deployment and environmental generalization remain key challenges.

6. Challenges and Future Research Directions

This article has shown that several ML and DL methods have been successfully applied to UWA systems for channel estimation, adaptive modulation, and modulation recognition. Although these methods have demonstrated promising performance, several open challenges and future research directions remain. Based on the studies in [9,20,23,27,28,29,30,31,33,34,36,38,40,45,46,47,53,55,59,61], the following key challenges and future research directions have been identified.

6.1. Challenges in ML/DL-Based UWA Communication

The following discussion outlines the key issues that currently limit the scalability, generalization, and fair assessment of ML/DL-driven UWA systems. Consequently, it highlights areas that warrant deeper investigation to support reliable implementation.

Computational Complexity and Real-Time Processing: ML/DL models such as DenseNet and transformer-based architectures [30,31] demand intensive computational resources during both training and inference phases. This complexity restricts their deployment in underwater environments, particularly on resource-limited autonomous underwater vehicles (AUVs) and mobile nodes. The real-time execution of such models remains problematic due to constraints on onboard memory and processing throughput. Furthermore, GAN-based simulators [29] require expansive datasets and computational power to realistically emulate channel dynamics. These challenges are amplified in scenarios where real-time Doppler compensation is critical for high-speed mobile platforms.
Limited Training Data and Generalization: Robust ML/DL-based UWA communication models require comprehensive labeled datasets to generalize effectively across diverse environments. However, collecting underwater data is challenging due to fluctuating parameters such as temperature, salinity, and pressure [20,27,45]. These environmental variations often lead to poor generalization across geographic locations and seasonal conditions. Hybrid CNN-RNN models frequently struggle to transfer learned representations reliably. Physics-Informed Neural Networks offer a promising direction by incorporating environmental constraints into learning pipelines [34]. Nevertheless, these models rely on specialized datasets that accurately reflect underwater propagation dynamics, and such datasets remain scarce.
Multipath Propagation and Doppler Effects: Acoustic signals often encounter severe multipath effects in underwater channels. These reflections cause phase distortion and signal fading [23,45,59]. The rapid movement of AUVs and underwater robots introduces Doppler shifts. These shifts complicate synchronization and modulation decoding. Attention mechanisms and BiLSTM architectures improve temporal adaptation. However, they still struggle with real-time compensation under extreme Doppler conditions. CNN-LSTM models offer better multipath tracking. Yet, they face limitations in maintaining synchronization when channel conditions change rapidly.
Energy Efficiency and Hardware Constraints: Underwater communication nodes often rely on limited battery power. This makes energy efficiency a top priority [28]. High-complexity models, such as deep CNNs, consume substantial energy. They are not ideal for long-duration missions. Lightweight and pruned models reduce energy demands [47,54]. However, their accuracy may decline due to reduced model capacity. Deploying DRL models on low-power acoustic modems remains challenging. These models require optimization in compression and hardware-aware training. Moreover, resource allocation strategies that balance bandwidth, communication distance, and data rate are underexplored. This area offers important opportunities for future research.
Fragmentation in Performance Metrics: Performance evaluation in UWA research is highly inconsistent. Different studies report metrics like BER, Accuracy, and MSE. However, they do so under varying conditions. These variations include differences in SNR ranges, acoustic channel models, datasets, and modulation schemes. There is no standard protocol for testing or reporting. As a result, comparing models becomes difficult. A technique may seem better simply due to the evaluation setup, not because it truly performs better. Many studies also lack baseline comparisons with conventional methods. This makes it hard to judge how much ML/DL methods actually improve performance. Without consistency, performance trends across tasks are unclear. It becomes impossible to rank algorithms or choose them for real-world deployment. The absence of unified benchmarks and reporting practices weakens the interpretability of results. It also limits reproducibility and cross-study insights, slowing progress in UWA communications.
Robustness Challenges in Real-World UWA Environments: Although several ML/DL models report high accuracy and BER reduction, they often perform under constrained experimental conditions. There is limited testing under unknown or changing channel conditions, including variations in Doppler spread, delay, ambient noise, or multipath severity. The absence of robustness testing, especially sensitivity to parameter perturbations, impedes real-world applicability. Furthermore, most studies do not assess generalization across different acoustic platforms or frequency bands, making cross-modem performance unreliable. The lack of domain-shift resilience and scarce labeled datasets further restricts adaptive performance across environments.
Algorithmic Constraints vs. UWA Deployment Realities: ML/DL models for UWA often demand substantial computation and high-quality training data, but these needs conflict with the strict energy budgets and latency constraints of real deployments—especially battery-powered AUVs and embedded sensors. Many architectures lack support for rapid physical-layer inference like low-latency channel estimation, which is vital for coherent demodulation. Sensitivity to assumptions like fixed Doppler profiles and ideal noise levels further amplifies the Sim2Real gap, as sudden interference or biological noise causes unpredictable degradation. Without optimization for real-time and low-power settings, these models remain difficult to deploy in mutable marine environments.
Adaptability Limits in Time-Varying UWA Channels: Many reviewed models lack dynamic adaptability when faced with fast time-varying underwater channels. Traditional architectures are often statically trained and underperform as real-world conditions evolve, especially under abrupt changes in Doppler spread or delay characteristics. The absence of real-time update mechanisms—such as online learning, incremental learning, or meta-learning—prevents responsive adaptation, contributing to instability in unpredictable environments. These constraints inhibit timely physical-layer decisions, degrade modulation accuracy, and ultimately affect system reliability.

Addressing these challenges is crucial for advancing AI-driven UWA communication systems. Future research should focus on developing lightweight and adaptive models, enhancing the generalization of techniques, improving robustness against multipath and Doppler effects, and designing energy-efficient algorithms for deployment in underwater networks. By overcoming these limitations, AI-powered underwater communication can become more reliable, scalable, and efficient, supporting applications such as marine research, deep-sea exploration, and naval security.

6.2. Future Research Directions

Future research directions can be categorized into the following key areas:

Development of Lightweight ML/DL Models [9,38,47]: As underwater communication devices operate in resource-constrained environments, optimizing models for real-time edge computing is crucial. Current architectures such as deep CNNs and RNNs often require extensive computational power, making deployment on low-power underwater sensors challenging.
– Future research should focus on knowledge distillation, a technique that transfers knowledge from larger, complex models to smaller, more efficient models while retaining performance.
– Quantization techniques should be explored to reduce model precision requirements, allowing ML/DL systems to run efficiently on energy-limited platforms.
– Incorporating pruning methods can further reduce computational demands by eliminating redundant parameters in deep learning models.
Integration of Physics-Informed Neural Networks (PINNs) [33,61]: PINNs combine traditional ML/DL approaches with domain-specific physics principles, enabling more accurate channel estimation and modulation schemes in underwater environments.
– Future hybrid PINN-ML architectures can improve adaptive modulation by integrating wave propagation models into training data, reducing error rates caused by environmental distortions.
– By employing PINNs, AI models can learn underwater acoustic behaviors, compensating for multipath propagation, Doppler effects, and fluctuating water conditions.
Standardized Evaluation and Visual Benchmarking:
– Future research must prioritize the creation of standardized evaluation platforms. These platforms should include unified protocols for SNR ranges, channel models, and benchmark datasets. Comparative baselines (such as MMSE and shallow ML models) should also be embedded to provide clear reference points.
– Performance outcomes such as BER, complexity, latency, and accuracy should be visualized using box plots, radar charts, and distribution diagrams, aggregated across models tested under shared conditions (e.g., same SNR, same dataset). This will enable researchers to draw meaningful comparisons and detect outliers or architectural trends more effectively. Such efforts will foster reproducibility, streamline cross-model evaluations, and guide researchers toward truly robust and scalable solutions.
Advancements in Reinforcement Learning for Adaptive Modulation [31,40]:
– Adaptive modulation techniques allow communication systems to dynamically adjust transmission parameters based on real-time channel conditions.
– RL-based methods such as Q-learning and Deep Q-Networks (DQN) optimize transmission power and modulation schemes to maximize throughput while minimizing BER.
– Meta-learning approaches can further speed up model adaptation, enabling systems to rapidly respond to varying underwater environments without requiring extensive retraining.
Improved Modulation Recognition Techniques [46,52]: Effective modulation recognition is essential for signal demodulation in UWA communication systems.
– Contrastive learning-based models enhance classification accuracy in low-SNR conditions, distinguishing modulation patterns in noisy underwater environments.
– Hybrid CNN-RNN architectures provide better temporal and spatial feature extraction, improving modulation classification accuracy under extreme multipath interference.
Expansion to Novel Modulation Schemes [36,53]: Current ML/DL applications largely focus on conventional single-carrier systems and OFDM-based techniques.
– Future research should explore index modulation, which improves spectral efficiency by encoding information in the position of active subcarriers.
– Non-Orthogonal Multiple Access can enhance capacity by allowing multiple users to share the same frequency resource, increasing efficiency in underwater communication.
– Generalized Frequency Division Multiplexing provides flexible subcarrier arrangements that mitigate interference, improving performance in dynamic underwater environments.
Transfer Learning, Domain Adaptation, and Cross-Platform Resilience: Future research should include structured robustness evaluation frameworks to test model behavior against channel perturbations and unpredictable underwater dynamics. Techniques such as transfer learning (TL) and domain adaptation (DA) offer promising solutions to improve cross-platform generalization. TL can help reuse pretrained representations in new environments with limited data, while DA adapts models across datasets with different distributions. Benchmarking models across different modem hardware and frequency bands will also be essential for scaling up reliable, AI-enabled underwater acoustic systems.
Compact Models and Real-Time Strategies: To support edge and federated UWA computing, future work must address model size and execution speed. Compression techniques like pruning, quantization, and distillation help reduce power consumption and memory load [63]. These are vital for low-resource devices in remote underwater environments. For latency-sensitive tasks, streamlined feature sets and efficient computational graphs can minimize delays. FPGA-based processing and AI chips offer real-time capability while conserving energy. Clear complexity-performance trade-offs should guide model selection for embedded use. In federated setups, lightweight models allow distributed training and updates without overwhelming edge devices. Together, these strategies make learning-based systems feasible for scalable and responsive underwater networks.
Learning-on-the-Fly and Intelligent Resource Allocation: Future UWA systems should explore real-time learning paradigms—including online, incremental, and meta-learning techniques—to enable continuous model refinement during deployment. Moreover, intelligent resource allocation remains underexplored: RL and game-theoretic frameworks offer potential for optimizing spectrum usage, power control, and adaptive modulation/coding under varying bandwidth-distance constraints. Integrating these strategies with channel estimation and feedback loops can lead to joint optimization pipelines that maximize throughput and resilience in complex marine networks.
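As a concrete illustration of the quantization direction listed above, the following sketch applies symmetric per-tensor 8-bit quantization to a list of weights and measures the reconstruction error. It is a simplified example rather than the scheme of any reviewed study; the function names and the sample weights are hypothetical, and practical toolchains typically add per-channel scales and calibration data.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0   # step size in weight units
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the stored integers back to approximate float weights."""
    return [v * scale for v in q]

weights = [0.42, -1.0, 0.25, 0.0]                   # illustrative values only
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored in one byte instead of four, and the worst-case rounding error is about half a quantization step, which is the accuracy cost that must be weighed against the memory and energy savings on underwater nodes.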

These future research directions highlight the need for efficient, scalable, and adaptive models to enhance underwater acoustic communication systems. By addressing these challenges, AI-driven methodologies will pave the way for more reliable, energy-efficient, and intelligent UWA communication networks, supporting critical applications such as deep-sea exploration, naval security, and marine research.

7. Responses to Formulated Research Questions

This section addresses the research questions formulated in this article by evaluating the effectiveness of ML-based and DL-based techniques in UWA communication.

7.1. RQ1: How Do ML/DL Techniques Improve Channel Estimation in UWA Communication, and What Are the Key System Characteristics and Performance Metrics of These Methods?

ML and DL techniques have revolutionized channel estimation in UWA communication by improving signal prediction, reducing BER, and enhancing communication reliability. Insights drawn from Table 4 and Table 5 show that various ML and DL methods, including LR, DBN, LSTM, and CNN, are applied in both SC-UWA and MC-UWA systems. SC-UWA systems primarily utilize simpler ML models like LR and LSTM, whereas MC-UWA systems employ more complex architectures such as CNN-LSTM hybrids, DNNs, and GANs, indicating that multi-carrier systems require advanced learning techniques to manage signal distortions.

The key characteristics of channel estimation techniques, as presented in Table 6 and Table 7, highlight the differences between SC-UWA and MC-UWA systems in terms of bandwidth, transmission distance, modulation schemes, system configurations, subcarriers, and CP. SC-UWA systems typically operate within a bandwidth range of 1–25 kHz and support transmission distances from 0.2 km to 3 km, with modulation schemes mainly limited to BPSK and QPSK. In contrast, MC-UWA systems feature bandwidths extending up to 37.5 kHz, with transmission distances reaching 5 km, utilizing more robust modulation schemes like OFDM, AFDM, and QAM. Additionally, MC-UWA systems integrate subcarriers ranging from 512 to 1024, with CP values between 16 and 256 samples to combat multipath interference.

Table 8 and Table 9 compare training loss, computational complexity, and BER performance across different ML/DL-based channel estimation techniques. Advanced models such as LSTM-based predictors and CNN-enhanced estimators achieve MSE values as low as $10^{-4}$, indicating superior accuracy in channel estimation. While some models, such as DenseNet-based estimators, demand high computational resources, pruned CNN architectures significantly lower complexity, making them more viable for real-time applications. BER improvements of up to 8 dB at $10^{-3}$ BER demonstrate the effectiveness of AI-driven channel estimation in enhancing communication reliability.
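For readers less familiar with the two metrics compared above, the following minimal sketch shows how they are typically computed: the MSE between true and estimated complex channel coefficients, and the BER as the fraction of bits received in error. The helper names and sample values are hypothetical, and individual studies may normalize the MSE differently (for example, by the channel energy).

```python
def channel_mse(h_true, h_est):
    """Mean squared error between true and estimated channel coefficients."""
    return sum(abs(a - b) ** 2 for a, b in zip(h_true, h_est)) / len(h_true)

def bit_error_rate(bits_tx, bits_rx):
    """Fraction of transmitted bits that were received incorrectly."""
    errors = sum(1 for a, b in zip(bits_tx, bits_rx) if a != b)
    return errors / len(bits_tx)

# Illustrative values: a two-tap channel and a four-bit frame.
h_true = [1 + 1j, 0.5j]
h_est = [1 + 1j, 0.1 + 0.5j]
mse = channel_mse(h_true, h_est)
ber = bit_error_rate([0, 1, 1, 0], [0, 1, 0, 0])
print(mse, ber)
```

Because both quantities depend on the SNR, channel model, and frame length used, values from different studies are only comparable when those conditions match.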

7.2. RQ2: How Do ML/DL Techniques Improve Adaptive Modulation in UWA Communication, and What Are the Key System Characteristics and Performance Metrics of These Methods?

ML and DL techniques have significantly improved adaptive modulation in UWA communication by dynamically adjusting transmission parameters based on real-time conditions. Insights from Table 10 and Table 11 show that various methods, including SVM, RL, DBN, and CNNs, are applied in both SC-UWA and MC-UWA systems. SC-UWA systems primarily utilize classification-based ML/DL models such as SVM and CNN, whereas MC-UWA systems employ RL-based models to optimize modulation schemes in response to channel variations.

The key characteristics of adaptive modulation techniques, as presented in Table 12 and Table 13, highlight the differences between SC-UWA and MC-UWA systems in terms of bandwidth, transmission distance, modulation schemes, system configurations, subcarriers, and cyclic prefix. SC-UWA systems typically operate within a 5–10 kHz bandwidth and support transmission distances from 0.82 km to 3 km, with modulation schemes including PSK, QAM, and FSK. In contrast, MC-UWA systems feature bandwidths extending up to 25 kHz, with transmission distances reaching 5 km, utilizing OFDM-based modulation with subcarriers ranging from 32 to 1024 and CP values between 64 and 400 samples to enhance robustness against multipath interference.

Table 14 and Table 15 compare the training loss, computational complexity, throughput, and BER performance across different ML/DL-based adaptive modulation techniques. Advanced models such as RL-based adaptive modulation and CNN-enhanced classifiers achieve classification accuracies exceeding 98%, demonstrating superior adaptability to underwater channel variations. While some models, such as meta-learning-based adaptive modulation, demand higher computational resources, pruned CNN architectures significantly lower complexity, making them more viable for real-time applications. BER improvements of up to 14.8% and throughput gains of 25% indicate the effectiveness of adaptive modulation in enhancing communication reliability.
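The RL-based selection of a modulation scheme can be illustrated with a deliberately small tabular example. The sketch below is not taken from any reviewed study: the two SNR states, the packet success probabilities, and the reward (bits delivered per symbol) are hypothetical, and the discount factor is set to zero because the chosen modulation is assumed not to influence the next channel state, which reduces Q-learning to a contextual bandit with a running-average update.

```python
import random

MODS = ["BPSK", "QPSK", "16QAM"]
BITS = [1, 2, 4]                      # bits per symbol for each modulation
# Hypothetical packet success probabilities per SNR state (illustrative only).
SUCCESS = {"low": [0.95, 0.30, 0.02], "high": [0.99, 0.97, 0.90]}

def train(steps=20000, epsilon=0.1, seed=0):
    """Epsilon-greedy learning of Q[state][action] = expected bits delivered."""
    rng = random.Random(seed)
    q = {s: [0.0] * len(MODS) for s in SUCCESS}
    visits = {s: [0] * len(MODS) for s in SUCCESS}
    for _ in range(steps):
        state = rng.choice(["low", "high"])           # observed channel state
        if rng.random() < epsilon:                    # explore
            action = rng.randrange(len(MODS))
        else:                                         # exploit current estimate
            action = max(range(len(MODS)), key=lambda a: q[state][a])
        delivered = rng.random() < SUCCESS[state][action]
        reward = BITS[action] if delivered else 0.0
        visits[state][action] += 1
        # Q-update with discount 0 and step size 1/n (running mean of rewards).
        q[state][action] += (reward - q[state][action]) / visits[state][action]
    return q

def best_modulation(q, state):
    """Greedy modulation choice for a given SNR state."""
    return MODS[max(range(len(MODS)), key=lambda a: q[state][a])]

q_table = train()
print(best_modulation(q_table, "low"), best_modulation(q_table, "high"))
```

Under these assumed probabilities, the expected reward favors the most robust scheme in the low-SNR state and the highest-order scheme in the high-SNR state; deep RL variants replace the table with a network so that continuous channel features can be used as the state.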

7.3. RQ3: How Effective Are ML/DL-Driven Modulation Recognition Approaches in Identifying Modulation Schemes Under Complex Underwater Conditions, and What Are Their Strengths and Limitations?

Insights from Table 16, Table 17 and Table 18 highlight the effectiveness of various ML and DL techniques, system configurations, and performance metrics in modulation recognition.

Table 16 provides an overview of the ML and DL techniques used for modulation recognition, including CNNs, ResNet, SVM, DCN, and hybrid architectures. These models use feature extraction, sequence modeling, and contrastive learning to distinguish modulation types under challenging underwater conditions. The training examples used in these studies include both simulated and measured data, ensuring robustness across different environments.

Table 17 presents the key characteristics of modulation recognition systems, including bandwidth, transmission distance, and modulation schemes. The bandwidths range from 1 kHz to 30 kHz, with transmission distances varying from 100 m to 5 km. The modulation schemes considered include PSK, QAM, FSK, DSSS, OFDM, and LFM, demonstrating the versatility of models in recognizing diverse signal types.

Table 18 compares the training loss, computational complexity, classification accuracy, and precision across different ML/DL-based modulation recognition techniques. Advanced models such as CNN-RNN hybrids and transformer-based architectures achieve classification accuracies exceeding 95%, demonstrating superior performance over traditional feature-based methods. While some models, such as ensemble learning-based classifiers, demand higher computational resources, pruned CNN architectures significantly lower complexity, making them more viable for real-time applications. Precision values range from 40% to 100%, indicating that the ability to minimize false classifications varies considerably across methods.
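Since accuracy and precision are reported side by side in Table 18, the following sketch shows how both are derived from a confusion matrix. The convention (rows are true classes, columns are predicted classes), the function names, and the two-class example are assumptions made for illustration.

```python
def accuracy(cm):
    """Overall accuracy: correctly classified samples over all samples."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def precision_per_class(cm):
    """Per-class precision: true positives over all samples predicted as that class.

    cm[i][j] counts samples of true class i that were predicted as class j.
    """
    result = []
    for j in range(len(cm)):
        predicted = sum(cm[i][j] for i in range(len(cm)))
        result.append(cm[j][j] / predicted if predicted else 0.0)
    return result

# Illustrative two-class example (e.g., BPSK vs. QPSK).
cm = [[8, 2],
      [1, 9]]
acc = accuracy(cm)
prec = precision_per_class(cm)
print(acc, prec)
```

A model can reach high overall accuracy while one class still has low precision, which is one reason the two columns in such comparisons can diverge.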

7.4. RQ4: What Innovative Approaches and Emerging Trends in Machine/Deep Learning Can Be Employed to Address Unresolved Challenges in Underwater Acoustic Communication, and How Can These Advancements Shape the Future of Intelligent, Efficient, and Scalable UWA Systems?

ML and DL have significantly advanced UWA communication, yet key limitations persist as highlighted in Section 6.1. These include high computational complexity, limited training data, poor generalization across diverse environments, energy constraints on edge devices, fragmented performance metrics, and insufficient robustness under extreme Doppler and multipath conditions. To address these challenges, Section 6.2 outlines several promising directions for future research. First, lightweight and real-time models must be developed for resource-constrained platforms. Physics-Informed Neural Networks offer a pathway to improve generalization by embedding wave propagation dynamics directly into the learning process. Reinforcement learning can significantly enhance adaptive modulation in time-varying underwater channels. Beyond conventional OFDM systems, expanding AI applications to emerging modulation schemes such as index modulation, non-orthogonal multiple access, and generalized frequency division multiplexing can further optimize spectral efficiency. Visual benchmarking strategies, including radar charts and box plots, are also recommended to consolidate performance evaluations across accuracy, BER, latency, and complexity. Lastly, integrating learning-on-the-fly paradigms such as online and incremental learning, alongside RL-based resource allocation, will enable real-time model refinement and dynamic spectrum optimization.

8. Limitations of Research

Although this SLR has been conducted following established guidelines [12] and strictly adhering to the developed review protocol, certain limitations remain:

Search Exhaustiveness: We have selected precise search terms and screened results thoroughly. However, some queries returned thousands of articles, making full review difficult. Several studies were excluded based on titles alone, which may have led to missed relevant work. Therefore, this review does not claim complete exhaustiveness.
Database Selection: This SLR uses four major databases: IEEE, Elsevier, Springer, and Google Scholar. These sources ensure access to high-quality journals. However, some relevant studies from other databases may be missing. This could affect coverage of very recent research. Still, the selected databases offer a strong and representative view of current UWA communication advances.
Scope of Inclusion Criteria: The inclusion criteria were designed to focus on studies published between 2020 and 2025. While this ensures relevance to recent advancements, older foundational works that may still hold significance were excluded. Future research could incorporate a broader time frame to capture historical developments in machine/deep learning applications for underwater acoustic communication.
Generalization of Findings: The findings of this SLR are based on selected studies that met predefined criteria. While efforts were made to ensure a balanced representation of methodologies, there is a possibility that certain niche approaches or emerging techniques were underrepresented. Further studies incorporating additional perspectives may enhance the comprehensiveness of future reviews.

Despite these limitations, we believe that the core findings of this SLR provide valuable insights into the role of machine/deep learning in underwater acoustic communication and serve as a strong foundation for future research.

9. Conclusions

This systematic literature review has provided a structured evaluation of ML and DL techniques in UWA communication, focusing on four key research questions. For RQ1, ML and DL models enhance channel estimation by improving signal prediction and reducing BER. Techniques such as CNNs, LSTMs, and GANs adapt well to underwater variations, achieving MSE values as low as $10^{-4}$. Regarding RQ2, adaptive modulation powered by ML and DL dynamically optimizes transmission parameters. RL and meta-learning approaches yield BER reductions of up to 14.8% and throughput gains of 25%. For RQ3, ML/DL-driven modulation recognition achieves classification accuracies exceeding 95%. CNN-RNN hybrids, transformers, and contrastive learning models improve signal detection, though precision variability (40–100%) under low-SNR and high-Doppler conditions necessitates refinements. RQ4 highlights future research directions, including the development of lightweight and real-time ML/DL models, the integration of Physics-Informed Neural Networks, and reinforcement learning-based adaptive modulation strategies. Additionally, hybrid deep learning architectures using multi-scale feature fusion, data augmentation, and domain adaptation are promising for enhancing modulation recognition accuracy in noisy and dynamic underwater environments. This review provides a foundation for future AI-driven UWA communication advancements, unlocking new possibilities in deep-sea exploration, naval security, and maritime connectivity.
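For readers who want to reproduce the two headline metrics quoted above, the following minimal Python sketch shows one common way to compute the channel-estimation MSE and the BER. The channel taps and bit sequences are hypothetical values chosen for illustration; they are not data from any reviewed study.

```python
def mse(h_true, h_est):
    """Mean squared error between true and estimated (complex) channel taps."""
    if len(h_true) != len(h_est):
        raise ValueError("tap vectors must have the same length")
    return sum(abs(a - b) ** 2 for a, b in zip(h_true, h_est)) / len(h_true)


def ber(tx_bits, rx_bits):
    """Bit error rate: fraction of received bits that differ from the sent bits."""
    if len(tx_bits) != len(rx_bits):
        raise ValueError("bit sequences must have the same length")
    errors = sum(a != b for a, b in zip(tx_bits, rx_bits))
    return errors / len(tx_bits)


# Hypothetical three-tap channel and its estimate.
h_true = [1 + 0j, 0.5j, -0.25 + 0j]
h_est = [1.01 + 0j, 0.49j, -0.25 + 0.01j]
channel_mse = mse(h_true, h_est)      # approximately 1e-4

# Hypothetical transmitted and received bits (one bit flipped).
tx = [0, 1, 1, 0, 1, 0, 0, 1]
rx = [0, 1, 0, 0, 1, 0, 0, 1]
bit_error_rate = ber(tx, rx)          # 1 error in 8 bits = 0.125

print(f"MSE = {channel_mse:.2e}, BER = {bit_error_rate}")
```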

Funding

This research was funded by Umm Al-Qura University, Saudi Arabia under grant number: 25UQU4320199GSSR01.

Acknowledgments

The authors extend their appreciation to Umm Al-Qura University, Saudi Arabia for funding this research work through grant number: 25UQU4320199GSSR01.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Theocharidis, T.; Kavallieratou, E. Underwater communication technologies: A review. Telecommun. Syst. 2025, 88, 54.
2. Stojanovic, M.; Preisig, J. Underwater acoustic communication channels: Propagation models and statistical characterization. IEEE Commun. Mag. 2009, 47, 84–89.
3. Ali, M.F.; Jayakody, D.N.K.; Chursin, Y.A.; Affes, S.; Dmitry, S. Recent advances and future directions on underwater wireless communications. Arch. Comput. Methods Eng. 2020, 27, 1379–1412.
4. Huang, L.; Wang, Y.; Zhang, Q.; Han, J.; Tan, W.; Tian, Z. Machine Learning for Underwater Acoustic Communications. IEEE Wirel. Commun. 2022, 29, 102–108.
5. Niu, H.; Li, X.; Zhang, Y.; Xu, J. Advances and applications of machine learning in underwater acoustics. Intell. Mar. Technol. Syst. 2023, 1, 8.
6. Menaka, D.; Gauni, S.; Manimegalai, C.T.; Kalimuthu, K. Challenges and vision of wireless optical and acoustic communication in underwater environment. Int. J. Commun. Syst. 2022, 35, e5227.
7. Zia, M.Y.I.; Poncela, J.; Otero, P. State-of-the-art underwater acoustic communication modems: Classifications, analyses and design challenges. Wirel. Pers. Commun. 2021, 116, 1325–1360.
8. Shwetha, M.; Krishnaveni, S. A systematic analysis, outstanding challenges, and future prospects for routing protocols and machine learning algorithms in underwater wireless acoustic sensor networks. J. Interconnect. Netw. 2025, 25, 2330001.
9. Wang, B.; Yang, H.; Fang, T. Modulation recognition of underwater acoustic communication signals based on deep learning. EURASIP J. Adv. Signal Process. 2024, 2024, 103.
10. Liu, H.; Ma, L.; Wang, Z.; Qiao, G. Channel Prediction for Underwater Acoustic Communication: A Review and Performance Evaluation of Algorithms. Remote Sens. 2024, 16, 1546.
11. Li, Z.; Chitre, M.; Stojanovic, M. Underwater acoustic communications. Nat. Rev. Electr. Eng. 2025, 2, 83–95.
12. Shaffril, H.A.M.; Samah, A.A.; Samsuddin, S.F. Guidelines for developing a systematic literature review for studies related to climate change adaptation. Environ. Sci. Pollut. Res. 2021, 28, 22265–22277.
13. Rashid, M.; Imran, M.; Jafri, A.R.; Al-Somani, T.F. Flexible architectures for cryptographic algorithms—A systematic literature review. J. Circuits Syst. Comput. 2019, 28, 1930003.
14. Rashid, M.; Anwar, M.W.; Khan, A.M. Toward the tools selection in model based system engineering for embedded systems—A systematic literature review. J. Syst. Softw. 2015, 106, 150–163.
15. Sonbul, O.S.; Rashid, M. Algorithms and techniques for the structural health monitoring of bridges: Systematic literature review. Sensors 2023, 23, 4230.
16. Sonbul, O.S.; Rashid, M. Towards the structural health monitoring of bridges using wireless sensor networks: A systematic study. Sensors 2023, 23, 8468.
17. Chen, Y.; Yu, W.; Sun, X.; Wan, L.; Tao, Y.; Xu, X. Environment-aware communication channel quality prediction for underwater acoustic transmissions: A machine learning method. Appl. Acoust. 2021, 181, 108128.
18. Liu, L.; Cai, L.; Ma, L.; Qiao, G. Channel State Information Prediction for Adaptive Underwater Acoustic Downlink OFDMA System: Deep Neural Networks Based Approach. IEEE Trans. Veh. Technol. 2021, 70, 9063–9076.
19. Lee-Leon, A.; Yuen, C.; Herremans, D. Underwater acoustic communication receiver using deep belief network. IEEE Trans. Commun. 2021, 69, 3698–3708.
20. Qiao, G.; Liu, Y.; Zhou, F.; Zhao, Y.; Mazhar, S.; Yang, G. Deep learning-based M-ary spread spectrum communication system in shallow water acoustic channel. Appl. Acoust. 2022, 192, 108742.
21. Hu, X.; Huo, Y.; Dong, X.; Wu, F.Y.; Huang, A. Channel prediction using adaptive bidirectional GRU for underwater MIMO communications. IEEE Internet Things J. 2023, 11, 3250–3263.
22. Zhang, Y.; Wang, H.; Li, C.; Chen, X.; Meriaudeau, F. On the performance of deep neural network aided channel estimation for underwater acoustic OFDM communications. Ocean Eng. 2022, 259, 111518.
23. Zhang, Y.; Li, C.; Wang, H.; Wang, J.; Yang, F.; Meriaudeau, F. Deep learning aided OFDM receiver for underwater acoustic communications. Appl. Acoust. 2022, 187, 108515.
24. Kim, Y.; Lee, H.; Seol, S.; Chung, J. 2D BiLSTM based channel impulse response estimator for improving throughput in underwater sensor network. IEEE Access 2022, 10, 57227–57233.
25. Qarabaqi, P.; Stojanovic, M. Statistical characterization and computationally efficient modeling of a class of underwater acoustic communication channels. IEEE J. Ocean Eng. 2013, 38, 701–717.
26. van Walree, P.A.; Socheleau, F.X.; Otnes, R.; Jenserud, T. The Watermark benchmark for underwater acoustic modulation schemes. IEEE J. Ocean Eng. 2017, 42, 1007–1018.
27. Kapileswar, N.; Phani Kumar, P. Optimized deep learning driven signal detection and adaptive channel estimation in underwater acoustic IoT networks. Int. J. Commun. Syst. 2024, 37, e5673.
28. Feng, X.; Zhou, M.; Wang, J.; Sun, H.; Pan, G.; Wen, M. Model-driven deep learning-based estimation for underwater acoustic channels with uncertain sparsity. IEEE Trans. Wirel. Commun. 2023, 23, 5710–5725.
29. Liu, S.; Yan, H.; Ma, L.; Liu, Y.; Han, X. UACC-GAN: A Stochastic Channel Simulator for Underwater Acoustic Communication. IEEE J. Ocean Eng. 2024, 49, 1605–1621.
30. Liu, S.; Adil, M.; Ma, L.; Mazhar, S.; Qiao, G. DenseNet-Based Robust Channel Estimation in OFDM for Improving Underwater Acoustic Communication. IEEE J. Ocean Eng. 2025, 50, 1518–1537.
31. Cui, X.; Zhang, C.; Li, J.; Jiang, B.; Li, S.; Liu, J. Deep Learning Model-Driven Channel Estimation and Equalization for Underwater Acoustic OFDM Receivers. Internet Technol. Lett. 2025, 8, e619.
32. Tian, T.; Raj, A.; Xavier, B.M.; Zhang, Y.; Wu, F.Y.; Yang, K. A Multi-Task Learning Framework for Underwater Acoustic Channel Prediction: Performance Analysis on Real-World Data. IEEE Trans. Wirel. Commun. 2024, 23, 15930–15944.
33. Huang, P.; Li, Q.; Huang, D.; Wang, J. Channel estimation and symbol detection for AFDM over doubly selective fading channels. Phys. Commun. 2025, 69, 102597.
34. Zhang, Y.; Wang, Y.; Liu, Y.; Shi, L.; Zang, Y. A Deep Learning Receiver for Underwater Acoustic OTFS Communications with Doppler Squint Effect. IEEE Wirel. Commun. Lett. 2025, 14, 1179–1183.
35. Huang, L.; Zhang, Q.; Tan, W.; Wang, Y.; Zhang, L.; He, C.; Tian, Z. Adaptive modulation and coding in underwater acoustic communications: A machine learning perspective. EURASIP J. Wirel. Commun. Netw. 2020, 2020, 203.
36. Byun, J.; Cho, Y.H.; Im, T.; Ko, H.L.; Shin, K.; Kim, J.; Jo, O. Iterative learning for reliable link adaptation in the Internet of Underwater Things. IEEE Access 2021, 9, 30408–30416.
37. Cui, X.; Zhang, Z.; Li, J.; Jiang, B.; Li, S.; Liu, J. Reinforcement learning-based adaptive modulation scheme over underwater acoustic OFDM communication channels. Phys. Commun. 2023, 61, 102207.
38. Anitha, D.; Karthika, R. Hybrid deep learning-based adaptive multiple access schemes underwater wireless networks. Intell. Autom. Soft Comput. 2023, 35, 2463–2477.
39. Cui, X.; Yan, P.; Li, J.; Li, S.; Liu, J. Deep reinforcement learning-based adaptive modulation for OFDM underwater acoustic communication system. EURASIP J. Adv. Signal Process. 2023, 2023, 1.
40. Jing, L.; Dong, C.; He, C.; Shi, W.; Wang, H.; Zhou, Y. Adaptive Modulation and Coding for Underwater Acoustic OTFS Communications Based on Meta-Learning. IEEE Commun. Lett. 2024, 28, 1845–1849.
41. Sweta, T.; Ruthrapriya, S.; Sneka, J.; Alex, J.S.R.; Rohith, G.; Das, M. Reinforcement learning-based automated modulation switching algorithm for an enhanced underwater acoustic communication. Results Eng. 2024, 23, 102791.
42. Qiu, Y.; Yang, X.; Tong, F.; Chen, D. Evaluation of Reinforcement Learning-Based Adaptive Modulation in Shallow Sea Acoustic Communication. J. Mar. Sci. Appl. 2025, 1–8.
43. Alamgir, M.; Sultana, M.N.; Chang, K. Link adaptation on an underwater communications network using machine learning algorithms: Boosted regression tree approach. IEEE Access 2020, 8, 73957–73971.
44. Jiang, Z.; Zhang, J.; Wang, T.; Wang, H. Modulation recognition of underwater acoustic communication signals based on neural architecture search. Appl. Acoust. 2024, 225, 110155.
45. Yang, X.; Wang, Z.; Shen, T.; Zhao, D. Modulation Classification of Underwater Communication Signals Based on Channel Estimation. J. Mar. Sci. Eng. 2024, 12, 1877.
46. Wang, Y.; Shen, T.; Wang, T.; Qiao, G.; Zhou, F. Modulation recognition for underwater acoustic communication based on hybrid neural network and feature fusion. Appl. Acoust. 2024, 225, 110185.
47. Wang, J.; Huang, Z.; Shi, W.; Mao, S. One2ThreeNet: An automatic microscale-based modulation recognition method for underwater acoustic communication systems. IEEE Trans. Wirel. Commun. 2024, 23, 10287–10300.
48. Li, J.; Jia, Q.; Cui, X.; Gulliver, T.A.; Jiang, B.; Li, S.; Yang, J. Automatic modulation recognition of underwater acoustic signals using a two-stream transformer. IEEE Internet Things J. 2024, 11, 18839–18851.
49. Wang, Y.; Jin, Y.; Zhang, H.; Lu, Q.; Cao, C.; Sang, Z.; Sun, M. Underwater communication signal recognition using sequence convolutional network. IEEE Access 2021, 9, 46886–46899.
50. Huang, Z.; Li, S.; Yang, X.; Wang, J. OAE-EEKNN: An accurate and efficient automatic modulation recognition method for underwater acoustic signals. IEEE Signal Process. Lett. 2022, 29, 518–522.
51. Zhang, W.; Yang, X.; Leng, C.; Wang, J.; Mao, S. Modulation recognition of underwater acoustic signals using deep hybrid neural networks. IEEE Trans. Wirel. Commun. 2022, 21, 5977–5988.
52. Gao, D.; Hua, W.; Su, W.; Xu, Z.; Chen, K. Supervised contrastive learning-based modulation classification of underwater acoustic communication. Wirel. Commun. Mob. Comput. 2022, 2022, 3995331.
53. Zhang, R.; He, C.; Jing, L.; Zhou, C.; Long, C.; Li, J. A modulation recognition system for underwater acoustic communication signals based on higher-order cumulants and deep learning. J. Mar. Sci. Eng. 2023, 11, 1632.
54. Wang, X.; Tu, Y.; Liu, J.; Han, G.; Yu, C.; Cui, J.H. Edge-Enabled Modulation Classification in Internet of Underwater Things Based on Network Pruning and Ensemble Learning. IEEE Internet Things J. 2023, 11, 13608–13621.
55. Yao, X.; Yang, H.; Sheng, M. Automatic modulation classification for underwater acoustic communication signals based on deep complex networks. Entropy 2023, 25, 318.
56. Guerrero-Chilabert, G.S.; Moreno-Salinas, D.; Sánchez-Moreno, J. Design and Development of an SVM-Powered Underwater Acoustic Modem. J. Mar. Sci. Eng. 2024, 12, 773.
57. Zhu, Z.; Tong, F.; Zhou, Y.; Zhang, Z.; Zhang, F. Deep learning prediction of time-varying underwater acoustic channel based on LSTM with attention mechanism. J. Mar. Sci. Appl. 2023, 22, 650–658.
58. Li, X.; Han, Z.; Yu, H.; Yan, L.; Han, S. Deep Learning for OFDM Channel Estimation in Impulsive Noise Environments. Wirel. Pers. Commun. 2022, 125, 2947–2964.
59. Zhang, Y.; Chang, J.; Liu, Y.; Xing, L.; Shen, X. Deep learning and expert knowledge based underwater acoustic OFDM receiver. Phys. Commun. 2023, 58, 102041.
60. Guo, J.; Guo, T.; Li, M.; Wu, T.; Lin, H. Underwater-Acoustic-OFDM Channel Estimation Based on Deep Learning and Data Augmentation. Electronics 2024, 13, 689.
61. Wand, M.; Kristoffersen, M.B.; Franzke, A.W.; Schmidhuber, J. Analysis of neural network based proportional myoelectric hand prosthesis control. IEEE Trans. Biomed. Eng. 2022, 69, 2283–2293.
62. Willmott, C.J.; Matsuura, K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim. Res. 2005, 30, 79–82.
63. Arif, M.; Rashid, M. A Literature Review on Model Conversion, Inference, and Learning Strategies in EdgeML with TinyML Deployment. Comput. Mater. Contin. 2025, 83, 13–64.

Figure 1. Overview of the SLR: from the article selection to comparative analysis.

Figure 2. A typical UWA OFDM (a) transmitter and (b) receiver.

Figure 3. A typical UWA transceiver using ML/DL for adaptive modulation.

Figure 4. A typical UWA transceiver using ML/DL for modulation recognition.

Figure 5. Stepwise selection and screening procedure for research article inclusion.

Figure 6. Year-wise distribution of selected research articles from WoS-indexed journals (2020–2025), illustrating publication trends and the evolving focus on ML/DL applications in UWA communication.

Table 1. Summary of existing reviews on ML/DL techniques in UWA communication.

Ref. (Year)	Key Contributions	Limitations
[4] (2022)	Targets adaptive modulation at the physical layer, and provides a taxonomy of ML algorithms while discussing their potential to address UWA challenges.	Does not evaluate ML and DL techniques for channel estimation or modulation recognition, and does not provide performance metrics for adaptive modulation.
[4] (2023)	Highlights the potential of ML and DL in addressing dynamic underwater environments, with a focus on adaptive modulation, channel prediction, and demodulation.	Lacks a detailed comparative analysis of ML and DL algorithms, including their system characteristics and performance metrics for key UWA challenges.
[5] (2023)	Focuses on source localization, target recognition, and geoacoustic inversion, while providing an evaluation of key techniques, datasets, and ML/DL models.	Lacks emphasis on ML and DL techniques for addressing UWA challenges such as channel estimation, adaptive modulation, and modulation recognition.
[10] (2024)	Analyzes UWA channel prediction techniques, categorizing them into linear, kernel-based, and deep learning approaches, with evaluations of performance and complexity.	Lacks investigations and the impact of adaptive modulation and modulation recognition on enhancing the efficiency and reliability of UWA communication.
[11] (2025)	Focuses on channel modeling, signal processing techniques, and network protocols, while suggesting future directions like standardized models and data-driven solutions.	Lacks exploration of ML and DL applications for specific UWA challenges, such as channel estimation, adaptive modulation, and modulation recognition.
[1] (2025)	Investigates DL techniques for modulation recognition in UWA communication, proposing a hybrid model with multi-scale feature fusion.	Does not explore the other key components such as channel estimation and adaptive modulation.
[8] (2025)	Analyzes routing protocols and ML algorithms, highlighting benefits, challenges, future prospects, and providing a detailed taxonomy and performance evaluation.	Does not explore channel estimation, adaptive modulation, and modulation recognition in terms of their key system characteristics and performance metrics.

Table 2. Search results for ML and DL techniques in UWA communication (2020–2025).

Search Term	IEEE	Elsevier	Springer
Underwater Acoustic	5189	8422	6187
Underwater Acoustic Machine Learning	346	1885	1264
Underwater Acoustic Deep Learning	601	2001	1350
Underwater Acoustic Channel Estimation Machine Learning	24	586	296
Underwater Acoustic Channel Estimation Deep Learning	60	649	342
Underwater Acoustic Channel Prediction Machine Learning	23	672	326
Underwater Acoustic Channel Prediction Deep Learning	19	744	363
Underwater Acoustic Receiver Machine Learning	36	526	359
Underwater Acoustic Receiver Deep Learning	55	561	373
Underwater Acoustic Modulation Classification	52	419	299
Underwater Acoustic Modulation Classification	52	419	299
Underwater Acoustic Modem Machine Learning	6	107	43
Underwater Acoustic Modulation Machine Learning	42	382	243
Underwater Acoustic Modulation Deep Learning	76	407	276
Underwater Acoustic Modulation Recognition Machine Learning	16	199	128
Underwater Acoustic Modulation Recognition Deep Learning	28	222	147
Total Articles from Databases	6983	18,882	12,396
Sum of All Articles	38,261
Additional Google Scholar Articles	3000
Final Grand Total (All Sources)	41,261

Table 3. Systematic process for extracting, analyzing, and classifying research studies in ML/DL-based UWA communication: outlining key data collection methods, evaluation criteria, and classification approaches.

No.	Item	Corresponding Details
1	Citation Data	It includes the title, author(s), publication year, publisher, and research type (journal or conference).
2	Overview	A concise summary outlining the fundamental proposal and primary objective of the research
3	Results	Findings obtained from the analyzed research, highlighting key insights and conclusions
4	Data Collection	Specifies whether the study employs quantitative or qualitative data gathering methods
5	Assumptions	Identifies any underlying assumptions made to support and validate the research findings
6	Validation	Describes the methodology used to verify the accuracy and reliability of the proposed study
7	Channel Estimation Techniques	Overview: Table 4 and Table 5; Characteristics: Table 6 and Table 7; Comparison: Table 8 and Table 9
8	Adaptive Modulation Techniques	Overview: Table 10 and Table 11; Characteristics: Table 12 and Table 13; Comparison: Table 14 and Table 15
9	Modulation Recognition	Overview: Table 16; Characteristics: Table 17; Comparison: Table 18

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
