Article

Improving Audio Steganography Transmission over Various Wireless Channels

by Azhar A. Hamdi 1, Asmaa A. Eyssa 1,2, Mahmoud I. Abdalla 1, Mohammed ElAffendi 3, Ali Abdullah S. AlQahtani 4, Abdelhamied A. Ateya 1,3,* and Rania A. Elsayed 1

1 Department of Electronics and Communications Engineering, Zagazig University, Zagazig 44519, Egypt
2 Department of Electronics and Communications, Zagazig Higher Institute of Engineering and Technology, Zagazig 44519, Egypt
3 EIAS Data Science Lab, College of Computer and Information Sciences, Prince Sultan University, Riyadh 11586, Saudi Arabia
4 Software Engineering Department, Prince Sultan University, Riyadh 11586, Saudi Arabia
* Author to whom correspondence should be addressed.
J. Sens. Actuator Netw. 2025, 14(6), 106; https://doi.org/10.3390/jsan14060106
Submission received: 17 July 2025 / Revised: 25 August 2025 / Accepted: 29 August 2025 / Published: 30 October 2025

Abstract

Ensuring the security and privacy of confidential data during transmission is a critical challenge, necessitating advanced techniques to protect against unwarranted disclosures. Steganography, a concealment technique, enables secret information to be embedded in seemingly harmless carriers such as images, audio, and video. This work proposes two secure audio steganography models based on the least significant bit (LSB) and discrete wavelet transform (DWT) techniques for concealing different types of multimedia data (i.e., text, image, and audio) in audio files, representing an enhancement of current research that tends to focus on embedding a single type of multimedia data. The first model (secured model (1)) focuses on high embedding capacity, while the second model (secured model (2)) focuses on improved security. The performance of the two proposed secure models was tested under various conditions. The models’ robustness was greatly enhanced using convolutional encoding with binary phase shift keying (BPSK). Experimental results indicated that the correlation coefficient (Cr) of the extracted secret audio in secured model (1) increased by 18.88% and by 16.18% in secured model (2) compared to existing methods. In addition, the Cr of the extracted secret image in secured model (1) was improved by 0.1% compared to existing methods. The peak signal-to-noise ratio (PSNR) of the steganography audio of secured model (1) was improved by 49.95% and 14.44% compared to secured model (2) and previous work, respectively. Furthermore, both models were evaluated in an orthogonal frequency division multiplexing (OFDM) system over various wireless channels, i.e., Additive White Gaussian Noise (AWGN), fading, and SUI-6 channels. In order to enhance the system performance, OFDM was combined with differential phase shift keying (DPSK) modulation and convolutional coding. The results demonstrate that secured model (1) is highly immune to noise generated by wireless channels and is the optimum technique for secure audio steganography on noisy communication channels.

1. Introduction

Due to the need to protect data from intruders, various methods have emerged, such as encryption and watermarking technologies. As a result of attacks on these technologies, the need arose to develop another type of data protection known as steganography. Steganography is considered a secure form of communication, and multiple methods have emerged for hiding confidential data, such as image and audio steganography [1]. Steganography is a robust protection technique that hides confidential data in a cover of any format, be it an image, video, or audio file. Meanwhile, only the sender and receiver know whether confidential data exists [2].
The design of a steganography technique uses the properties of the cover to embed data inside it. Because of the sensitivity, capacity, and widespread availability of audio files, they are considered among the best cover media for embedding secret data [3]. There are three main categories of audio steganography. The first category converts secret messages into sequences of binary bits and embeds them in the least significant bits (LSBs) of the audio cover. This technique can embed a high capacity of secret data, but channel noise can corrupt the hidden data or expose its presence [1,2,3,4]. The second category comprises transform-domain techniques, which embed data in frequency components that are perceptually close to inaudible. The third category is the wavelet-domain technique, i.e., the discrete wavelet transform (DWT), which hides the message in the LSBs of the integer wavelet coefficients [1,5]. Phase coding and spread spectrum are considered classic audio steganography techniques; when there is no noise in the cover audio, they can achieve robust steganography [6].
Orthogonal frequency-division multiplexing (OFDM) is a multicarrier digital transmission scheme that encodes binary data on multiple orthogonal carrier frequencies. The idea of OFDM is to transmit parallel data streams. OFDM has been studied for high-speed modems, digital mobile communications, and high-density recording [7]. It is utilized by several wireless multimedia transmission systems, such as digital audio broadcasting (DAB), digital video broadcasting (DVB), digital image broadcasting (DIB), and wireless local area networks (WLANs), because of its resistance to multipath fading and impulsive noise. OFDM overlaps the spectra of its subcarriers, which makes it more spectrally efficient than a conventional single-carrier system. As a result of this multicarrier structure, the small bit rate per subcarrier reduces the influence of inter-symbol interference (ISI) [8].
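To make the multicarrier idea concrete, the following minimal sketch (Python with NumPy; all function names here are illustrative, not from the paper) maps a block of symbols onto orthogonal subcarriers with an IFFT and prepends a cyclic prefix, while the receiver reverses the steps:

```python
import numpy as np

def ofdm_modulate(symbols, n_sub, cp_len):
    """Map symbols onto n_sub orthogonal subcarriers (IFFT) and add a cyclic prefix."""
    blocks = symbols.reshape(-1, n_sub)             # one row per OFDM symbol
    time = np.fft.ifft(blocks, axis=1)              # parallel subcarriers -> time domain
    with_cp = np.hstack([time[:, -cp_len:], time])  # copy the tail to the front (CP)
    return with_cp.ravel()

def ofdm_demodulate(signal, n_sub, cp_len):
    """Strip the cyclic prefix and recover the subcarrier symbols with an FFT."""
    blocks = signal.reshape(-1, n_sub + cp_len)[:, cp_len:]
    return np.fft.fft(blocks, axis=1).ravel()

# BPSK symbols round-trip through the noiseless OFDM chain unchanged
bits = np.random.default_rng(0).integers(0, 2, 64)
syms = (1 - 2 * bits).astype(complex)               # 0 -> +1, 1 -> -1
rx = ofdm_demodulate(ofdm_modulate(syms, 16, 4), 16, 4)
assert np.allclose(rx, syms)
```

Over a multipath channel, the cyclic prefix turns the channel's linear convolution into a circular one, so each subcarrier sees only a flat gain, which is the mechanism behind the ISI reduction described above.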
Multipath propagation occurs when a signal travels from the transmitter to the receiver over multiple reflective paths in a wireless mobile communication system. Due to multipath propagation, the received signal’s amplitude, phase, and angle of arrival change [9]. The propagation medium also experiences physical changes; for example, the electron density of the ionospheric layers varies, so high-frequency (HF) radio signals are reflected. In this work, the effect of various wireless channels on the two secured models is evaluated under different conditions [10]. For example, the Additive White Gaussian Noise (AWGN) channel is statistically modeled as random noise with constant spectral density over a wide frequency range, e.g., from very low frequencies up to 10^12 Hz.
The IEEE 802.16 working group adopted the Stanford University Interim (SUI) channel models, including SUI-6, for simulating and evaluating fixed wireless broadband systems. The SUI channel models comprise six empirical, time-dispersive models that simulate wireless propagation environments by representing the channel as a finite set of scatterers. Each scatterer is modeled using a finite impulse response (FIR) filter characterized by specific delay and scaling parameters. These six models are based on three representative terrain types, i.e., A, B, and C, each reflecting distinct environmental and topographical features. Terrain type A represents hilly regions with moderate to dense foliage, typically resulting in high path loss. Terrain type B corresponds to either hilly areas with sparse vegetation or flat terrains with moderate tree density. Terrain type C is associated with flat terrain and low vegetation density, leading to relatively lower path loss. Among the SUI models, the SUI-6 channel corresponds to Terrain type A and is characterized by both high delay spread and high Doppler spread, making it suitable for evaluating system performance under challenging propagation conditions [11,12].

1.1. Problem Statement

Most previous research focused on developing audio steganography techniques without considering the effect of wireless communication channels on them. As such, signals reconstructed from transmitted audio steganography are susceptible to noise [13,14]. Many problems affect the transmission of audio steganography over wireless channels, including the following:
  • In previous works, audio steganography techniques concealed only a single type of multimedia data (text, an image, or audio) in a cover audio file, so the capacity of the embedded data was low.
  • In previous studies, researchers designed different audio steganography techniques without testing them in communication systems and without studying the effect of wireless channel noise.
  • The quality of reconstructed data from transmitting audio steganography suffered from the noise resulting from wireless channels.
  • Although OFDM is considered a wideband modulation method for coping with multipath channel problems, the reconstructed multimedia data are still affected by channel noise.
  • In previous studies, OFDM systems did not address the effect of different wireless channels (AWGN, Fading, and SUI) on audio steganography techniques.

1.2. Main Contributions

The main contributions of this paper are as follows.
  • Constructing two models to hide various secret data in an audio cover file. The first model is based on the LSB technique, and the second is based on DWT and the LSB techniques to increase the capacity of the embedded data.
  • Analyzing the behavior of the two proposed models using BPSK modulation over an AWGN wireless channel.
  • Applying an error control scheme, namely convolutional (1, 2, 7) encoding, to the audio steganography transmitted from the two models to cope with the noise introduced by wireless channels.
  • Merging OFDM and differential phase shift keying (DPSK) modulation over an AWGN wireless channel in order to enhance the OFDM system.
  • Studying the behavior of the two proposed secure models through enhanced OFDM over various wireless channels, such as AWGN, multipath fading, and SUI-6.
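As a hedged illustration of the error-control step listed above, the sketch below implements a rate-1/2, constraint-length-7 convolutional encoder followed by BPSK mapping. The generator polynomials (171, 133 in octal) are the conventional choice for this code and are an assumption here, since the paper only names the (1, 2, 7) scheme:

```python
import numpy as np

# Assumed generator polynomials (octal 171, 133) for the rate-1/2, K = 7 code.
GENERATORS = (0o171, 0o133)
K = 7  # constraint length

def conv_encode(bits):
    """Rate-1/2 convolutional encoder: two parity bits per input bit."""
    state, out = 0, []
    for b in bits:
        state = ((state << 1) | b) & ((1 << K) - 1)   # shift the new bit in
        for g in GENERATORS:
            out.append(bin(state & g).count("1") % 2)  # parity of tapped bits
    return out

def bpsk_map(bits):
    """BPSK: bit 0 -> +1, bit 1 -> -1."""
    return 1 - 2 * np.asarray(bits)
```

At the receiver, a Viterbi decoder would invert `conv_encode`, and BPSK demapping reduces to a sign decision on the received samples.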
The remainder of this paper is structured as follows: Contributions of other researchers to the steganography field and a description of the audio steganography technique are presented in Section 2. Section 3 describes the proposed system for audio steganography. A performance analysis of the two models over the AWGN channel using encoding methods is presented in Section 4. Section 5 presents an evaluation of the performance of the two proposed models over the OFDM system on various wireless channels. It also provides a comparative performance analysis concerning other designs presented in the literature. Section 6 outlines the most important conclusions and future work avenues.

2. Background and Related Work

2.1. Description of Audio Steganography Technique

Audio steganography is a technique used to conceal information within an audio signal so that the hidden data are imperceptible to the human ear. Unlike cryptography, which protects the content of a message, steganography hides the very existence of the message [15]. Hiding information involves inserting metadata into a digital audio file or modifying the LSBs of specific samples. More sophisticated methods include, but are not limited to, phase coding, echo hiding, and spread spectrum techniques. Whatever the method used, the audio quality must remain essentially unchanged. Because of this, audio steganography is highly regarded in secure communication, digital rights management, and watermarking [16]. Figure 1 describes the audio steganography technique.
The steganography (stego) audio is constructed by embedding secret multimedia data in a cover audio file. The embedding process utilizes the properties of the cover file. There are two categories of embedding: the spatial domain, such as the LSB technique, and the transform domain, such as the DWT, DCT, and DFT techniques. Embedding in the transform domain modifies the transform coefficients according to the embedded data. The spatial-domain techniques are considered to offer a higher embedding capacity than the transform-domain ones [1].

2.2. Related Works

Audio steganography techniques have gained the attention of most researchers in the steganography field due to their superiority in protecting confidential data; therefore, various research has emerged. This section summarizes recent studies that have proposed audio steganography techniques.
Voice over Internet Protocol (VoIP) is a widely used technology for real-time voice communication, employed in applications such as Skype, WhatsApp, and Google Talk. However, VoIP communications are vulnerable to hacking and eavesdropping, as they pass through public internet channels. Private channels afford high security and encryption levels, but at the expense of device constraints and geographic limitations. To address these challenges, B. Q. A. Ali et al. [17] proposed a real-time hidden communication system that conceals a confidential speech channel within an open VoIP channel using an audio steganography method. The confidential speech is compressed within the system using an Internet low-bit-rate codec to ensure high embedding capacity without degrading the speech quality, even over lossy networks. Concurrently, the cover speech is compressed by a G.711 encoder in synchronized 20-ms frames to comply with VoIP requirements. The embedding process exploits robust data components selectively to ensure the quality of the cover signal, which possesses a signal-to-noise ratio (SNR) of more than 40 dB. Experimental outcomes indicated that secret data could be extracted perfectly (100% accuracy) under lossless channels. This method is a promising solution for achieving secure, stealthy voice communication in sensitive operations such as military and government applications.
M. Helmy [6] suggested a hybrid technique grounded in audio steganography and multi-layer cryptography to formulate an end-to-end crypto-steganography system. In that case, different encryption schemes encrypt multiple audio signals, while users employ specific keys to decrypt and encrypt. On the other hand, concealed messages are hidden inside cover audio files by steganography without altering the perceptual structure of the audio, allowing secret transmission. The proposed system further enhances security by utilizing multiple layers of encryption and transmitting the encrypted audio data via OFDM systems, i.e., FFT-OFDM, DCT-OFDM, and DWT-OFDM, in a WiMAX scenario. Experimental observations confirmed that the proposed hybrid crypto-steganography method effectively resists attack, guarantees secure transmission, and reconstructs high-quality audio at the receiver side.
Different steganography techniques have been developed to support the secure transmission of secret information across insecure and open communication channels. Audio steganography, however, uses audio signals as carriers to conceal secret information imperceptibly. In [18], the authors proposed a new audio steganography technique to enhance security, capacity for hiding, and transparency of the communication of speech messages using contourlet transform in conjunction with a duffing oscillator. In this approach, the contourlet transform spreads data coefficients that are broader than the original samples to create desirable embedding areas for secret speech samples. For further security improvement, the duffing oscillator mixes cover audio samples before embedding, thereby increasing robustness against illegal extraction and piracy. The duffing oscillator is used to adequately improve the stability of the embedding procedure while maintaining audio transparency. Experimental outcomes confirmed that the technique has a hiding ability of as much as 80% of the size of the cover audio without compromising the audio quality and ensuring transparent, secure communication. The outcome indicated that the technique has significant potential as a candidate for secure audio-based data hiding schemes.
M. M. Mahmoud and H. T. Elshoush [19] suggested an advanced audio steganography technique known as LSB-BMSE to enhance traditional LSB embedding techniques. It applies a new binary message size encoding (BMSE) mechanism, which first encodes and hides the secret message size in random audio samples before the message content is embedded. The ciphertext message is initially compressed by Huffman coding and subsequently encrypted with AES-128 to ensure confidentiality. The audio cover is divided into blocks based on the message length, and a secure key generated using BMSE is used to dynamically embed the encrypted message into random blocks and bytes. The proposed method was extensively tested in terms of the perceptual evaluation of speech quality and against the NIST statistical test suite for imperceptibility and randomness checks, respectively. Additional fidelity measures, including mean square error (MSE), peak SNR (PSNR), and SNR, supported the superiority of the stego audio. Experimental results confirmed that LSB-BMSE operated much better than existing methods regarding hiding capacity and sound transparency. The technique was also found to be highly resistant to brute force and statistical attacks and robust against resampling attacks. It was even only partially vulnerable to direct LSB and noise attacks.
P. K. Kasetty and A. Kanhe [20] proposed a hidden speech communication technique based on a semi-blind audio steganography scheme utilizing a combination of DWT and singular value decomposition (SVD). In this technique, the singular values of the hidden speech were embedded within a cover audio file for secure transmission. The cover audio was decomposed at a fifth-level DWT, and embedding was performed in the singular value matrix of the approximate coefficients to enhance both robustness against various attacks and imperceptibility. The method was tested on various speech signals by performing thorough imperceptibility and robustness tests. Experimental outcomes revealed that stego speech could be successfully retrieved under high SNR, high correlation, and ideal perceptual quality, even after the stego audio had undergone different types of attacks. Under an attack-free condition, the normalized correlation coefficient was 1.0, the SNR of the reconstructed audio was 70.58, and the PESQ score was 4.49.
S. T. Abdulrazzaq et al. [21] proposed a secure steganography method designed to conceal image files within audio signals to enhance the potential of audio carriers. The procedure utilized plain WAV audio files as the cover media using LSB substitution. At the same time, the images were first compressed and encrypted using the GMPR algorithm based on DCT and high-frequency minimization encoding. After encryption and compression, the image bits were embedded into the audio data using controlled bit replacement. In contrast to conventional LSB-based methods that typically replace a maximum of 6 LSBs to avoid audible distortion, the suggested approach employed multiple and variable LSB layers to embed encrypted image information in a more effective way without compromising audio quality. Tests found no detectable difference between the stego and original audio signals, confirming the method’s imperceptibility. Further performance evaluation based on SNR and PSNR measures attested to the method’s ability to maintain high audio quality while providing secure and high-capacity data embedding.
Recent studies have utilized memristor dynamic resistance states to generate pseudorandom sequences for encryption applications. Suo Gao et al. [22] proposed a three-dimensional hyperchaotic map based on a memristor (3D-HMBM) using sine-function nonlinearity and discrete modeling of a memristor. The system was rich in dynamical activity, transitioned from periodicity to chaos and hyperchaos, and exhibited infinite coexisting attractors. Its complexity was established using Lyapunov exponents, spectral entropy, and C0 complexity measures, and hardware simulations determined pragmatic feasibility. With these attributes, a multi-image encryption algorithm using the 3D-HMBM was proposed, which performed superbly in statistical and cryptographic tests, indicating its potential for secure data encryption on a large scale.
Discrete chaotic systems using memristors are preferred for cryptography since they possess rich dynamics and good hardware efficiency. Beyond standalone memristor devices, the authors of [23] proposed the 3D memristive cubic map with double discrete memristors (3D-MCM) with greater complexity and coexisting attractors, as established by bifurcation analysis, Lyapunov spectra, and hardware implementation. To demonstrate its cryptography potential, the system was coupled with a quaternary-based permutation and dynamic emanating diffusion (QPDED-IE) algorithm for image encryption. Experimental results confirmed strong confusion–diffusion features and immunity to cryptanalytic attacks, demonstrating the application potential of dual-memristor chaotic models for secure encryption.
Audio steganography allows for hiding confidential information within audio signals; the majority of approaches in this field fall under the category of LSB-based techniques, transform domain techniques, phase coding, echo hiding, and spread spectrum. All of these methods have limitations such as vulnerability to attacks, low embedding capacity, high computational complexity, potential degradation of audio quality, and compatibility issues. To solve these problems, S. S. Saranya et al. [24] proposed an audio steganography system using MATLAB with a combination of RC7 encryption and chaotic decryption. As per this approach, the confidential message was first encrypted using the RC7 algorithm and then embedded within the binary content of an audio file. The upgraded binary content was converted back into an audio signal, which could further be processed to extract and decrypt the concealed message using chaotic decryption techniques. Results indicated that the method is feasible and well-balanced in terms of security and audio quality, demonstrating its potential for secure audio-based data hiding purposes.
A. Singha and M. A. Ullah [25] proposed a secure digital watermarking technique to enhance the security of audio signals based on DWT and SVD. The proposed method used multi-level DWT and multiple watermark images of various sizes, which were strategically embedded in the host audio signal. This distributed embedding ensured that the watermark energy was evenly distributed, improving attack resistance and preserving audio quality. The method enhanced imperceptibility and robustness by spreading the watermark data across the host signal. Experimental results demonstrated that the scheme achieved a very high PSNR of 196.81 dB and a normalized cross-correlation (NCC) value above 0.9, demonstrating its effectiveness in maintaining audio fidelity and satisfactory watermark recovery.
Table 1 compares existing audio steganography techniques with those proposed in this paper to better introduce the novelty of the proposed method.

3. The Proposed Models of Audio Steganography

The two proposed models aim to conceal secret multimedia data (text, image, and audio) in an audio cover to yield high capacity and high security. These two models enable the embedding of audio files with different bit rates and sampling rates. The two models are based on two different domain categories. The first is the spatial domain (LSB), based on embedding secret data in the 8 LSBs of the cover audio file; this is the basis of Model 1. The second is the transform domain (DWT), which operates on the transform coefficients; embedding the secret data in the LSBs of the DWT coefficients is the basis of Model 2.

3.1. The First Proposed Algorithm

The idea of the first model of audio steganography is to utilize the LSB technique. A secret image of size 256 × 256 with text and secret audio were hidden in 8-LSB of a cover audio file. Figure 2 presents the structure of the proposed model (Model 1). The following steps were performed to embed secret multimedia data in the audio cover file.
  • To prepare secret images with secret text, DWT transform must be applied to the images. Then, four bands will appear (LL, LH, HH, HL). After that, the text is converted into binary data that acts as bits. In the text embedding process in the image, the LSB technique is applied to the LL band.
  • To embed secret images with secret text in an audio cover file, the audio cover must first be converted into 16 bits. The 8 LSBs of each symbol can be used for embedding data; the characteristics of the audio do not change much, and the sound remains clear to the listener. The 8 LSBs are utilized for embedding secret images with text by the LSB technique from 1:256 × 256 of the cover audio length.
  • Before embedding the secret audio, the condition for using the audio compression stage must be checked. If the length of the secret audio is greater than the length of the cover audio, several DWT decomposition levels are applied to the secret audio in the audio compression stage. The number of DWT decomposition levels in the audio compression stage is determined by the length of the cover audio file; the length of the cover audio must be equal to or greater than (the length of the secret audio + 256 × 256). If the length of the secret audio is less than the length of the cover audio, the audio compression stage is not applied to the secret audio.
  • The 8-LSB of the audio cover, from (256 × 256 + 1) to the cover audio length, is used for embedding 8-MSB (most significant bit) of secret audio. Finally, the audio steganography file is produced at the transmitter.
  • At the receiver, the inverse operation of the embedding process is applied. To extract the secret audio, the audio steganography must be converted into a binary format, and the extracted 8-LSB acts as 8 MSB of the secret audio to reconstruct the secret audio from its location in the audio steganography.
  • To extract secret images with text, 8-LSB of audio steganography is extracted from its location to reconstruct secret images with text.
  • The text-extracting process from the secret image is performed by applying DWT to the secret image and extracting the LSB from the LL band.
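The core bit manipulation of the steps above can be sketched as follows. This is a minimal illustration, not the full Model 1 pipeline: it shows only the replacement of the 8 LSBs of each 16-bit cover sample with the 8 MSBs of the secret data, assuming one secret byte per cover sample:

```python
import numpy as np

def embed_lsb8(cover16, secret_msb8):
    """Replace the 8 LSBs of each 16-bit cover sample with one secret byte."""
    return ((cover16 & 0xFF00) | (secret_msb8 & 0x00FF)).astype(np.uint16)

def extract_lsb8(stego16):
    """Read back the 8 LSBs of each stego sample as the hidden bytes."""
    return (stego16 & 0x00FF).astype(np.uint8)

cover = np.array([0x1234, 0xABCD, 0x0F0F], dtype=np.uint16)
secret = np.array([0xDE, 0xAD, 0xBE], dtype=np.uint8)
stego = embed_lsb8(cover, secret)
assert np.array_equal(extract_lsb8(stego), secret)  # perfect extraction
assert np.array_equal(stego >> 8, cover >> 8)       # 8 MSBs of the cover untouched
```

Because the 8 MSBs of the cover are preserved, the perceptual structure of the audio changes little, while the extraction at the receiver is an exact masking operation on a noiseless channel.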

3.2. The Second Proposed Algorithm

The second model, utilizing the DWT technique, is proposed to increase security. A secret image of size 256 × 256 with text and the secret audio are converted into binary bits, and the DWT is applied to the cover audio file, producing the high- and low-frequency bands. The embedding process is applied to the 8 LSBs of the low-frequency DWT coefficients of the audio cover. An 8-LSB embedding depth is used in both models. Each audio symbol is converted into 16 bits, and the 8 LSBs of each symbol can be used for embedding data; the audio characteristics do not change much, and the sound remains clear to the listener. Figure 3 presents the main steps of Model 2. Algorithms 1 and 2 provide the pseudocode for the proposed procedures at the transmitter and the receiver.
To embed secret multimedia data in an audio cover file, DWT transform is applied to the secret image to prepare secret images with secret text, and four bands appear (LL, LH, HH, HL). After that, the text is converted into binary data that acts as bits. In the text embedding process in the image, the LSB technique is applied to the LL band. The second proposed model uses the following steps to embed secret multimedia data in an audio cover file.
  • After applying DWT on the audio cover file, the two low and high-frequency bands (LL, HH) appear. The coefficients of the LL band are used for the embedding process.
  • To embed a secret image with text, 8-LSB of LL band coefficients from 1:256 × 256 of transformed cover LL length is applied.
  • Before embedding the secret audio, the condition for using the audio compression stage must be checked. If the length of the secret audio is greater than the length of the DWT decomposition band of the cover audio file, the number of DWT decomposition levels is determined by the length of that band in the second audio steganography model. Thus, the length of the DWT decomposition band of the cover audio file must be equal to or greater than (the length of the secret audio + 256 × 256). If the length of the secret audio is less than the length of the cover audio, the audio compression stage is not applied to the secret audio.
  • 8-LSB of LL band coefficients, from (256 × 256 + 1) to the transformed cover LL length, is utilized for embedding 8-MSB of secret audio.
  • Finally, the audio steganography is constructed after applying inverse DWT (IDWT).
  • At the receiver side, the extraction process is applied to the audio steganography. The DWT transform is applied to the audio steganography, and the LL and HH bands appear.
  • To extract secret images with secret text, 8-LSB of the LL band coefficients is utilized to reconstruct the secret image from its specific location in the LL band.
  • To reconstruct the secret audio, 8-LSB of the LL band, from its specific location of LL band length, acts as the 8-MSB of the secret audio.
  • The text-extracting process from the secret image is performed by applying DWT on the secret image and extracting the LSB from the LL band.
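The wavelet-domain embedding in the steps above can be sketched with an integer (lifting) Haar transform, which is losslessly invertible, so embedded LSBs survive the inverse/forward round trip exactly. This is an illustrative stand-in for the DWT used in the paper, not its exact filter bank:

```python
import numpy as np

def haar_fwd(x):
    """One-level integer Haar (lifting): approximation l, detail h."""
    a, b = x[0::2].astype(np.int64), x[1::2].astype(np.int64)
    h = a - b            # detail band
    l = b + (h >> 1)     # == floor((a + b) / 2), approximation band
    return l, h

def haar_inv(l, h):
    """Exact integer-to-integer inverse of haar_fwd."""
    a = l + ((h + 1) >> 1)
    b = a - h
    x = np.empty(2 * len(l), dtype=np.int64)
    x[0::2], x[1::2] = a, b
    return x

samples = np.array([100, 102, 98, 97, 105, 110, 90, 95], dtype=np.int64)
l, h = haar_fwd(samples)
bits = np.array([1, 0, 1, 1])
stego = haar_inv((l & ~1) | bits, h)  # embed bits in the LSBs of the approximation band
l2, _ = haar_fwd(stego)               # receiver repeats the forward transform
assert np.array_equal(l2 & 1, bits)   # bits recovered exactly
```

Because the lifting transform is a bijection on integers, the receiver recovers exactly the modified approximation coefficients; a floating-point DWT would require a quantization step at this point.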
Algorithm 1: Embedding procedure of the proposed approach (Transmitter side)
1: Input:
2:  I: Cover image (uint8), size H × W
3:  M: Binary payload
4:  K: Secret key
5:  params: {d, DWTL, Subbands, α, Nsubcarriers, CC, interleaver}
6: Output:
7:  Istego: Stego image (uint8), size H × W
8: Mcrc ← CRC.APPEND(M)
9: Mfec ← FEC.ENCODE_CC(Mcrc, CC = (1, 2, 7))
10: Xofdm ← OFDM.MODULATE(MAP_TO_SYMBOLS(INTERLEAVE(Mfec, interleaver)), N = Nsubcarriers)
11: Lbits, Tbits ← SPLIT(Xofdm, ratio = rspatial:rwavelet)
12: C0 ← DWT.DECOMPOSE(I, levels = DWTL)
13: Ω ← SELECT_COEFF_INDICES(C0, Subbands, policy = “mid-band, energy-capped”, seed = rand(K, “dwtidx”), |Tbits|)
14: For (i, idx) in enumerate(Ω) do
15:  c ← C0[idx]
16:  b ← Tbits[i]
17:  C0[idx] ← c + α · SIGN_FOR_EMBED(b, policy = “LSB-on-coeff”)
18: End For
19: I′ ← DWT.RECONSTRUCT(C0)
20: Π ← SELECT_PIXEL_INDICES(I′, strategy = “blue-channel-first or grayscale-scan”, seed = rand(K, “LSBidx”), count = ceil(|Lbits|/d))
21: j ← 0
22: For each p in Π do
23:  seg ← Lbits[j:j + d]
24:  v ← PIXEL_VALUE(I′, p)
25:  vnew ← (v & ~((1 << d) − 1)) | BITS_TO_INT(seg)
26:  SET_PIXEL(I′, p, vnew)
27:  j ← j + d
28:  if j ≥ |Lbits| then break
29: End For
30: Istego ← CLIP_TO_UINT8(I′)
31: assert PAYLOAD_CAPACITY_OK(I, d, |Lbits|)
32: Return Istego
33: End
Algorithm 2: Extraction procedure of the proposed approach (Receiver side)
1:Input:
2: Istego: Stego image (uint8), size H × W
3: K: Secret key
4: params: {d, DWTL, Sub bands, α, Nsubcarriers, CC, inter leaver}
5:Output:
6: Mhat: Recovered payload (binary)
7: status: {OK | CRC_FAIL}
8:C1  ← DWT.DECOMPOSE(Istego, levels = DWTL)
9:Ω  ← RESELECT_COEFF_INDICES(C1, Subbands, seed = rand(K,“dwtidx”))
10:T̂bits ← []
11:For idx in Ω do
12: c  ←  C1[idx]
13: bhat ← DETECT_BIT_FROM_COEFF(c, policy = “sign-threshold”, α)
14: APPEND(T̂bits, bhat)
15:End For
16:Π ← RESELECT_PIXEL_INDICES(strategy = “blue-channel-first or grayscale-scan”, seed = rand(K,“LSBidx”), count = ceil(EXPECTED_LSB_LENGTH/d))
17:L̂bits  ← []
18:For each p in Π do
19: v ← PIXEL_VALUE(Istego, p)
20: seg  ← INT_TO_BITS(v & ((1 << d)−1), d)
21: APPEND(L̂bits, seg)
22: if |L̂bits| ≥ EXPECTED_LSB_LENGTH then break
23:End For
24:X̂ofdm ← MERGE(L̂bits, T̂bits, ratio = rspatial:rwavelet)
25:X̂sym ← OFDM.DEMODULATE(X̂ofdm, N = Nsubcarriers)
26:Ŝ ← DEINTERLEAVE(X̂sym, π = BUILD_INTERLEAVER(rand(K,“π”)))
27:M̂fec  ← FEC.DECODE_CC(Ŝ, CC = (1, 2, 7))
28:Mhat, ok ← CRC.CHECK_AND_STRIP(M̂fec)
29:IF ok = TRUE then
30: Return Mhat, OK
31:else
32: Return Mhat, CRC_FAIL
33:End IF
34:End
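The pixel-domain read-back of lines 18–23 is the exact inverse of the embedding step. A minimal NumPy sketch (ours, for illustration; scan-order pixel visiting is assumed in place of the key-selected set Π):

```python
import numpy as np

def extract_lsb(pixels: np.ndarray, n_bits: int, d: int) -> np.ndarray:
    """Read the d least significant bits back out of each visited pixel."""
    n_px = -(-n_bits // d)                         # ceil(n_bits / d) pixels needed
    vals = pixels[:n_px] & ((1 << d) - 1)          # keep only the d LSBs (line 20)
    bits = ((vals[:, None] >> np.arange(d - 1, -1, -1)) & 1).ravel()
    return bits[:n_bits]
```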

4. Experimental Evaluation

4.1. Simulation Setup

This section investigates the performance of the two proposed secured models (multimedia data/audio) over the AWGN channel. Table 2 provides the tested samples used for evaluating the two models under different wireless channel conditions, with and without error correction using convolutional encoding. The samples comprise a medical image of size 256 × 256, three natural/texture images, and five audio files (secret and cover) with different bit rates and sample rates: a .wav file used as a cover (bit rate = 705 kbps, sample rate = 44.1 kHz), an .mp3 file used as a cover (bit rate = 63 kbps, sample rate = 44.1 kHz), and the embedded secrets. Table 3 provides the simulation parameters used for evaluating the two models with different modulations over various wireless communication channels. The correlation coefficient (Cr) is used as the performance metric to assess the efficiency of the proposed models.
The correlation coefficient measures the correlation value between the reconstructed and the original audio, where a higher correlation coefficient means that the reconstructed audio has less difference from the original audio. This measure is given mathematically as follows.
$$ C_r\bigl(W, \hat{W}\bigr) = \frac{\sum_{y} W(y)\,\hat{W}(y)}{\sqrt{\sum_{y} W^{2}(y)\,\sum_{y} \hat{W}^{2}(y)}}, $$
where W and W ^ are the original and reconstructed audios, respectively.
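The metric follows directly from the definition above; a minimal NumPy sketch (names ours):

```python
import numpy as np

def corr_coeff(w: np.ndarray, w_hat: np.ndarray) -> float:
    """Normalized correlation between the original signal w and the
    reconstruction w_hat; 1.0 means the two have identical shape."""
    num = float(np.sum(w * w_hat))
    den = float(np.sqrt(np.sum(w ** 2) * np.sum(w_hat ** 2)))
    return num / den                 # undefined if either signal is all-zero
```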
The comparison between the stego audio and the cover audio is based on the PSNR metric, expressed in decibels (dB). PSNR is related to imperceptibility: the higher the PSNR, the smaller the distortion introduced by embedding and the higher the imperceptibility of the stego audio.
$$ PSNR(\mathrm{dB}) = 10 \log_{10}\!\left( \frac{255^{2}}{\frac{1}{M \times N}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1}\bigl(A_{w}(x,y) - A(x,y)\bigr)^{2}} \right), $$
where A(x, y) is the original signal, Aw(x, y) is the reconstructed signal, and M and N are the width and height of the sample matrix, respectively.
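A direct NumPy sketch of the metric (ours; the peak value 255 assumes 8-bit samples, as in the equation above):

```python
import numpy as np

def psnr_db(original: np.ndarray, reconstructed: np.ndarray,
            peak: float = 255.0) -> float:
    """PSNR in dB per the equation above."""
    diff = reconstructed.astype(np.float64) - original.astype(np.float64)
    mse = float(np.mean(diff ** 2))          # the (1/(M*N)) double sum
    return 10.0 * np.log10(peak ** 2 / mse)  # diverges as mse -> 0
```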

4.2. Key Parameters Justifications

The performance of the proposed hybrid LSB–DWT scheme designed with channel coding and OFDM modulation is greatly dependent on certain design parameters. For transparency and reproducibility, this subsection provides the reasons for selecting the three significant parameters: the embedding depth in the LSB stage, the convolutional coding setup, and the OFDM subcarriers count.
(1)
LSB embedding depth
The embedding depth determines the extent of alteration of each pixel’s least significant bit during data hiding. Shallow embedding (e.g., 1–4 LSBs) ensures high perceptual imperceptibility but restricts payload capacity, whereas deeper embedding allows increased payload at the expense of distortion. Preliminary investigations indicated that 8-LSB embedding offers PSNR higher than 40 dB, imperceptible to the eye, while increasing payload capacity. This tradeoff between capacity and invisibility motivated the choice of 8-LSB depth.
(2)
Convolutional code (1, 2, 7)
Error-correction codes enhance robustness by reconstructing information in noisy, compressed, or channel-degraded environments. The (1, 2, 7) generator polynomial-based convolutional code was selected due to its proven error correction capability in wireless communication protocols and digital watermarking research. The code is capable of correcting burst errors with low complexity in decoding, making it useful in real-time systems.
(3)
Number of OFDM subcarriers
The number of OFDM subcarriers influences robustness as well as computational complexity. A larger number of subcarriers improves resistance to multipath fading and compression artifacts but raises computational complexity. The chosen configuration is a fair trade-off between frequency diversity for robustness and computational feasibility. Moreover, the selection corresponds to subcarrier configurations widely used in the IEEE 802.11 (Wi-Fi) and LTE standards, making the scheme compatible with realistic communication system parameters.
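The (1, 2, 7) code of item (2) can be sketched compactly. Note that the paper does not state its generator polynomials; the pair (133, 171) in octal used below is an assumption, chosen because it is the de facto standard for rate-1/2, constraint-length-7 codes (e.g., in IEEE 802.11).

```python
# Rate-1/2, constraint-length-7 convolutional encoder, i.e., a (1, 2, 7) code.
# Generator pair (133, 171) octal is ASSUMED (standard choice; not stated in
# the paper).
G = (0o133, 0o171)

def conv_encode(bits, generators=G, K=7):
    """Shift each input bit into a K-bit register and emit one parity bit per
    generator polynomial: 1 data bit -> 2 coded bits (code rate k/n = 1/2)."""
    state = 0
    coded = []
    for b in bits:
        state = ((state << 1) | b) & ((1 << K) - 1)      # newest bit on the right
        for g in generators:
            coded.append(bin(state & g).count("1") % 2)  # XOR of tapped bits
    return coded
```

The doubling of the bit count is the redundancy that buys coding gain at the cost of bandwidth efficiency, as discussed above.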

4.3. Performance Investigation of the Two Secured Models over AWGN

In this section, the performance of the two proposed models is evaluated over the AWGN channel without error correction (FEC), as shown in Table 4 and Table 5, respectively, for different SNRs. As the results in Table 4 and Table 5 clearly show, the performance was poor, so an error correction method was needed. Convolutional encoding is widely used for error correction in communication systems and was therefore adopted, as shown in Figure 4. The considered convolutional code (1, 2, 7) follows the (k, n, K) notation, also written as (k/n, K). The code rate k/n determines the number of data bits per coded bit, and the integer parameter K defines the constraint length.
Convolutional coding has memory in the encoder, which makes it suitable for encoding long sequences of information symbols in serial form. It handles burst errors well and is usually preferred in non-systematic form. Its performance is determined by the code rate and the constraint length: a longer constraint length yields a more powerful code and more coding gain, while a smaller code rate (k/n) yields a more powerful code through extra redundancy at the cost of bandwidth efficiency.
It is clear from Table 6 and Table 7 that convolutional encoding performed well in transmitting secret multimedia data embedded in a cover audio file. Without error correction, the quality of the received and reconstructed secrets (audio and text) from the two secured models, shown in Table 4 and Table 5, was poor at SNR = 5 dB. Model 1 outperformed Model 2 in transmitting audio steganography, and convolutional encoding was applied to enhance the performance of both models. With convolutional coding, Model 1 still achieved higher performance than Model 2, as is clear from the reconstructed multimedia data; the coded results for the two secured models are given in Table 6 and Table 7.
Comparing secured Model 1 with and without error correction at SNR = 5 dB (Table 4 and Table 6), the Cr of the received audio, extracted secret audio, and extracted secret image increased by 185.78%, 23.15%, and 1.29%, respectively, and the Cr of the extracted text equaled one. Comparing secured Model 2 with and without error correction at SNR = 5 dB (Table 5 and Table 7), the Cr of the received audio, extracted secret audio, and extracted secret image increased by 227.5%, 128.5%, and 23.3%, respectively, and the Cr of the extracted text equaled one.
The previous results show that convolutional encoding had a stronger effect on secured Model 2 than on secured Model 1. Figure 5 and Figure 6 compare the two proposed models; the first proposed model performed better than the second over the AWGN channel. As shown in Figure 5a, when the audio steganography from secured Model 1 was transmitted without error correction, the quality of the reconstructed multimedia data was good at SNR ≥ 10 dB, where the value of Cr was close to 1. When secured Model 1 was transmitted with an error correction scheme (convolutional encoding), the quality of the reconstructed multimedia data was good at SNR ≥ 5 dB, where the value of Cr was close to 1, as shown in Figure 5b.
On the other hand, in Figure 6a, the audio steganography of secured Model 2 was transmitted without error correction. The quality of the reconstructed multimedia data was good at SNR ≥ 10 dB, although the Cr of the extracted text reached 1 at SNR ≥ 5 dB. When secured Model 2 was transmitted with an error correction scheme (convolutional encoding), the quality of the reconstructed multimedia data was good at SNR ≥ 5 dB, where the value of Cr was close to 1, as shown in Figure 6b. Based on these results, secured Model 1 is considered better than secured Model 2.

5. Performance Evaluation Using OFDM System

This section evaluates the performance of the proposed models using the OFDM system in terms of their suitability and applicability. OFDM was adopted to overcome the harmful effects of multipath fading while supporting high-data-rate transmission. In OFDM, the data are transmitted through subcarriers; because of their orthogonality, interference is reduced and successful detection can be achieved. To evaluate the performance and behavior of the two proposed security models, different wireless channels (AWGN, Rayleigh fading, and Stanford University Interim (SUI-6)) [26,27] were applied to them through an OFDM system. A comparison between using OFDM modulation only and merging two modulations (OFDM-DPSK) is presented below.
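The OFDM view described above (parallel subcarriers via the IFFT, plus a cyclic prefix to absorb multipath delay spread) can be sketched as follows. This is an illustrative sketch; the paper's exact subcarrier count and cyclic-prefix length are not fixed here, and the function names are ours.

```python
import numpy as np

def ofdm_modulate(symbols: np.ndarray, cp_len: int) -> np.ndarray:
    """Place one block of data symbols on orthogonal subcarriers (IFFT)
    and prepend a cyclic prefix."""
    body = np.fft.ifft(symbols)
    return np.concatenate([body[-cp_len:], body])   # CP = last cp_len samples

def ofdm_demodulate(rx: np.ndarray, n_sub: int, cp_len: int) -> np.ndarray:
    """Discard the cyclic prefix and recover the subcarrier symbols (FFT)."""
    return np.fft.fft(rx[cp_len : cp_len + n_sub])
```

Over an ideal channel the FFT exactly inverts the IFFT, which is the orthogonality property the text relies on; channel distortion appears as per-subcarrier scaling that the DPSK stage (below) sidesteps.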

5.1. Performance Evaluation Using OFDM over AWGN Channel

It was necessary to examine the two proposed secured models through the OFDM system over the AWGN channel. At first, the two secured models were evaluated using OFDM modulation only, with the Cr metric used to evaluate the quality of the reconstructed multimedia data. Figure 7 and Figure 8 present the behavior of secured Models 1 and 2, and the extracted text is presented in Table 8. In our tests using OFDM modulation, secured Model 1 was found to be slightly better than secured Model 2.
The performance of the two secured models through the OFDM system could be enhanced by applying an error correction scheme (convolutional encoding) and merging OFDM and DPSK modulations. Table 9 and Table 10 summarize the performance of the proposed secure models with the combined OFDM and DPSK modulations. Table 11 shows the text extracted from a medical image using secured Model 1. The Cr coefficient was used for the evaluation, and the enhanced OFDM system gave both secured models a better assessment than OFDM modulation alone. At SNR = 5 dB over the AWGN channel, the experimental results for secured Model 1 showed that the Cr of the received audio, extracted secret audio, and extracted secret image increased by 43.8%, 3.7%, and 3.13%, respectively, and the extracted text was better, compared with the results obtained with secured Model 2.
Figure 9 and Figure 10 present the behavior of the secured models using the enhanced OFDM system. It is shown that the Cr of reconstructed multimedia data from secured Model 1 was improved over that of reconstructed multimedia data from secured Model 2 at SNR = 0 dB. In other words, the first secured model was found to be more resistant to AWGN channel noise through the enhanced OFDM than Model 2.
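The DPSK stage merged with OFDM above carries each bit in the phase difference between consecutive symbols, so the receiver needs no absolute phase reference. A minimal DBPSK sketch (ours; the leading reference symbol +1 is an assumption):

```python
import numpy as np

def dbpsk_encode(bits: np.ndarray) -> np.ndarray:
    """Differential BPSK: a 1 flips the phase, a 0 keeps it."""
    phases = np.cumsum(bits) % 2        # running phase state after each bit
    return 1.0 - 2.0 * phases           # phase 0 -> +1, phase 1 -> -1

def dbpsk_decode(symbols: np.ndarray) -> np.ndarray:
    """Recover bits from the phase difference of consecutive symbols."""
    prev = np.concatenate([[1.0], symbols[:-1]])   # assumed +1 reference
    return (np.real(symbols * np.conj(prev)) < 0).astype(int)
```

Because only the symbol-to-symbol phase change matters, a slowly varying channel rotation cancels in the product symbols * conj(prev), which is why the differential scheme helps over the fading and SUI-6 channels evaluated next.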

5.2. Performance Evaluation Using OFDM over Fading Channel

The aim of using OFDM is to compensate for errors resulting from multipath fading in high-data-rate transmission, where the data are transmitted simultaneously over subcarriers. Reduced interference and successful detection are achieved through subcarrier orthogonality over multipath and noisy channels [9]. This section evaluates the performance of transmitting the two proposed audio steganography models over a fading channel (Jakes model) at two different positions. As a result of the destructive effect of the multipath fading channel, the behavior of secured Model 1 and secured Model 2 was extremely poor with OFDM modulation only, as shown in Figure 11 and Figure 12, respectively. To improve the behavior of the two secure models over a fading channel, the enhanced OFDM system (merging OFDM and DPSK modulations with convolutional encoding) had to be applied, as shown in Figure 13 and Figure 14. Figure 13 represents the behavior of secured Model 1, and Figure 14 represents the behavior of secured Model 2.
The performance of the first model, based on the LSB technique, was better than that of the second proposed model, based on DWT and LSB, at SNR = 0 dB. The extracted text in Table 12 shows the performance of secured Model 1 with the enhanced OFDM system; the extracted text was recovered correctly at SNR ≥ 5 dB. The experimental results in Table 13 show the performance of the two secured models with enhanced OFDM over a fading channel at SNR = 5 dB. Both secured models worked well with the improved OFDM system (merging OFDM and DPSK modulations with convolutional encoding) over a fading channel.
The Cr of the received audio, extracted secret audio, and extracted secret image using Model 1 increased by 11.18%, 2.8%, and 11.2%, respectively, compared to secured Model 2, so secured Model 1 had higher immunity to noise than secured Model 2 over a fading channel. Model 1, based on LSB, achieved higher immunity to noise; however, using the DWT transform in Model 2 increased the security of the transmitted audio steganography, as shown in Figure 10 and Figure 14. Thus, secure Model 2 is preferred when high security is the priority.

5.3. Performance Evaluation Using OFDM over SUI6 Channel

This section describes the use of the third wireless channel, SUI-6 (Stanford University Interim), to evaluate the performance of the two proposed secure models in an OFDM system at various SNRs and at two different positions. The behavior of secured Model 1 and secured Model 2 using OFDM modulation over the SUI-6 channel is presented in Figure 15 and Figure 16, respectively. The noise of the SUI-6 channel degraded both secured models, as is clear from the Cr of the reconstructed multimedia data. After applying the enhanced OFDM system, the performance of the two secure models improved, as shown in Figure 17 and Figure 18.
As shown in Figure 17, the Cr of the stego image (secret image with text) equaled 0.3 at SNR = 0 dB, and the Cr of the reconstructed multimedia data was close to 1 at SNR ≥ 5 dB. However, secured Model 2 with the enhanced OFDM system produced badly degraded received audio (Cr = 0) over the SUI-6 channel, as shown in Figure 18, although the extracted text was recovered correctly at SNR ≥ 10 dB. In this comparison over the SUI-6 channel, secured Model 1 worked better than secured Model 2. Table 14 provides the quality of the reconstructed images produced by audio steganography Model 1 over the AWGN, fading, and SUI-6 channels using different metrics (Cr, BER, and PSNR).

5.4. Attack Resistance Investigation

The security of a watermarking or data-hiding system is typically quantified in terms of resistance to detection, survivability against malicious or unintentional attacks, and theoretical adherence to security frameworks. This section analyzes the suggested hybrid LSB–DWT solution in terms of these parameters.
A steganographic system is secure if the statistical behavior of stego-objects is indistinguishable from that of cover objects. On this basis, the hybrid LSB–DWT method improves undetectability by distributing modifications across the spatial and frequency domains, minimizing the statistical fluctuations that could reveal hidden information. The security of a system cannot rest on the secrecy of the algorithm, but rather on the secrecy of the key [28]. The proposed method adheres to this principle: embedding points in both the LSB and DWT domains are governed by a secret key, so the system remains resilient even if the embedding process is revealed. Below, we investigate the performance of the proposed approach against common attacks.
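The key-governed selection of embedding positions can be sketched with a keyed pseudorandom generator: both sides derive the same index set from the shared key, while an attacker without the key cannot reproduce it. This is a hypothetical sketch of the rand(K, tag) seeding used in Algorithms 1 and 2; the derivation via SHA-256 is our illustrative choice, not the paper's.

```python
import hashlib
import numpy as np

def keyed_indices(n_total: int, n_needed: int, key: bytes, tag: str) -> np.ndarray:
    """Derive a reproducible, key-dependent set of embedding positions.
    Separate tags (e.g., "dwtidx", "LSBidx") give independent index sets
    from one key."""
    digest = hashlib.sha256(key + tag.encode()).digest()
    seed = int.from_bytes(digest[:8], "big")     # 64-bit seed from the key
    rng = np.random.default_rng(seed)
    return rng.choice(n_total, size=n_needed, replace=False)
```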
(1)
Statistical detection attacks
LSB techniques are prone to histogram analysis, chi-square testing, and RS steganalysis. Placing half of the data in the DWT domain reduces statistical regularities in the spatial domain, considerably increasing undetectability.
(2)
Differential attacks
Differential steganalysis involves comparing the cover and modified media to identify the modifications. Information in the proposed scheme is distributed between the LSB and DWT domains, and therefore, manipulations in only one domain (e.g., pixel-level differencing) cannot expose all the hidden information. This two-domain distribution increases protection from differential attacks [29].
(3)
Compression attacks
JPEG compression is a common real-world attack that reduces redundancy and can destroy spatial-domain LSB embeddings. Embedding in selected DWT sub-bands gives some resilience, since low-frequency terms remain stable under compression.
(4)
Filtering and noise attacks
Spatial filtering (e.g., Gaussian blurring) and additive noise can disrupt LSB embedding. However, wavelet-domain embeddings, especially within mid- and low-frequency sub-bands, are more robust, allowing recovery of data under tolerable distortions.
(5)
Geometric attacks
Most watermarking schemes face challenges with rotation, cropping, and scaling. Although hybrid approaches are not completely exempt from this limitation, the inclusion of DWT coefficients mitigates it to a considerable degree, as transform-domain features are relatively robust to geometric transformations [30].
The hybrid method offers enhanced security through the combination of the high embedding capacity and stealth of LSB with DWT’s power. In adherence to the theoretical security frameworks, the proposed method offers the following.
  • Better undetectability against statistical analysis.
  • Security against differential and compression attacks.
  • Resilience against common noise and filtering processes.
  • A good balance between efficiency, strength, and security.
Table 15 provides a theoretical comparative security and attack resistance analysis of LSB, DWT, and hybrid schemes. Future work will extend this analysis through empirical testing against advanced machine learning-based steganalysis, representing the state of the art in attack models.

5.5. Discussion

Digital watermarking and steganographic methods are generally evaluated based on capacity, imperceptibility, robustness, and computational complexity. Single-domain methods, such as LSB substitution and DWT-based embedding, each have their strengths and weaknesses. The motivation to combine LSB and DWT comes from the desire to take advantage of their complementary strengths and diminish their weaknesses. LSB substitution operates directly in the spatial domain by modifying the least significant bits of the pixel values. This gives very high embedding capacity and imperceptible distortion, but the method is not robust to image processing operations such as compression, filtering, and noise addition. It is, however, straightforward and thus computationally efficient.
Transform-domain techniques, such as DWT, embed data in frequency coefficients. These approaches are more robust to compression, geometric attacks, and filtering due to frequency localization. They nonetheless possess lower embedding capacity compared to LSB and are computationally more demanding. The novel hybrid method embeds a portion of the data in the DWT coefficients to exploit robustness, along with LSB replacement in the spatial domain for increased embedding capacity and imperceptibility. The fusion balances these trade-offs, providing a more favorable robustness–capacity–imperceptibility trade-off profile with low computational overhead. Table 16 provides a comparative analysis of the LSB, DWT, and hybrid LSB–DWT approaches.
In previous works, researchers designed different audio steganography techniques without considering wireless channels; these techniques concealed multimedia data in an audio cover. Recently, some audio steganography techniques have been evaluated over an AWGN wireless channel, as clarified in Table 17. The two proposed secure models achieve high embedding capacity for different types of concealed multimedia data (text, secret image, secret audio). As a result of the comparative study in Table 17, secured Model 1 was found to have excellent performance for the reconstructed secret image, secret audio, and cover audio.

6. Conclusions

This work proposed two secure models of audio steganography, aiming to embed multimedia secret data (text, image, and audio) in a cover audio file with high confidentiality. The two proposed models are based on the LSB and DWT methods, the most common types of audio steganography. In the second part of this research, convolutional encoding (1, 2, 7) was applied to the transmitted signal to enhance the transmission process and make it more immune to noise. Different modulation cases (BPSK, OFDM, OFDM + DPSK) were applied to the two transmitted secure models over various wireless channel models to characterize their quality and immunity to noise. Merging the OFDM and DPSK modulations with convolutional encoding improved the transmitted audio steganography of both secured models over the various wireless channels (AWGN, fading, and SUI-6). To obtain higher quality audio steganography, it is preferred to use Model 1, based on the LSB method. To achieve higher security with lower quality, it is preferred to use Model 2, based on the LSB and DWT methods.

Author Contributions

Conceptualization, A.A.H., A.A.E., M.I.A., M.E., A.A.S.A., A.A.A., and R.A.E.; methodology, A.A.H., A.A.E., M.I.A., M.E., A.A.S.A., A.A.A., and R.A.E.; formal analysis, A.A.H., A.A.E., M.I.A., M.E., A.A.S.A., A.A.A., and R.A.E.; investigation, A.A.H., A.A.E., M.I.A., A.A.A., and R.A.E.; resources, A.A.H., A.A.E., M.I.A., and R.A.E.; writing—original draft preparation, A.A.H., A.A.E., M.I.A., and R.A.E.; writing—review and editing, M.E., A.A.S.A., and A.A.A.; supervision, M.I.A.; project administration, R.A.E.; funding acquisition, A.A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Prince Sultan University.

Data Availability Statement

The data are contained within the article and/or available from the corresponding author upon reasonable request.

Acknowledgments

The authors would like to thank Prince Sultan University for paying the Article Processing Charges (APC) for this work.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wendzel, S.; Caviglione, L.; Mazurczyk, W.; Mileva, A.; Dittmann, J.; Krätzer, C.; Lamshöft, K.; Vielhauer, C.; Hartmann, L.; Keller, J.; et al. A Generic Taxonomy for Steganography Methods. ACM Comput. Surv. 2025, 57, 1–37. [Google Scholar] [CrossRef]
  2. Nasr, M.A.; El-Shafai, W.; El-Rabaie, E.-S.M.; El-Fishawy, A.S.; El-Hoseny, H.M.; Abd El-Samie, F.E.; Abdel-Salam, N. A Robust Audio Steganography Technique Based on Image Encryption Using Different Chaotic Maps. Sci. Rep. 2024, 14, 22054. [Google Scholar] [CrossRef] [PubMed]
  3. Aravind Krishnan, A.; Ramesh, Y.; Urs, U.; Arakeri, M. Audio-in-Image Steganography Using Analysis and Resynthesis Sound Spectrograph. IEEE Access 2025, 13, 75184–75193. [Google Scholar] [CrossRef]
  4. Nasr, M.A.; El-Shafai, W.; El-Rabaie, E.-S.M.; El-Fishawy, A.S.; Abdel-Salam, N.; El-Samie, F.E.A. Robust and Secure Systems for Audio Signals. J. Electr. Syst. Inf. Technol. 2025, 12, 31. [Google Scholar] [CrossRef]
  5. Su, W.; Ni, J.; Hu, X.; Li, B. Efficient Audio Steganography Using Generalized Audio Intrinsic Energy with Micro-Amplitude Modification Suppression. IEEE Trans. Inf. Forensics Secur. 2024, 19, 6559–6572. [Google Scholar] [CrossRef]
  6. Helmy, M. Audio Transmission Based on Hybrid Crypto-Steganography Framework for Efficient Cyber Security in Wireless Communication System. Multimed. Tools Appl. 2024, 84, 18893–18917. [Google Scholar] [CrossRef]
  7. Fadhil, A.M.; Jaber, A.Y. Securing Communication Channels: An Advanced Steganography Approach with Orthogonal Frequency Division Multiplexing (OFDM). J. Electr. Comput. Eng. 2025, 2025, 2468585. [Google Scholar] [CrossRef]
  8. Aldababsa, M.; Özyurt, S.; Kurt, G.K.; Kucur, O. A Survey on Orthogonal Time Frequency Space Modulation. IEEE Open J. Commun. Soc. 2024, 5, 4483–4518. [Google Scholar] [CrossRef]
  9. Huang, G.; Zhang, K.; Zhang, Y.; Liao, K.; Jin, S.; Ding, Y. Orthogonal Frequency Division Multiplexing Directional Modulation Waveform Design for Integrated Sensing and Communication Systems. IEEE Internet Things J. 2024, 11, 29588–29599. [Google Scholar] [CrossRef]
  10. Jiang, S.; Wang, W.; Miao, Y.; Fan, W.; Molisch, A.F. A Survey of Dense Multipath and Its Impact on Wireless Systems. IEEE Open J. Antennas Propag. 2022, 3, 435–460. [Google Scholar] [CrossRef]
  11. Mahmood, A.; Khan, S.; Hussain, S.; Zeeshan, M. Performance Analysis of Multi-User Downlink PD-NOMA under SUI Fading Channel Models. IEEE Access 2021, 9, 52851–52859. [Google Scholar] [CrossRef]
  12. Imoize, A.L.; Ibhaze, A.E.; Atayero, A.A.; Kavitha, K.V.N. Standard Propagation Channel Models for MIMO Communication Systems. Wirel. Commun. Mob. Comput. 2021, 2021, 8838792. [Google Scholar] [CrossRef]
  13. Li, Y.; Chen, K.; Wang, Y.; Zhang, X.; Wang, G.; Zhang, W.; Yu, N. CoAS: Composite Audio Steganography Based on Text and Speech Synthesis. IEEE Trans. Inf. Forensics Secur. 2025, 20, 5978–5991. [Google Scholar] [CrossRef]
  14. Khan, S.; Abbas, N.; Nasir, M.; Haseeb, K.; Saba, T.; Rehman, A.; Mehmood, Z. Steganography-Assisted Secure Localization of Smart Devices in Internet of Multimedia Things (IoMT). Multimed. Tools Appl. 2021, 80, 17045–17065. [Google Scholar] [CrossRef]
  15. Almomani, I.; Alkhayer, A.; El-Shafai, W. A Crypto-Steganography Approach for Hiding Ransomware within HEVC Streams in Android IoT Devices. Sensors 2022, 22, 2281. [Google Scholar] [CrossRef] [PubMed]
  16. Wang, J.; Wang, K. A Novel Audio Steganography Based on the Segmentation of the Foreground and Background of Audio. Comput. Electr. Eng. 2025, 123, 110026. [Google Scholar] [CrossRef]
  17. Ali, B.Q. Covert Voip Communication Based on Audio Steganography. Int. J. Comput. Digit. Syst. 2022, 11, 821–830. [Google Scholar] [CrossRef]
  18. Hameed, A.S. A High Secure Speech Transmission Using Audio Steganography and Duffing Oscillator. Wirel. Pers. Commun. 2021, 120, 499–513. [Google Scholar] [CrossRef]
  19. Mahmoud, M.M.; Elshoush, H.T. Enhancing LSB Using Binary Message Size Encoding for High Capacity, Transparent and Secure Audio Steganography—An Innovative Approach. IEEE Access 2022, 10, 29954–29971. [Google Scholar] [CrossRef]
  20. Kasetty, P.K.; Kanhe, A. Covert Speech Communication through Audio Steganography Using DWT and SVD. In Proceedings of the 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kharagpur, India, 1–3 July 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–5. [Google Scholar]
  21. Abdulrazzaq, S.T.; Siddeq, M.M.; Rodrigues, M.A. A Novel Steganography Approach for Audio Files. SN Comput. Sci. 2020, 1, 97. [Google Scholar] [CrossRef]
  22. Gao, S.; Ding, S.; Ho-Ching Iu, H.; Erkan, U.; Toktas, A.; Simsek, C.; Wu, R.; Xu, X.; Cao, Y.; Mou, J. A Three-Dimensional Memristor-Based Hyperchaotic Map for Pseudorandom Number Generation and Multi-Image Encryption. Chaos 2025, 35, 073105. [Google Scholar] [CrossRef]
  23. Gao, S.; Ho-Ching Iu, H.; Erkan, U.; Simsek, C.; Toktas, A.; Cao, Y.; Wu, R.; Mou, J.; Li, Q.; Wang, C. A 3D Memristive Cubic Map with Dual Discrete Memristors: Design, Implementation, and Application in Image Encryption. IEEE Trans. Circuits Syst. Video Technol. 2025, 35, 7706–7718. [Google Scholar] [CrossRef]
  24. Saranya, S.S.; Reddy, P.L.C.; Prasanth, K. Digital Audio Steganography Using LSB and RC7 Algorithms for Security Applications. In AIP Conference Proceedings, Proceedings of the 4th International Conference on Internet of Things 2023: ICIoT2023, Kattankalathur, India, 26–28 April 2023; AIP Publishing: New York, NY, USA, 2024; Volume 3075, p. 020083. [Google Scholar]
  25. Singha, A.; Ullah, M.A. Development of an Audio Watermarking with Decentralization of the Watermarks. J. King Saud Univ.—Comput. Inf. Sci. 2022, 34, 3055–3061. [Google Scholar] [CrossRef]
  26. Anwar, M.; Sarosa, M.; Rohadi, E. Audio Steganography Using Lifting Wavelet Transform and Dynamic Key. In Proceedings of the 2019 International Conference of Artificial Intelligence and Information Technology (ICAIIT), Yogyakarta, Indonesia, 13–15 March 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 133–137. [Google Scholar]
  27. Indrayani, R. Modified LSB on Audio Steganography Using WAV Format. In Proceedings of the 2020 3rd International Conference on Information and Communications Technology (ICOIACT), Yogyakarta, Indonesia, 24–25 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 466–470. [Google Scholar]
  28. Anitha, M.; Azhagiri, M. Uncovering the Secrets of Stegware: An in-Depth Analysis of Steganography and Its Evolving Threat Landscape. In Human Machine Interaction in the Digital Era; CRC Press: London, UK, 2024; pp. 277–282. ISBN 9781003428466. [Google Scholar]
  29. Biryukov, A. Impossible Differential Attack. In Encyclopedia of Cryptography, Security and Privacy; Springer Nature: Cham, Switzerland, 2025; pp. 1188–1189. ISBN 9783030715205. [Google Scholar]
  30. Su, Q.; Liu, D.; Sun, Y. A Robust Adaptive Blind Color Image Watermarking for Resisting Geometric Attacks. Inf. Sci. 2022, 606, 194–212. [Google Scholar] [CrossRef]
  31. Kasban, H.; Nassar, S.; El-Bendary, M.A.M. Medical Images Transmission over Wireless Multimedia Sensor Networks with High Data Rate. Analog Integr. Circuits Signal Process. 2021, 108, 125–140. [Google Scholar] [CrossRef]
Figure 1. Description of the audio steganography process.
Figure 1. Description of the audio steganography process.
Jsan 14 00106 g001
Figure 2. Embedding and extracting of Model 1.
Figure 2. Embedding and extracting of Model 1.
Jsan 14 00106 g002
Figure 3. Embedding and extracting of Model 2.
Figure 3. Embedding and extracting of Model 2.
Jsan 14 00106 g003
Figure 4. Transmitting the two models over an AWGN channel.
Figure 4. Transmitting the two models over an AWGN channel.
Jsan 14 00106 g004
Figure 5. (a) Transmitting multimedia data without FEC using Model 1 over an AWGN channel. (b) Transmitting multimedia data with FEC using Model 1 over an AWGN channel.
Figure 5. (a) Transmitting multimedia data without FEC using Model 1 over an AWGN channel. (b) Transmitting multimedia data with FEC using Model 1 over an AWGN channel.
Jsan 14 00106 g005
Figure 6. (a) Transmitting multimedia data without FEC using Model 2 over an AWGN channel, (b) Transmitting multimedia data with FEC using Model 2 over an AWGN channel.
Figure 6. (a) Transmitting multimedia data without FEC using Model 2 over an AWGN channel, (b) Transmitting multimedia data with FEC using Model 2 over an AWGN channel.
Jsan 14 00106 g006
Figure 7. Cr of the reconstructed multimedia data from Model 1 over the AWGN channel using OFDM modulation.
Figure 7. Cr of the reconstructed multimedia data from Model 1 over the AWGN channel using OFDM modulation.
Jsan 14 00106 g007
Figure 8. Cr of the reconstructed multimedia data from Model 2 over the AWGN channel by using OFDM modulation.
Figure 8. Cr of the reconstructed multimedia data from Model 2 over the AWGN channel by using OFDM modulation.
Jsan 14 00106 g008
Figure 9. Cr of the reconstructed multimedia data from Model 1 over an AWGN channel using (OFDM + DPSK) modulation.
Figure 10. Cr of the reconstructed multimedia data from Model 2 over an AWGN channel using (OFDM + DPSK) modulation.
Figure 11. Cr of the reconstructed multimedia data from Model 1 over a fading channel through OFDM modulation.
Figure 12. Cr of the reconstructed multimedia data from Model 2 over a fading channel through OFDM modulation.
Figure 13. Cr of the reconstructed multimedia data from Model 1 over a fading channel using (OFDM + DPSK) modulation.
Figure 14. Cr of the reconstructed multimedia data from Model 2 over a fading channel using (OFDM + DPSK) modulation.
Figure 15. Cr of the reconstructed multimedia data from Model 1 over the SUI-6 channel using OFDM modulation.
Figure 16. Cr of the reconstructed multimedia data from Model 2 over the SUI-6 channel using OFDM modulation.
Figure 17. Cr of the reconstructed multimedia data from Model 1 over the SUI-6 channel using (OFDM + DPSK) modulation.
Figure 18. Cr of the reconstructed multimedia data from Model 2 over the SUI-6 channel using (OFDM + DPSK) modulation.
Table 1. Comparison of existing audio steganography techniques.
Ref. | Techniques | Channels | Quality
[26] | LWT + a dynamic key for encryption | -- | Stego audio is similar to the original audio.
[27] | Message in cover audio by different LSBs | -- | BER (%) from 0.00441 to 0.00507.
[18] | Different capacities of secret audio in cover audio by contourlet transform and Duffing oscillator | -- | Cr of full-size retrieved secret speech = 0.8529; Cr of full-size stego speech = 0.9999.
[19] | Message in an audio file by LSB-BMSE (Binary Message Size Encoding) | -- | Histogram error rate for stego Jazz = 2.86 × 10−7, using a 100 KB secret message.
[20] | Secret audio in cover audio by 5th DWT + SVD | AWGN | PSNR = 34.67.
[21] | Compressed image by GMPR in an audio file using DCT + DWT | -- | RMSE for decompressed Lena image = 4.3.
[25] | Multiple image watermarks in audio by DWT + SVD | -- | Cr (host file) = 0.9682; Cr (1st watermark) = 0.9803; Cr (2nd watermark) = 0.9963; Cr (3rd watermark) = 0.9982; Cr (4th watermark) = 0.9999.
Table 2. The test samples used in the experiments.
Cover audio 1: .wav, bit rate = 705 kbps, sample rate = 44.1 kHz
Cover audio 2: .mp3, bit rate = 63 kbps, sample rate = 44.1 kHz
Secret audio 1: .wav, bit rate = 705 kbps, sample rate = 44.1 kHz
Secret audio 2: .mp3, bit rate = 8 kbps, sample rate = 8 kHz
Secret audio 3: .mp3, bit rate = 47 kbps, sample rate = 32 kHz
Secret text: “In the Name of Allah, the name of the case is Asmaa Abdelmonem Eyssa, the age is 34 years.”
Table 3. Simulation setting parameters for evaluating two models over various wireless communications channels.
Parameter | Value
Doppler shift frequency | 250 Hz
Wireless communication channels | AWGN; Rayleigh fading; SUI-6 model
Coding | Convolutional encoding (1, 2, 7)
Modulation | BPSK; OFDM (128 subcarriers, 2 symbols); OFDM + DPSK
SNR | 0 to 35 dB
Performance metric | Correlation coefficient (Cr)
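As a concrete illustration of the physical-layer setting in Table 3, the following minimal NumPy sketch sends a random bit stream with BPSK over an AWGN channel at a given SNR and evaluates the BER and the correlation coefficient (Cr) of the detected bits. It is a simplified stand-in rather than the authors' simulation code: it assumes unit symbol energy, omits the convolutional encoder and OFDM stages, and the function names are ours.

```python
import numpy as np

def bpsk_over_awgn(bits, snr_db, rng):
    """Map bits to BPSK symbols (0 -> +1, 1 -> -1), add white Gaussian
    noise at the given SNR (dB, unit symbol energy), hard-detect."""
    symbols = 1.0 - 2.0 * bits
    noise_power = 10 ** (-snr_db / 10)
    noise = rng.normal(0.0, np.sqrt(noise_power / 2), bits.size)
    received = symbols + noise
    return (received < 0).astype(int)   # hard decision back to bits

def correlation_coefficient(x, y):
    """Pearson correlation coefficient (Cr), the metric in Table 3."""
    return np.corrcoef(x, y)[0, 1]

rng = np.random.default_rng(seed=1)
tx_bits = rng.integers(0, 2, 100_000)
rx_bits = bpsk_over_awgn(tx_bits, snr_db=5, rng=rng)
ber = np.mean(tx_bits != rx_bits)
cr = correlation_coefficient(tx_bits, rx_bits)
```

At 5 dB SNR, uncoded BPSK leaves a small residual BER, which is why the tables contrast results with and without convolutional encoding.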
Table 4. Reconstructed secret multimedia data without FEC using Model 1 [SNR = 5 dB].
Cr = 0.3582, BER = 0.1790 (audio.wav) | Cr = 0.4365, BER = 0.3790 (audio.wav) | Cr = 0.8031, BER = 0.2524
Extracted text = “In The” (Cr = 0, BER = 0.0619)
Cr = 0.3325, BER = 0.2206 (audio.wav) | Cr = 0.7822, BER = 0.2497 (audio.mp3) | Cr = 0.9802, BER = 0.2935
Extracted text = “In TLd”NaMd/f Slhah,!tana}e kf$thE case8is!Esmaa Qbbelmnem yssa” thE sg” (Cr = 0, BER = 0.0556)
Cr = 0.3382, BER = 0.2096 (audio.wav) | Cr = 0.9699, BER = 0.2509 (audio.mp3) | Cr = 0.8826, BER = 0.2860
Extracted text (Cr = 0, BER = 0.0571)
Cr = 0.2294, BER = 0.2206 (audio.mp3) | Cr = 0.7798, BER = 0.2498 (audio.mp3) | Cr = 0.9827, BER = 0.3061
Extracted text (Cr = 0, BER = 0.0651)
Table 5. Reconstructed secret multimedia data without FEC using Model 2 [SNR = 5 dB].
Cr = 0.2809, BER = 0.3399 (audio.wav) | Cr = 0.4365, BER = 0.3790 (audio.wav) | Cr = 0.8031, BER = 0.2524
Extracted text = “In `a(Nqme,_n a|l@hl The8oimb o`” (Cr = 0, BER = 0.1635)
Cr = 0.3158, BER = 0.3238 (audio.wav) | Cr = 0.3610, BER = 0.4864 (audio.mp3) | Cr = 0.8223, BER = 0.3226
Extracted text = “In Dhu$NalA`kF!@,jah,Bt lil” (Cr = 0, BER = 0.1524)
Cr = 0.2683, BER = 0.3601 (audio.wav) | Cr = 0.3090, BER = 0.4835 (audio.mp3) | Cr = 0.5579, BER = 0.3112
Extracted text (Cr = 0, BER = 0.1810)
Cr = 0.7795, BER = 0.3046 (audio.mp3) | Cr = 0.3706, BER = 0.4866 (audio.mp3) | Cr = 0.8207, BER = 0.3335
Extracted text (Cr = 0, BER = 0.1873)
Table 6. Reconstructed multimedia data with convolutional encoding using Model 1 [SNR = 5 dB].
Cr = 0.9934, BER = 0.1752 (audio.wav) | Cr = 0.9998, BER = 0.2413 (audio.wav) | Cr = 0.9988, BER = 0.2049
Extracted text = “In The Name of Allah, the name of the case is Asmaa Abdelmonem Eyssa the age is 34 years” (Cr = 1, BER = 0)
Cr = 0.9914, BER = 0.2173 (audio.wav) | Cr = 0.9998, BER = 0.2468 (audio.mp3) | Cr = 0.9978, BER = 0.2911
Extracted text (Cr = 1, BER = 0)
Cr = 0.9945, BER = 0.2062 (audio.wav) | Cr = 0.9818, BER = 0.2474 (audio.mp3) | Cr = 0.9856, BER = 0.2835
Extracted text (Cr = 1, BER = 0)
Cr = 0.9995, BER = 0.2171 (audio.mp3) | Cr = 0.9998, BER = 0.2468 (audio.mp3) | Cr = 0.9979, BER = 0.3038
Extracted text (Cr = 1, BER = 0)
Table 7. Reconstructed multimedia data with convolutional encoding using Model 2 [SNR = 5 dB].
Cr = 0.9013, BER = 0.3380 (audio.wav) | Cr = 0.9771, BER = 0.3601 (audio.wav) | Cr = 0.9760, BER = 0.2095
Extracted text = “In The Name of Allah, the name of the case is Asmaa Abdelmonem Eyssa the age is 34 years” (Cr = 1, BER = 0)
Cr = 0.9673, BER = 0.3217 (audio.wav) | Cr = 0.7177, BER = 0.4843 (audio.mp3) | Cr = 0.9961, BER = 0.2927
Extracted text (Cr = 1, BER = 0)
Cr = 0.8532, BER = 0.3588 (audio.wav) | Cr = 0.3538, BER = 0.4810 (audio.mp3) | Cr = 0.9844, BER = 0.2836
Extracted text (Cr = 1, BER = 0)
Cr = 0.9888, BER = 0.3024 (audio.mp3) | Cr = 0.7178, BER = 0.4848 (audio.mp3) | Cr = 0.9903, BER = 0.3061
Extracted text (Cr = 1, BER = 0)
Table 8. Extracted text from Model 1 over AWGN using OFDM modulation.
SNR | Extracted Text
0 dB | --
5 dB | --
10 dB | Inp
15 dB | In The Name.of0Allah, the name of the(cmse is Asmai Abdelmonem Eyssa0 the(age2is 34 yecrs
20 dB | In The Name!of0Allah, the name of the(cese is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
25 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
30 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
35 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
Table 9. Reconstructed multimedia data with convolutional encoding for Model 1 using (OFDM + DPSK) modulations over an AWGN channel [SNR = 5 dB].
Cr = 0.9160, BER = 0.1764 (audio.wav) | Cr = 0.9925, BER = 0.4067 (audio.wav) | Cr = 0.9925, BER = 0.3387
Extracted text (Cr = 0, BER = 0.481)
Cr = 0.8966, BER = 0.2180 (audio.wav) | Cr = 0.9935, BER = 0.2469 (audio.mp3) | Cr = 0.9974, BER = 0.2911
Extracted text (Cr = 1, BER = 0)
Cr = 0.9177, BER = 0.2068 (audio.wav) | Cr = 0.9813, BER = 0.2469 (audio.mp3) | Cr = 0.9830, BER = 0.2835
Extracted text (Cr = 1, BER = 0)
Cr = 0.9916, BER = 0.2211 (audio.mp3) | Cr = 0.9926, BER = 0.2469 (audio.mp3) | Cr = 0.9976, BER = 0.3038
Extracted text (Cr = 1, BER = 0)
Table 10. Reconstructed multimedia data with convolutional encoding for Model 2 using (OFDM + DPSK) modulation over an AWGN channel [SNR = 5 dB].
Cr = 0.6453, BER = 0.3376 (audio.wav) | Cr = 0.9580, BER = 0.3604 (audio.wav) | Cr = 0.9610, BER = 0.2121
Extracted text = “In Uhe*nume of Ullch,!the name of`vhe case`is Asmaa Abdelmonem Gyssa the age is 34 years” (Cr = 0, BER = 0.0222)
Cr = 0.8158, BER = 0.3217 (audio.wav) | Cr = 0.7086, BER = 0.4843 (audio.mp3) | Cr = 0.9939, BER = 0.2930
Extracted text = “In The Name of Allah, the name of the case is Asmaa Abdelmonem Eyssa the age is `34 years” (Cr = 0, BER = 0.0016)
Cr = 0.5486, BER = 0.3575 (audio.wav) | Cr = 0.3531, BER = 0.4811 (audio.mp3) | Cr = 0.9761, BER = 0.2839
Extracted text (Cr = 0, BER = 0.0032)
Cr = 0.9679, BER = 0.3140 (audio.mp3) | Cr = 0.7104, BER = 0.4848 (audio.mp3) | Cr = 0.9881, BER = 0.3064
Extracted text (Cr = 0, BER = 0.0016)
Table 11. Extracted text from Model 1 over AWGN using (OFDM + DPSK).
SNR | Extracted Text
0 dB | Re(;ky7NPc3VmR?#FtNQ
5 dB | In!The Name of Allah, the name of the case is Asmaa Abdelmonem Eyssa the age is 34 years
10 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
15 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
20 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
25 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
30 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
35 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
Table 12. Extracted text from Model 1 over the fading channel using (OFDM + DPSK).
SNR | Extracted Text
0 dB | gg = gK5L|kt,
5 dB | In The Name of Allah,0the name of0the case is Asmaa Abdelmonem Eyssa the(age is 34 years
10 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
15 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
20 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
25 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
30 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
35 dB | In The Name of0Allah, the name of the(case is Asmaa Qbdelmonem Eyssa 0the(age2is 34 years
Table 13. Reconstructed multimedia data with convolutional encoding using (OFDM + DPSK) modulation over a fading channel [SNR = 5 dB].
Model 1
Cr = 0.9522, BER = 0.1770 (audio.wav), PSNR = 39.3251 | Cr = 0.9984, BER = 0.2414 (audio.wav), PSNR = 42.0124 | Cr = 0.9830, BER = 0.2074, PSNR = 25.7678
Extracted text = “In!The Name of Qllah, the name of0the case is Asmaa Abdelmonem Eyssa the(age is 34 years” (Cr = 0, BER = 0.0063)
Cr = 0.9924, BER = 0.2234 (audio.mp3), PSNR = 35.7209 | Cr = 0.9988, BER = 0.2468 (audio.mp3), PSNR = 43.3394 | Cr = 0.9977, BER = 0.3038, PSNR = 35.2872
Extracted text (Cr = 1, BER = 0)
Model 2
Cr = 0.7124, BER = 0.3379 (audio.wav), PSNR = 31.6726 | Cr = 0.9511, BER = 0.3604 (audio.mp3), PSNR = 27.9513 | Cr = 0.9697, BER = 0.2107, PSNR = 23.2955
Extracted text (Cr = 0, BER = 0.0143)
Cr = 0.9716, BER = 0.3079 (audio.mp3), PSNR = 31.4113 | Cr = 0.7053, BER = 0.4848 (audio.mp3), PSNR = 18.4474 | Cr = 0.9876, BER = 0.3065, PSNR = 27.9792
Extracted text (Cr = 0, BER = 0.0048)
Table 14. Reconstructed multimedia data from Model 1 with convolutional encoding using (OFDM + DPSK) modulation over AWGN, fading, and SUI-6 channels [SNR = 6 dB].
Model 1 over an AWGN channel through an OFDM system using (OFDM + DPSK) modulation
Cr = 0.9893, BER = 0.2316 (audio.mp3), PSNR = 34.2353 | Cr = 0.9993, BER = 0.2468 (audio.mp3), PSNR = 44.9526 | Cr = 0.9979, BER = 0.3038, PSNR = 35.6745
Extracted text (Cr = 1, BER = 0)
Cr = 0.9934, BER = 0.2235 (audio.mp3), PSNR = 36.3781 | Cr = 0.9989, BER = 0.2468 (audio.mp3), PSNR = 43.6093 | Cr = 0.9978, BER = 0.3038, PSNR = 35.6745
Extracted text (Cr = 1, BER = 0)
Model 1 over the SUI-6 channel through the OFDM system using (OFDM + DPSK) modulation
Cr = 0.9917, BER = 0.2509 (audio.mp3), PSNR = 35.3370 | Cr = 0.9997, BER = 0.2468 (audio.mp3), PSNR = 46.4769 | Cr = 0.9979, BER = 0.3038, PSNR = 35.6771
Extracted text (Cr = 1, BER = 0)
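The OFDM framing used throughout Tables 8 to 14 can be sketched as follows. This is a hedged illustration rather than the authors' transmitter: it uses the 128 subcarriers from Table 3, an assumed cyclic-prefix length of 16 samples, BPSK on each subcarrier, and a noiseless channel, so the IFFT/FFT round trip recovers the bits exactly; the real simulations add DPSK, channel coding, and the AWGN/fading/SUI-6 channels.

```python
import numpy as np

N_SC = 128    # subcarriers, matching the simulation setting in Table 3
CP_LEN = 16   # cyclic-prefix length (illustrative assumption)

def ofdm_modulate(symbols):
    """Place one block of frequency-domain symbols on N_SC subcarriers,
    take the IFFT, and prepend a cyclic prefix."""
    time_block = np.fft.ifft(symbols, N_SC)
    return np.concatenate([time_block[-CP_LEN:], time_block])

def ofdm_demodulate(rx):
    """Strip the cyclic prefix and return to the frequency domain."""
    return np.fft.fft(rx[CP_LEN:], N_SC)

rng = np.random.default_rng(seed=7)
bits = rng.integers(0, 2, N_SC)
bpsk = 1.0 - 2.0 * bits                        # BPSK mapping per subcarrier
rx_symbols = ofdm_demodulate(ofdm_modulate(bpsk))
recovered = (rx_symbols.real < 0).astype(int)  # noiseless: exact recovery
```

The cyclic prefix is what lets the receiver treat the multipath fading and SUI-6 channels as a simple per-subcarrier gain, which is why OFDM holds up in Tables 13 and 14.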
Table 15. Theoretical comparative security and attack resistance analysis of LSB, DWT, and hybrid LSB–DWT approaches.
Attack Type | LSB Domain | DWT Domain | Hybrid (LSB + DWT) | Remarks
Statistical steganalysis (e.g., chi-square, RS analysis) | Weak (easily detectable due to direct pixel modification) | Moderate (coefficient changes are less obvious) | Strong (spatial + frequency embedding reduces detectability) | Hybrid disperses changes across domains, lowering statistical bias.
Differential attack (frame/image differencing) | Weak (differences amplified at high LSB embedding) | Moderate (frequency domain dampens minor variations) | Strong (distribution across domains resists direct differencing) | Improved resilience by balancing modifications between LSB and DWT.
Noise attack (Gaussian, salt-and-pepper) | Weak (bit errors are highly impactful) | Moderate (frequency-domain coefficients are more robust) | Strong (error correction + redundancy improve survival) | OFDM + convolutional coding aid robustness.
Compression (JPEG/MP3) | Very weak (lossy compression destroys embedded bits) | Moderate (low-frequency subbands survive compression) | Strong (DWT selection + error correction mitigate loss) | Hybrid embeds redundantly in robust bands and LSBs.
Cropping/partial data loss | Weak (localized LSB loss destroys payload) | Moderate (global transform gives partial recovery) | Strong (redundant embedding across domains increases recovery rate) | Hybrid ensures payload survival even under partial cropping.
Cachin’s information-theoretic measure (security level) | Low (p-value deviates significantly) | Moderate (closer to uniform distribution) | High (embedding imperceptibility enhanced) | Hybrid meets stronger security criteria under Cachin’s framework.
Table 16. Comparative analysis of LSB, DWT, and hybrid LSB–DWT approaches.
Criterion | LSB (Spatial Domain) | DWT (Transform Domain) | Hybrid LSB–DWT (Proposed)
Embedding capacity | High (can embed more bits per pixel) | Moderate (limited by transform coefficients) | High to moderate (capacity enhanced by LSB, controlled by DWT to preserve quality)
Imperceptibility | Very high (changes occur in insignificant pixel bits) | High (modifications in the frequency domain are less visible) | Very high (imperceptibility preserved via balanced embedding in both domains)
Robustness | Low (fragile against compression, filtering, scaling) | High (robust against compression, noise, and filtering) | High (DWT ensures robustness; LSB provides additional redundancy for improved error tolerance)
Security | Low (easy to detect by statistical analysis) | Higher (frequency-domain embedding is harder to detect) | Higher (dual-domain embedding increases resistance against steganalysis)
Computational complexity | Low: O(n) for embedding/extraction | Higher: O(n log n) for DWT decomposition | Moderate: O(n log n), dominated by DWT, but with a lightweight LSB stage, maintaining efficiency
Space complexity | O(n) (only pixels) | O(n) (requires storing transform coefficients) | O(n) (hybrid requires storage of both pixel data and transform coefficients, still linear in input size)
Overall trade-off | High capacity, low robustness | High robustness, moderate capacity | Balanced capacity, imperceptibility, and robustness with manageable computational overhead
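To make the spatial-domain stage compared in Table 16 concrete, the sketch below embeds a byte string in the least significant bits of 16-bit audio samples and extracts it again; each sample changes by at most one quantization step, which is why LSB imperceptibility is rated very high while capacity is one bit per sample. This is an illustrative fragment only: the helper names are ours, and a full implementation of the proposed hybrid would add the DWT stage and channel coding.

```python
import numpy as np

def lsb_embed(cover, payload_bits):
    """Hide one payload bit in the least significant bit of each
    16-bit cover sample (one bit per sample)."""
    stego = cover.copy()
    stego[: payload_bits.size] &= np.int16(-2)   # mask 0xFFFE clears the LSB
    stego[: payload_bits.size] |= payload_bits.astype(np.int16)
    return stego

def lsb_extract(stego, n_bits):
    """Read the payload back from the stego samples' LSBs."""
    return (stego[:n_bits] & 1).astype(np.uint8)

rng = np.random.default_rng(seed=3)
cover = rng.integers(-2**15, 2**15, 1_000).astype(np.int16)  # mock PCM audio
secret = np.frombuffer(b"secret", dtype=np.uint8)
bits = np.unpackbits(secret)
stego = lsb_embed(cover, bits)
recovered = np.packbits(lsb_extract(stego, bits.size)).tobytes()
```

Because only the lowest bit moves, the stego signal stays perceptually identical to the cover, but any channel bit error lands directly on the payload, matching the low-robustness entry for plain LSB in Table 16.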
Table 17. Performance comparison.
Model | SNR (AWGN Channel) | Text | Secret Image | Secret Audio | Stego Audio
Ref. [17] | 10 dB | -- | -- | Cr = 0.841 | PSNR = 32.61
Model 1 | 10 dB | -- | -- | Cr = 0.9998 | PSNR = 48.9
Model 2 | 10 dB | -- | -- | Cr = 0.9771 | PSNR = 37.32
Ref. [31] | 15 dB | -- | Cr = 0.964 | -- | --
Model 1 | 15 dB | -- | Cr = 0.965 | -- | --
Model 2 | 15 dB | -- | Cr = 0.921 | -- | --
Ref. [6] | 6 dB | -- | -- | PSNR = 36.10 | --
Model 1 | 6 dB | -- | -- | PSNR = 44.9526 | --

Hamdi, A.A.; Eyssa, A.A.; Abdalla, M.I.; ElAffendi, M.; AlQahtani, A.A.S.; Ateya, A.A.; Elsayed, R.A. Improving Audio Steganography Transmission over Various Wireless Channels. J. Sens. Actuator Netw. 2025, 14, 106. https://doi.org/10.3390/jsan14060106
