Article

Enhancement of Image Reconstruction in Orthogonal Frequency-Division Multiplexing (OFDM)-Based Communication System Using Conditional Diffusion Model of Generative AI

1
Department of Electronic Convergence Engineering, Kwangwoon University, Seoul 01897, Republic of Korea
2
Research and Development Department, SMARTEVER, Co., Ltd., Seoul 01886, Republic of Korea
*
Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(6), 3210; https://doi.org/10.3390/app15063210
Submission received: 10 February 2025 / Revised: 9 March 2025 / Accepted: 13 March 2025 / Published: 15 March 2025
(This article belongs to the Special Issue IoT and AI for Wireless Communications)

Abstract

The orthogonal frequency-division multiplexing (OFDM) transmission technique is well known to be efficient for data transmission but is susceptible to performance degradation due to factors such as high-order modulation schemes, multipath fading, and noise. In this paper, an approach for reconstructing images received by the OFDM transmission technique is proposed based on generative AI. The approach exploits a conditional diffusion model (CDM) that incorporates conditional factors reflecting the degree of distortion in the received images by the OFDM technique. Additionally, it employs a method to learn the variance in the reverse process during training, considering the level of distortion associated with various modulation schemes. Through this adaptability, the proposed model is experimentally demonstrated to optimize image reconstruction performance under various modulation schemes in low-SNR environments. The proposed conditional diffusion model can enhance the PSNR of OFDM-based received images by up to 8 dB in low-SNR conditions with various modulation schemes.

1. Introduction

In wireless communication systems, orthogonal frequency-division multiplexing (OFDM) plays a crucial role in enabling efficient data transmission over multipath channels [1]. OFDM is a multicarrier modulation scheme that divides a single data stream into several parallel sub-streams, each transmitted over its subcarrier. This approach utilizes bandwidth through multiple subcarriers, thereby significantly enhancing data transmission rates.
Despite the aforementioned advantages, OFDM signals can be vulnerable to channel impairments during transmission and signal degradation when high-order modulation schemes are utilized under low-signal-to-noise-ratio (SNR) conditions [2,3]. Key challenges include distortion from high-order modulation schemes, which increases the computational complexity of signal detection and recovery, as well as channel noise and multipath fading, which can severely affect the quality of the received signal [3]. As a result of these factors, degradation in signal fidelity can lead to substantial distortion of the received images, negatively affecting image quality and interpretability. In order to address these issues that degrade signal fidelity, the development of effective reconstruction techniques may be required to reconstruct the distorted image at the receiver side.
The advent of generative artificial intelligence (AI) models has led to significant innovations across various industries, particularly in natural language processing (NLP) [4], computer vision [5], and data synthesis [6]. These advancements have transformed the way that data are generated, processed, and utilized, paving the way for more efficient and sophisticated applications [7]. Generative AI models focus on creating new data instances that resemble the training data, often from unstructured or partially structured inputs. They enable a more comprehensive exploration of data distribution, ultimately enabling the generation of high-fidelity data samples [6,7].
The existing generative AI models employ various methods to generate data, with two prominent methods being generative adversarial networks (GANs) [8] and variational autoencoders (VAEs) [9]. A GAN consists of two neural networks, namely, a generator and a discriminator. The generator aims to create data that closely resemble real data, while the discriminator tries to distinguish between actual and generated data. Through this competitive framework, GANs can produce highly realistic images. However, training GANs can be unstable and may suffer from issues like mode collapse, where the generator fails to produce diverse outputs [10,11]. In contrast, VAEs operate by learning a latent space to generate new data. They compress the input data into latent variables and then reconstruct them to create new samples. While VAEs can effectively capture complex distributions, the quality of the generated data can be limited, sometimes lacking fine detail [12].
Recently, an alternative approach known as the diffusion model, introduced by Ho et al. [13], has garnered significant attention. Unlike GANs and VAEs, the diffusion model operates through a systematic process of progressively adding noise to the data. During the training phase, this noise addition process is learned in reverse to reconstruct the original data. One of the primary strengths of the diffusion model is its enhanced robustness in generating high-dimensional data, effectively mitigating the common mode collapse issue faced by GANs. Moreover, it is known to generate more realistic and higher quality samples than VAEs.
The processes and characteristics of diffusion models can be particularly advantageous in applications such as data reconstruction in communication systems, where maintaining a high-quality data representation is crucial. Owing to their strengths in restoring data, diffusion models have been applied in various studies to improve signal fidelity and data recovery in wireless communications [14,15,16,17]. In [14], it was demonstrated that a generative diffusion model can rapidly synthesize reliable wireless channel samples from limited data. The channel denoising diffusion model (CDDM) [15] has been used to remove channel noise and improve the clarity and fidelity of the transmitted signal. The authors of [16] integrated a diffusion model into end-to-end channel coding to enhance data transmission and signal fidelity, thereby improving the robustness of communication systems against noise.
In this paper, a conditional diffusion model (CDM) is proposed for the reconstruction of OFDM-based images by leveraging the ability of diffusion models to denoise and restore images. To mitigate the complexity of OFDM-based image reconstruction, the proposed CDM incorporates a conditional factor that reflects the degree of distortion in the received signal. In contrast to existing methods in which a fixed variance (FV) is used to simplify the loss function during learning, the proposed model utilizes a learnable variance (LV). Through this process, an optimal variance can be selected based on the modulation scheme of OFDM signals, even under low-SNR conditions. The main contributions of this paper can be summarized as follows:
  • A CDM that addresses distorted OFDM-based images in wireless communication is introduced. While the existing diffusion models have only considered Gaussian noise in the diffusion process, the proposed model is designed to handle the image distortion that occurs in OFDM-based wireless communication. This adaptation allows the model to effectively account for varying levels of signal degradation, leading to improved reconstruction performance.
  • The proposed CDM dynamically learns the variance during the training process, in contrast to diffusion models that learn through a fixed variance. This flexibility enables the model to optimally adjust to different modulation schemes and diverse SNR conditions, enhancing the overall training effectiveness and the rapidity of image reconstruction.
  • A performance analysis is conducted on the proposed CDM under diverse modulation schemes and SNR conditions. From the simulation results, it is confirmed that the proposed CDM with learnable variance (CDM-LV) is effective in reconstructing OFDM-based images from received signals in highly deteriorated environments.
The remainder of this paper is organized as follows. Section 2 provides an overview of the OFDM technique and diffusion model. Section 3 details the proposed CDM, focusing on the integration of diffusion models in the context of OFDM-based image reconstruction. Section 4 presents the findings of the experiments conducted to evaluate the proposed approach. Finally, Section 5 summarizes the main findings and outlines potential directions for future research.

2. Problem Formulation

In wireless communication systems, the efficient transmission of high-quality images is one of the key applications. By utilizing subcarriers, OFDM is adopted to optimize the bandwidth and increase the data transmission rates. However, as the modulation order increases, it becomes more susceptible to inter-symbol interference and channel distortion, which can result in high error rates under challenging channel conditions. The proposed CDM is applied to reduce the error rate at high modulation orders, and its block diagram for OFDM-based image reconstruction is presented in Figure 1. Through the application of the proposed CDM, the distortion in the received image can be mitigated, producing high-quality image samples. This section provides an overview of the OFDM system and discusses the diffusion model.

2.1. OFDM-Based Communication Systems

The proposed system for OFDM-based image reconstruction is designed to operate within the conventional OFDM communication framework, where image data are transmitted as symbols modulated over multiple orthogonal subcarriers. This subsection provides an overview of the OFDM-based transmission process and the distortion of the received signal.

2.1.1. OFDM Systems for Image Transmission and Signal Processing

In the OFDM system for image transmission, the pixel values of the image data are first converted into a binary sequence, and the binary stream is then mapped onto a complex-valued in-phase and quadrature (IQ) constellation plane according to modulation schemes such as $M_1$-PSK ($M_1 = 2, 4, 8$) and $M_2$-QAM ($M_2 = 16, 32, 64$). The frequency-domain OFDM symbol $S(k)$ is transformed into the time-domain OFDM symbol $s(n)$ using an $N_{DFT}$-point inverse discrete Fourier transform (IDFT), followed by parallel-to-serial (P/S) conversion. The IDFT equation is as follows:
$$s(n) = \frac{1}{N_{DFT}} \sum_{k=0}^{N_{DFT}-1} S(k) \exp\left(\frac{j 2 \pi k n}{N_{DFT}}\right), \tag{1}$$
where $j$ is the imaginary unit, and $N_{DFT}$ is the discrete Fourier transform (DFT) length. The addition of a cyclic prefix (CP) is crucial, as it helps maintain the orthogonality of the subcarriers by preventing interference between them during transmission. The length of the CP is chosen to be larger than the expected delay spread. The OFDM symbol with a CP length of $N_{CP}$ is represented as follows:
$$s_{CP}(n) = \begin{cases} s(N_{DFT} + n), & -N_{CP} \le n \le -1 \\ s(n), & \text{otherwise}. \end{cases} \tag{2}$$
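The transmit-side steps described above (pixels to bits, constellation mapping, IDFT, CP insertion) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the helper names `pixels_to_bits`, `qpsk_map`, and `ofdm_modulate`, the Gray-style QPSK mapping, and the values `n_dft = 64` and `n_cp = 16` are all chosen for the example and do not come from the paper.

```python
import numpy as np

def pixels_to_bits(pixels: np.ndarray) -> np.ndarray:
    """Flatten 8-bit pixel values into a binary sequence (MSB first)."""
    return np.unpackbits(pixels.astype(np.uint8).ravel())

def qpsk_map(bits: np.ndarray) -> np.ndarray:
    """Map bit pairs to unit-energy QPSK symbols (the bit mapping is an assumption)."""
    b = bits.reshape(-1, 2).astype(np.float64)
    return ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)

def ofdm_modulate(symbols: np.ndarray, n_dft: int, n_cp: int) -> np.ndarray:
    """Apply an n_dft-point IDFT per OFDM symbol and prepend the cyclic prefix."""
    S = symbols.reshape(-1, n_dft)        # each row is one S(k), k = 0..n_dft-1
    s = np.fft.ifft(S, axis=1)            # s(n); np.fft.ifft includes the 1/N factor
    return np.hstack([s[:, -n_cp:], s])   # last n_cp samples copied to the front

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # dummy CIFAR-10-sized image
bits = pixels_to_bits(image)
symbols = qpsk_map(bits)
tx = ofdm_modulate(symbols, n_dft=64, n_cp=16)  # shape: (OFDM symbols, n_dft + n_cp)
```

Each row of `tx` is one CP-extended OFDM symbol; flattening the rows gives the serial sample stream that would be sent over the channel.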
The signal $s_{CP}(t)$ is transmitted to the receiver through a wireless channel. During transmission, the signal is affected by multipath fading, which results in the distortion of the received signal. The impulse response of the multipath channel is represented as $h(t)$, and the received signal $y_{CP}(t)$ at the receiver is expressed as follows:
$$h(t) = \sum_{i=0}^{M-1} A_i \, \delta(t - \tau_i), \tag{3}$$
$$y_{CP}(t) = s_{CP}(t) * h(t) + w(t), \tag{4}$$
where $M$, $A_i$, $\tau_i$, $*$, and $w(t)$ represent the number of multipath components, the attenuation coefficient, the time delay, the convolution operator, and additive white Gaussian noise (AWGN), respectively. At the receiver end, the received signal $y_{CP}(n)$ is processed through CP removal, DFT, parallel-to-serial conversion, and demodulation to obtain the received image $y$.
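The channel and receiver steps can be sketched in discrete time as below. Several things here are assumptions made for illustration: the path delays are taken to be integer sample delays, the tap values in `taps` are arbitrary, the SNR is defined relative to the channel output power, and a one-tap frequency-domain equalizer with a perfectly known channel is used, which the text above does not specify.

```python
import numpy as np

def multipath_awgn(s_cp, taps, snr_db, rng):
    """Discrete-time y = s * h + w for a serial sample stream.

    taps[i] plays the role of A_i at a delay of i samples (assumed discretization).
    """
    y = np.convolve(s_cp, taps)[: len(s_cp)]
    noise_power = np.mean(np.abs(y) ** 2) / 10 ** (snr_db / 10)
    w = np.sqrt(noise_power / 2) * (
        rng.standard_normal(len(y)) + 1j * rng.standard_normal(len(y))
    )
    return y + w

def ofdm_demodulate(y, n_dft, n_cp, taps):
    """CP removal, DFT, and one-tap equalization with a known channel (assumption)."""
    blocks = y.reshape(-1, n_dft + n_cp)[:, n_cp:]  # drop the cyclic prefix
    Y = np.fft.fft(blocks, axis=1)                  # back to the frequency domain
    H = np.fft.fft(taps, n_dft)                     # channel frequency response
    return Y / H

rng = np.random.default_rng(1)
n_dft, n_cp = 64, 16
S = (rng.choice([-1.0, 1.0], (4, n_dft)) + 1j * rng.choice([-1.0, 1.0], (4, n_dft))) / np.sqrt(2)
s = np.fft.ifft(S, axis=1)
s_cp = np.hstack([s[:, -n_cp:], s]).ravel()   # serial stream of CP-extended symbols
taps = np.array([1.0, 0.5, 0.2])              # delay spread shorter than the CP
S_clean = ofdm_demodulate(multipath_awgn(s_cp, taps, 200.0, rng), n_dft, n_cp, taps)
S_noisy = ofdm_demodulate(multipath_awgn(s_cp, taps, 5.0, rng), n_dft, n_cp, taps)
```

Because the delay spread is shorter than the CP, the linear convolution acts as a circular one on each symbol after CP removal, so the nearly noise-free case recovers `S`, while the 5 dB case shows the symbol errors that later appear as image distortion.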

2.1.2. Distortion of the Received Signal

Despite the inherent advantages of OFDM, the received signal can be vulnerable to various forms of signal degradation caused by the channel conditions. Signal distortions resulting from additive noise, multipath fading, higher-order modulation schemes, etc., can significantly compromise transmission integrity [3]. High-order modulation schemes require higher levels of received signal-to-noise and interference ratio (SNIR) to correctly demodulate the transmission signals due to their sensitivity to noise, non-linearities, distortion, and interference. If not adequately addressed, these factors can lead to significant performance degradation, especially in OFDM-based image communication systems.
Common sources of distortion include environmental factors such as changes in the physical landscape, the presence of obstacles that obstruct the signal paths, and variations in the channel conditions over time. Multipath propagation, where signals arrive at the receiver through multiple paths with different delays, can cause inter-symbol interference (ISI), further complicating the recovery of the transmitted data. As a result, effective strategies to mitigate these distortions are crucial for maintaining the integrity and reliability of OFDM signals.
A comparative analysis of processed images across various modulation schemes and SNR levels for the CIFAR-10 image dataset is presented in Figure 2. The rows correspond to different modulation schemes, i.e., $M_1$-PSK and $M_2$-QAM. The columns display the original image alongside images with distorted received signals at SNR levels of 0 dB, 5 dB, 10 dB, and 15 dB. Notably, under the same SNR condition, the distortion of the images increases as the order of the modulation scheme increases. Higher-order modulation techniques, such as 32QAM and 64QAM, respond more sensitively to noise, leading to a more pronounced degradation in image quality compared to lower-order modulations such as BPSK and QPSK. These results indicate the trade-off between modulation order and the ensuing distortion.
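One way to see why higher-order constellations are more sensitive to noise is to compare the minimum distance between constellation points at equal average symbol energy. The sketch below does this for square QAM only; the function names are hypothetical, and 32QAM is left out because it is a cross-shaped rather than square constellation.

```python
import numpy as np

def square_qam(m: int) -> np.ndarray:
    """Square M-QAM constellation normalized to unit average symbol energy."""
    side = int(round(np.sqrt(m)))
    if side * side != m:
        raise ValueError("only square constellations (4, 16, 64, ...) are handled")
    levels = np.arange(-(side - 1), side, 2, dtype=np.float64)
    points = (levels[:, None] + 1j * levels[None, :]).ravel()
    return points / np.sqrt(np.mean(np.abs(points) ** 2))

def min_distance(points: np.ndarray) -> float:
    """Smallest Euclidean distance between two distinct constellation points."""
    d = np.abs(points[:, None] - points[None, :])
    return float(d[d > 0].min())

d_min = {m: min_distance(square_qam(m)) for m in (4, 16, 64)}
```

The minimum distance shrinks as the order grows, so the same noise power pushes more received symbols across decision boundaries.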

2.1.3. Diffusion Model for Image Reconstruction in OFDM

As illustrated in Figure 3, generative AI models have attracted considerable attention in the field of communication owing to their capacity to model nonlinear functions and adapt to challenging channel conditions [18,19]. The increased demand for high-dimensional data transmission necessitates the implementation of high-order modulation schemes. Nevertheless, in the context of image transmission utilizing OFDM, the presence of noise and channel impairments degrades the fidelity of the transmitted images. As the modulation order increases, distortions become more apparent, thereby impairing the quality of the transmitted images.
Generative models, including GANs and VAEs, have been employed for image restoration. Nevertheless, these models often face difficulties when dealing with the complex distortions introduced by wireless channels. To tackle these issues, the CDM has emerged as a viable option for reconstructing distorted images in OFDM-based image communication. In the CDM, the distortions resulting from multipath fading and interference within OFDM channels are modeled as a forward process. The diffusion model is designed to learn the reverse process by systematically eliminating noise [13].
During CDM training [13], noise is steadily inserted into the data until the signal itself is entirely distorted. Then, the model is trained to systematically eliminate the noise, thereby reconstructing a signal that closely approximates the original data. In contrast to VAEs, which operate under the assumption of solely Gaussian noise, CDMs adeptly manage intricate distortions, including multipath fading. Furthermore, by calibrating the model based on parameters such as SNR and modulation order, CDMs can more reliably adjust to diverse channel distortions, leading to enhanced image reconstruction.

2.2. Diffusion Probabilistic Model

The diffusion model comprises a diffusion process that perturbs the data with noise and a reverse process that transforms the noisy data generated by the diffusion process back into the original data. This framework is modeled as a Markov chain. The former is typically designed to transform the data distribution into a standard Gaussian, while the latter learns a transition kernel parameterized by a deep neural network to invert the former process. Given the distribution of ground-truth data $x_0 \sim q(x_0)$, the diffusion process generates a sequence of random variables $x_1, x_2, \dots, x_T$ with a transition kernel $q(x_t \mid x_{t-1})$. By utilizing the chain rule of probabilities along with the Markov property, the joint distribution of $x_1, x_2, \dots, x_T$ conditioned on $x_0$ can be represented as $q(x_1, \dots, x_T \mid x_0)$, which can be expressed as follows:
$$q(x_1, \dots, x_T \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1}). \tag{5}$$
In diffusion models, the transition kernel is explicitly designed to gradually transform the data distribution $q(x_0)$ into a tractable prior distribution. The typical design of the transition kernel involves Gaussian perturbation, which is given below:
$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t; \sqrt{1 - \beta_t}\, x_{t-1}, \beta_t \mathbf{I}\right), \tag{6}$$
where $\{\beta_t \in (0, 1)\}_{t=1}^{T}$ is a variance schedule that controls the step size of the diffusion process. Due to the diffusion process, the data sample $x_0$ gradually loses its characteristics as $t$ increases, and ultimately, as $T$ goes to infinity, $x_T$ converges to an isotropic Gaussian distribution.
As mentioned in [13], by iteratively applying the reparameterization trick, it is possible to sample any step of noise conditioned directly on the input $x_0$ through the diffusion process defined in Equation (6), which gives:
$$q(x_t \mid x_0) = \mathcal{N}\left(x_t; \sqrt{\bar{\alpha}_t}\, x_0, (1 - \bar{\alpha}_t) \mathbf{I}\right), \tag{7}$$
where $\alpha_t = 1 - \beta_t$ and $\bar{\alpha}_t = \prod_{i=1}^{t} \alpha_i$. According to Equation (7), given $x_0$, the sample $x_t$ can be easily obtained by sampling a Gaussian vector $\epsilon \sim \mathcal{N}(0, \mathbf{I})$:
$$x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon. \tag{8}$$
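The closed-form sampling of $x_t$ from $x_0$ can be sketched as follows. The endpoints of the schedule (from $10^{-4}$ to 0.035 over 50 steps) follow the simulation setup reported later in the paper; the linear spacing between them and the name `q_sample` are assumptions made for this example.

```python
import numpy as np

T = 50
betas = np.linspace(1e-4, 0.035, T)   # linear spacing is an assumption
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)        # alpha_bar[t-1] holds the cumulative product up to t

def q_sample(x0, t, eps):
    """Draw x_t from q(x_t | x_0) in one shot; t is 1-based."""
    a = alpha_bar[t - 1]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(32, 32, 3))
eps = rng.standard_normal(x0.shape)
x_mid, x_last = q_sample(x0, 25, eps), q_sample(x0, T, eps)
```

With only 50 steps and this range of $\beta_t$, the cumulative product at the last step remains well above zero, so `x_last` still carries a visible trace of `x0` rather than being pure Gaussian noise.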
In order to generate realistic samples from Gaussian noise $x_T$, the diffusion model first draws an unstructured noise vector from a prior distribution, which is typically simple to sample from, and then executes a learned reverse process to gradually remove the noise. Specifically, the reverse process is parameterized by the prior distribution $p(x_T) = \mathcal{N}(x_T; 0, \mathbf{I})$ and a learnable transition kernel $p_\theta(x_{t-1} \mid x_t)$. This learnable transition kernel takes the following form:
$$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\left(x_{t-1}; \mu_\theta(x_t, t), \tilde{\beta}_t \mathbf{I}\right), \tag{9}$$
$$\mu_\theta = \frac{1}{\sqrt{\alpha_t}} \left( x_t - \frac{1 - \alpha_t}{\sqrt{1 - \bar{\alpha}_t}}\, \epsilon_\theta(x_t, t) \right), \tag{10}$$
$$\tilde{\beta}_t = \frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t}\, \beta_t, \tag{11}$$
where $\theta$ represents the model parameters, the mean $\mu_\theta$ is parameterized by a deep neural network such as U-Net, $\tilde{\beta}_t$ is the variance of the reverse process, and $\epsilon_\theta(x_t, t)$ denotes the estimated Gaussian noise in $x_t$. This reverse process generates the data sample $\hat{x}_0$ by first sampling the noise vector $x_T \sim p(x_T)$. Subsequently, the learnable transition kernel $p_\theta(x_{t-1} \mid x_t)$ is repeatedly sampled until $t = 1$ to generate the data sample $\hat{x}_0$.
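A single reverse step can be sketched as below. The noise prediction is passed in as an array (`eps_pred`), standing in for the output of a trained network such as a U-Net; the function name `p_sample_step` and the linear schedule are assumptions for illustration, and $\bar{\alpha}_0 = 1$ is used so that the variance at $t = 1$ is zero.

```python
import numpy as np

T = 50
betas = np.linspace(1e-4, 0.035, T)                        # assumed linear schedule
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)
alpha_bar_prev = np.concatenate([[1.0], alpha_bar[:-1]])   # cumulative product at t-1, with value 1 at t = 0
beta_tilde = (1.0 - alpha_bar_prev) / (1.0 - alpha_bar) * betas

def p_sample_step(x_t, t, eps_pred, rng):
    """One reverse step x_t -> x_{t-1}; t is 1-based, eps_pred stands in for eps_theta(x_t, t)."""
    i = t - 1
    mean = (x_t - (1.0 - alphas[i]) / np.sqrt(1.0 - alpha_bar[i]) * eps_pred) / np.sqrt(alphas[i])
    return mean + np.sqrt(beta_tilde[i]) * rng.standard_normal(x_t.shape)

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))
eps = rng.standard_normal(x0.shape)
x1 = np.sqrt(alpha_bar[0]) * x0 + np.sqrt(1.0 - alpha_bar[0]) * eps
x0_rec = p_sample_step(x1, 1, eps, rng)   # the exact noise is supplied, so x_0 is recovered
```

Feeding the exact noise at $t = 1$ returns `x0`, which is a quick sanity check on the mean formula; in practice the quality of the step depends entirely on how well the network estimates the noise.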
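The loss function for neural network training is derived through the variational lower bound of $q(x_1, \dots, x_T \mid x_0)$ and $p_\theta(x_1, \dots, x_T \mid x_0)$ as follows [20]:
$$L_{VLB} = L_0 + L_1 + \dots + L_{T-1} + L_T, \tag{12}$$
$$L_0 = -\log p_\theta(x_0 \mid x_1), \tag{13}$$
$$\sum_{t=1}^{T-1} L_t = \sum_{t=2}^{T} D_{KL}\left( q(x_{t-1} \mid x_t, x_0) \,\|\, p_\theta(x_{t-1} \mid x_t) \right), \tag{14}$$
$$L_T = D_{KL}\left( q(x_T \mid x_0) \,\|\, p(x_T) \right), \tag{15}$$
where $L_0$ and $L_T$ are computed in closed form, while the intermediate terms $L_t$ are trained to minimize the difference between the mean $\mu_t = \frac{1}{\sqrt{\alpha_t}} \left( x_t - \frac{1 - \alpha_t}{\sqrt{1 - \bar{\alpha}_t}}\, \epsilon_t(x_t, t) \right)$ of $q(x_{t-1} \mid x_t, x_0)$ and the mean $\mu_\theta$ of $p_\theta(x_{t-1} \mid x_t)$. According to [13], $L_t$ is simplified as follows:
$$L_{simple} = \mathbb{E}_{t, x_0, \epsilon} \left[ \left\| \epsilon - \epsilon_\theta(x_t, t) \right\|^2 \right]. \tag{16}$$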
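The simplified loss can be estimated on a batch as sketched below. The function `l_simple`, the two stand-in predictors, and the per-element averaging (instead of a summed squared norm) are assumptions for this example; `oracle_model` cheats by using the clean batch and exists only to show that a perfect noise estimate drives the loss to zero.

```python
import numpy as np

T = 50
betas = np.linspace(1e-4, 0.035, T)   # assumed linear schedule
alpha_bar = np.cumprod(1.0 - betas)

def l_simple(x0_batch, eps_model, rng):
    """Monte Carlo estimate of L_simple over a batch (mean squared error per element)."""
    n = x0_batch.shape[0]
    t = rng.integers(1, T + 1, size=n)                 # t drawn uniformly from 1..T
    eps = rng.standard_normal(x0_batch.shape)
    a = alpha_bar[t - 1].reshape(n, *([1] * (x0_batch.ndim - 1)))
    x_t = np.sqrt(a) * x0_batch + np.sqrt(1.0 - a) * eps
    return float(np.mean((eps - eps_model(x_t, t)) ** 2))

rng = np.random.default_rng(0)
x0_batch = rng.uniform(-1.0, 1.0, size=(256, 8, 8))

def zero_model(x_t, t):
    """Predicts no noise at all."""
    return np.zeros_like(x_t)

def oracle_model(x_t, t):
    """Recovers the exact noise from x_t using the known clean batch."""
    a = alpha_bar[t - 1].reshape(-1, 1, 1)
    return (x_t - np.sqrt(a) * x0_batch) / np.sqrt(1.0 - a)

loss_zero = l_simple(x0_batch, zero_model, rng)
loss_oracle = l_simple(x0_batch, oracle_model, rng)
```

A predictor that outputs zeros gives a loss close to 1 (the variance of the injected noise), which is the baseline a trained network has to beat.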

3. Conditional Diffusion Model for OFDM-Based Image Reconstruction

In this section, the proposed CDM for the reconstruction of distorted images in OFDM-based wireless communications is introduced. As the modulation schemes increase in complexity, the received signals become more sensitive to channel conditions and noise. The proposed CDM was designed to integrate factors that contribute to the distortion of the received signals, based on the ideas presented with regard to the conditional diffusion model [21]. These factors include the modulation schemes, the SNR conditions, and multipath fading. Furthermore, both the diffusion process and the reverse process were designed to find the optimal variance for each transmission environment. OFDM-based image reconstruction using the proposed CDM is depicted in Figure 4. The CDM utilizes an interpolation parameter, $v_t$, to design a diffusion process to learn the distortions that occur in wireless communications.
The CDM is trained using pairs of $x_0$ and $y$ from a training dataset. This training step enables the CDM to learn the relationship between the original and the distorted images. After training, the CDM is used for reconstruction. A distorted image $y$, which could be any arbitrarily distorted image received through OFDM transmission, is input into the trained CDM.

3.1. Conditional Diffusion and Reverse Processes

In order to reconstruct OFDM-based images, the proposed CDM modifies the mean of the Gaussian noise addition step from the existing diffusion process defined in Equation (7) by introducing the interpolation parameter $v_t$. It was assumed that $v_t$ starts with $v_0 \approx 0$ at $t = 0$ and increases by gradual steps to reach $v_T \approx 1$ at $t = T$, so that the mean of $x_t$ changes from the original image data $x_0$ to the distorted OFDM-based image $y$, as shown in Figure 4. This parameter integrates both the original image data $x_0$ and the distorted received signal $y$ into the conditional diffusion process as follows:
$$q(x_t \mid x_0, y) = \mathcal{N}\left(x_t; (1 - v_t) \sqrt{\bar{\alpha}_t}\, x_0 + v_t \sqrt{\bar{\alpha}_t}\, y, \; \sigma_t \mathbf{I}\right), \tag{17}$$
where $\sigma_t$ denotes the variance of the conditional diffusion process, which is set in such a manner as to generalize the variance of the conventional diffusion process as follows:
$$\sigma_t = (1 - \bar{\alpha}_t) - v_t^2\, \bar{\alpha}_t. \tag{18}$$
The probability distribution of $q(x_t \mid x_{t-1}, y)$ can be derived from the relationship between $x_t$ and $x_{t-1}$, after defining $q(x_t \mid x_0, y)$ and $q(x_{t-1} \mid x_0, y)$ based on Equation (17) as follows:
$$q(x_t \mid x_{t-1}, y) = \mathcal{N}\left(x_t; \frac{1 - v_t}{1 - v_{t-1}} \sqrt{\alpha_t}\, x_{t-1} + \left( v_t - \frac{1 - v_t}{1 - v_{t-1}}\, v_{t-1} \right) \sqrt{\bar{\alpha}_t}\, y, \; \sigma_{t|t-1} \mathbf{I}\right), \tag{19}$$
$$\sigma_{t|t-1} = \sigma_t - \left( \frac{1 - v_t}{1 - v_{t-1}} \right)^2 \alpha_t\, \sigma_{t-1}. \tag{20}$$
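The quantities of the conditional diffusion process can be computed numerically as in the sketch below. Two choices are assumptions made here: the linear spacing of $\beta_t$, and the interpolation schedule `v`, which is taken to be the square root of $(1 - \bar{\alpha}_t)/\sqrt{\bar{\alpha}_t}$ because that form keeps every variance positive and gives a value close to 1 at the last step. Index 0 of each array corresponds to $t = 0$.

```python
import numpy as np

T = 50
betas = np.linspace(1e-4, 0.035, T)                   # assumed linear spacing
alpha = np.concatenate([[1.0], 1.0 - betas])          # alpha[0] is padding so that t is 1-based
alpha_bar = np.cumprod(alpha)                         # alpha_bar[0] = 1
v = np.sqrt((1.0 - alpha_bar) / np.sqrt(alpha_bar))   # assumed interpolation schedule, v[0] = 0
sigma = (1.0 - alpha_bar) - v**2 * alpha_bar          # variance of q(x_t | x_0, y)

ratio = (1.0 - v[1:]) / (1.0 - v[:-1])                # (1 - v_t) / (1 - v_{t-1}) for t = 1..T
sigma_cond = sigma[1:] - ratio**2 * alpha[1:] * sigma[:-1]   # variance of q(x_t | x_{t-1}, y)

def q_sample_cond(x0, y, t, eps):
    """Draw x_t from q(x_t | x_0, y): the mean slides from x_0 toward y as t grows."""
    mean = (1.0 - v[t]) * np.sqrt(alpha_bar[t]) * x0 + v[t] * np.sqrt(alpha_bar[t]) * y
    return mean + np.sqrt(sigma[t]) * eps

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))
y = x0 + 0.3 * rng.standard_normal(x0.shape)          # stand-in for a distorted image
x_T = q_sample_cond(x0, y, T, rng.standard_normal(x0.shape))
```

Setting `v` to zero recovers the conventional variance $1 - \bar{\alpha}_t$, which is the sense in which the conditional variance generalizes the unconditional one.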
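In the reverse process of the CDM, the conditional denoising process is carried out based on the distorted images resulting from the conditional diffusion process. Based on Equations (17) and (19), $q(x_{t-1} \mid x_t, x_0, y)$ can be derived through Bayes’ theorem and the Markov chain property as follows [21]:
$$q(x_{t-1} \mid x_t, x_0, y) = \mathcal{N}\left(x_{t-1}; \frac{1 - v_t}{1 - v_{t-1}} \frac{\sigma_{t-1}}{\sigma_t} \sqrt{\alpha_t}\, x_t + (1 - v_{t-1}) \frac{\sigma_{t|t-1}}{\sigma_t} \sqrt{\bar{\alpha}_{t-1}}\, x_0 + \left( v_{t-1} \sigma_t - \frac{v_t (1 - v_t)}{1 - v_{t-1}}\, \alpha_t \sigma_{t-1} \right) \frac{\sqrt{\bar{\alpha}_{t-1}}}{\sigma_t}\, y, \; \tilde{\sigma}_t \mathbf{I}\right), \tag{21}$$
$$\tilde{\sigma}_t = \frac{\sigma_{t|t-1}\, \sigma_{t-1}}{\sigma_t}. \tag{22}$$
In order to predict $x_{t-1}$ from $x_t$ and $y$ in the conditional reverse process, a parameterized distribution that approximates $q(x_{t-1} \mid x_t, y)$ is required, which can be defined as:
$$p_\theta(x_{t-1} \mid x_t, y) = \mathcal{N}\left(x_{t-1}; \mu_\theta(x_t, y, t), \; \tilde{\sigma}_t \mathbf{I}\right), \tag{23}$$
$$\mu_\theta(x_t, y, t) = c_x x_t + c_y y - c_\epsilon\, \epsilon_\theta(x_t, y, t), \tag{24}$$
where $\mu_\theta(x_t, y, t)$ represents the mean of the probability distribution used to predict $x_{t-1}$ during the reverse process, and $\epsilon_\theta(x_t, y, t)$ denotes the estimated noise vector at time step $t$, conditioned on the current noisy data $x_t$ and the distorted image $y$. The coefficients $c_x$, $c_y$, and $c_\epsilon$ of $\mu_\theta(x_t, y, t)$ can be derived by combining Equations (17) and (21) as follows:
$$c_x = \frac{1 - v_t}{1 - v_{t-1}} \frac{\sigma_{t-1}}{\sigma_t} \sqrt{\alpha_t} + (1 - v_{t-1}) \frac{\sigma_{t|t-1}}{\sigma_t} \frac{1}{\sqrt{\alpha_t}}, \tag{25}$$
$$c_y = \left( v_{t-1} \sigma_t - \frac{v_t (1 - v_t)}{1 - v_{t-1}}\, \alpha_t \sigma_{t-1} \right) \frac{\sqrt{\bar{\alpha}_{t-1}}}{\sigma_t}, \tag{26}$$
$$c_\epsilon = (1 - v_{t-1}) \frac{\sigma_{t|t-1}}{\sigma_t} \frac{\sqrt{1 - \bar{\alpha}_t}}{\sqrt{\alpha_t}}. \tag{27}$$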
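One way to see where these coefficients come from is to substitute an estimate of $x_0$ into the posterior mean. The sketch below assumes that the network output $\epsilon_\theta$ estimates the combined noise term targeted by $L_{CDM\_simple}$; it is an outline of the substitution, not a reproduction of the authors' derivation. The symbols $\epsilon^{*}$, $A_t$, $B_t$, and $C_t$ are shorthand introduced here, with $B_t$ being the $x_0$ coefficient implied by the second term of $c_x$.

```latex
% Combined noise seen by the network (\epsilon^{*} is shorthand introduced here)
x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon^{*},
\qquad
\epsilon^{*} = \frac{v_t\sqrt{\bar{\alpha}_t}}{\sqrt{1-\bar{\alpha}_t}}\,(y - x_0)
             + \frac{\sqrt{\sigma_t}}{\sqrt{1-\bar{\alpha}_t}}\,\epsilon .

% Posterior mean written as A_t x_t + B_t x_0 + C_t y, with
B_t = (1 - v_{t-1})\,\frac{\sigma_{t|t-1}}{\sigma_t}\,\sqrt{\bar{\alpha}_{t-1}} .

% Replace x_0 by (x_t - \sqrt{1-\bar{\alpha}_t}\,\epsilon_\theta)/\sqrt{\bar{\alpha}_t}
\mu_\theta = \Bigl(A_t + \tfrac{B_t}{\sqrt{\bar{\alpha}_t}}\Bigr)\,x_t + C_t\,y
           - B_t\,\tfrac{\sqrt{1-\bar{\alpha}_t}}{\sqrt{\bar{\alpha}_t}}\,\epsilon_\theta ,
\qquad
\frac{\sqrt{\bar{\alpha}_{t-1}}}{\sqrt{\bar{\alpha}_t}} = \frac{1}{\sqrt{\alpha_t}} .
```

Reading off the three terms gives $c_x = A_t + B_t/\sqrt{\bar{\alpha}_t}$, $c_y = C_t$, and $c_\epsilon = B_t \sqrt{1-\bar{\alpha}_t}/\sqrt{\bar{\alpha}_t}$, which agree with the coefficient expressions above once the last identity is applied.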

3.2. Learnable Variance in the Conditional Diffusion Model

According to the authors of [13], when designing the loss function in diffusion models, it is preferable to fix $\Sigma_\theta(x_t, t)$ as $\beta_t$ or $\tilde{\beta}_t$ to derive $L_{simple}$, rather than treating it as a learnable variance (LV), as this approach demonstrates better sampling performance. However, the sampling performance seems to be affected by $L_{VLB}$ depending on the adjustment of $\Sigma_\theta(x_t, t)$ when the number of diffusion time steps is small [22].
In wireless communications, it is important to accurately reconstruct distorted signals, but fast data processing is also critical. Therefore, the proposed CDM with LV (CDM-LV) optimizes the selection of variance based on the modulation scheme and SNR conditions, while also adjusting $\Sigma_\theta(x_t, y, t)$ through learning to achieve high-quality sampling with a small number of diffusion time steps. In the proposed CDM, the learnable $\Sigma_\theta(x_t, y, t)$ was considered in the design of the loss function $L_{CDM}$, which is given by the following equations:
$$\Sigma_\theta(x_t, y, t) = \exp\left( \gamma \log \sigma_t + (1 - \gamma) \log \tilde{\sigma}_t \right), \tag{28}$$
$$L_{CDM} = L_{CDM\_simple} + \lambda L_{CDM\_VLB}, \tag{29}$$
$$L_{CDM\_simple} = \left\| \frac{v_t \sqrt{\bar{\alpha}_t}}{\sqrt{1 - \bar{\alpha}_t}} (y - x_0) + \frac{\sqrt{\sigma_t}}{\sqrt{1 - \bar{\alpha}_t}}\, \epsilon - \epsilon_\theta(x_t, y, t) \right\|^2, \tag{30}$$
$$L_{CDM\_VLB} = \sum_{t=2}^{T} D_{KL}\left( q(x_{t-1} \mid x_t, x_0, y) \,\|\, p_\theta(x_{t-1} \mid x_t, y) \right), \tag{31}$$
where $\lambda$ denotes the weight of $L_{CDM\_VLB}$. A summary of the training and sampling processes is presented in Algorithm 1 and Algorithm 2, respectively.
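The variance interpolation and the weighted loss can be sketched as follows. Here `gamma` is treated as a given scalar in [0, 1]; how the network produces it is not specified in this section. The numeric values of `sigma_t` and `sigma_tilde_t` are illustrative only, the closed-form KL divergence between isotropic Gaussians is the standard expression, and the default weight of $10^{-3}$ follows the value reported in the simulation setup.

```python
import numpy as np

def interpolated_variance(gamma, sigma_t, sigma_tilde_t):
    """exp(gamma*log(sigma_t) + (1-gamma)*log(sigma_tilde_t)): a geometric interpolation."""
    return np.exp(gamma * np.log(sigma_t) + (1.0 - gamma) * np.log(sigma_tilde_t))

def kl_isotropic_gaussians(mu_q, var_q, mu_p, var_p):
    """KL( N(mu_q, var_q I) || N(mu_p, var_p I) ), summed over dimensions."""
    d = mu_q.size
    return 0.5 * (d * (np.log(var_p / var_q) + var_q / var_p - 1.0)
                  + np.sum((mu_q - mu_p) ** 2) / var_p)

def total_loss(l_simple, l_vlb, lam=1e-3):
    """Weighted sum of the simple term and the variational term."""
    return l_simple + lam * l_vlb

sigma_t, sigma_tilde_t = 0.2, 0.1   # illustrative values only
var_mid = interpolated_variance(0.5, sigma_t, sigma_tilde_t)
mu = np.zeros(4)
kl_fixed = kl_isotropic_gaussians(mu, sigma_tilde_t, mu, sigma_tilde_t)
kl_mid = kl_isotropic_gaussians(mu, sigma_tilde_t, mu, var_mid)
```

With `gamma = 1` the variance equals the diffusion-process variance, with `gamma = 0` it equals the reverse-process variance, and values in between trace a geometric path between the two. If the reverse-process variance is zero at the first step, the logarithm is undefined there and that step needs separate handling.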
Algorithm 1: Training algorithm in the CDM-LV
1: for $i = 1, 2, \dots, N_{iter}$ do
2:   Sample $(x_0, y) \sim q_{data}$, $\epsilon \sim \mathcal{N}(0, \mathbf{I})$
3:   $t \sim \mathrm{Uniform}(\{1, \dots, T\})$
4:   $x_t = (1 - v_t) \sqrt{\bar{\alpha}_t}\, x_0 + v_t \sqrt{\bar{\alpha}_t}\, y + \sqrt{\sigma_t}\, \epsilon$
5:   Compute $\Sigma_\theta(x_t, y, t) = \exp\left( \gamma \log \sigma_t + (1 - \gamma) \log \tilde{\sigma}_t \right)$
6:   Compute $L_{CDM} = L_{CDM\_simple} + \lambda L_{CDM\_VLB}$
7: end for
Algorithm 2: Sampling algorithm in the CDM-LV
1: Sample $x_T \sim \mathcal{N}(x_T; \sqrt{\bar{\alpha}_T}\, y, \sigma_T \mathbf{I})$
2: for $t = T, T - 1, \dots, 1$ do
3:   Compute $c_x$, $c_y$, $c_\epsilon$ using Equations (25)–(27)
4:   Sample $x_{t-1} \sim p_\theta(x_{t-1} \mid x_t, y) = \mathcal{N}(x_{t-1}; \mu_\theta(x_t, y, t), \Sigma_\theta(x_t, y, t))$
5: end for
6: return $x_0$
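The two algorithms can be exercised end to end with the sketch below. It is not the authors' implementation: no network is trained, the fixed reverse variance is used in place of the learned one, and the linear $\beta_t$ spacing and the form of `v` are assumptions. The `oracle` predictor knows the clean image and is included only to check that the sampling loop and the regression target are algebraically consistent.

```python
import numpy as np

T = 50
betas = np.linspace(1e-4, 0.035, T)            # assumed linear spacing
alpha = np.concatenate([[1.0], 1.0 - betas])   # index 0 is padding; t runs 1..T
abar = np.cumprod(alpha)
v = np.sqrt((1.0 - abar) / np.sqrt(abar))      # assumed form of v_t
sigma = (1.0 - abar) - v**2 * abar

def step_terms(t):
    """Return c_x, c_y, c_eps and the fixed reverse variance for step t."""
    r = (1.0 - v[t]) / (1.0 - v[t - 1])
    s_cond = sigma[t] - r**2 * alpha[t] * sigma[t - 1]
    b = (1.0 - v[t - 1]) * s_cond / sigma[t]
    c_x = r * sigma[t - 1] / sigma[t] * np.sqrt(alpha[t]) + b / np.sqrt(alpha[t])
    c_y = (v[t - 1] * sigma[t] - v[t] * r * alpha[t] * sigma[t - 1]) * np.sqrt(abar[t - 1]) / sigma[t]
    c_eps = b * np.sqrt(1.0 - abar[t]) / np.sqrt(alpha[t])
    return c_x, c_y, c_eps, s_cond * sigma[t - 1] / sigma[t]

def training_pair(x0, y, t, eps):
    """Algorithm 1, step 4, plus the regression target inside L_CDM_simple."""
    x_t = (1.0 - v[t]) * np.sqrt(abar[t]) * x0 + v[t] * np.sqrt(abar[t]) * y + np.sqrt(sigma[t]) * eps
    target = (v[t] * np.sqrt(abar[t]) * (y - x0) + np.sqrt(sigma[t]) * eps) / np.sqrt(1.0 - abar[t])
    return x_t, target

def sample(y, eps_model, rng):
    """Algorithm 2 with the fixed variance in place of the learned Sigma_theta."""
    x = np.sqrt(abar[T]) * y + np.sqrt(sigma[T]) * rng.standard_normal(y.shape)
    for t in range(T, 0, -1):
        c_x, c_y, c_eps, var = step_terms(t)
        mean = c_x * x + c_y * y - c_eps * eps_model(x, y, t)
        x = mean + np.sqrt(var) * rng.standard_normal(y.shape)
    return x

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))
y = x0 + 0.3 * rng.standard_normal(x0.shape)   # stand-in for a distorted image

def oracle(x_t, y_, t):
    """Exact combined noise, computed from the known x0 (checking aid only)."""
    return (x_t - np.sqrt(abar[t]) * x0) / np.sqrt(1.0 - abar[t])

x_t, target = training_pair(x0, y, 25, rng.standard_normal(x0.shape))
x0_hat = sample(y, oracle, rng)
```

With the oracle, the final step has zero variance and its mean collapses to `x0`, so the loop returns the clean image; a trained network would replace `oracle` and the returned image would only approximate `x0`.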

3.3. Performance Metrics

In this paper, the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM) were used to evaluate the performance of the OFDM-based image reconstruction. The PSNR is a widely used metric for quantifying the quality of reconstructed images compared to their original versions. It measures the ratio between the maximum possible power of a signal and the power of corrupting noise that affects its representation. The PSNR is expressed in dB and can be calculated using the following formula:
$$PSNR = 10 \log_{10} \left( \frac{MAX_I^2}{MSE} \right), \tag{32}$$
where $MAX_I$ is the maximum possible pixel value, and $MSE$ is the mean squared error between the original and the reconstructed images. A higher PSNR value indicates a better quality of reconstruction. The SSIM is a more sophisticated metric that assesses the visual quality of images based on structural information. Unlike the PSNR, the SSIM considers changes in texture, contrast, and structural patterns. The SSIM index ranges from 0 to 1, where 1 indicates perfect structural similarity. It is defined as:
$$SSIM(x_0, \hat{x}_0) = \frac{(2 \varphi_{x_0} \varphi_{\hat{x}_0} + C_1)(2 \psi_{x_0 \hat{x}_0} + C_2)}{(\varphi_{x_0}^2 + \varphi_{\hat{x}_0}^2 + C_1)(\psi_{x_0}^2 + \psi_{\hat{x}_0}^2 + C_2)}, \tag{33}$$
where $\varphi$, $\psi^2$, and $\psi_{x_0 \hat{x}_0}$ denote the average pixel value, the variance, and the covariance, respectively. The constants $C_1$ and $C_2$ are small values added to stabilize the division.
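Both metrics can be computed directly from their definitions, as sketched below. The SSIM here uses statistics taken over the whole image, matching the formula as written, whereas common library implementations average the index over local windows. The constants $(0.01 \cdot MAX_I)^2$ and $(0.03 \cdot MAX_I)^2$ are conventional defaults and an assumption here, since the text only says they are small.

```python
import numpy as np

def psnr(x0, x_hat, max_val=255.0):
    """Peak signal-to-noise ratio in dB; returns inf for identical images."""
    mse = np.mean((x0.astype(np.float64) - x_hat.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(max_val**2 / mse))

def ssim_global(x0, x_hat, max_val=255.0):
    """SSIM from whole-image mean, variance, and covariance (no sliding window)."""
    x = x0.astype(np.float64)
    z = x_hat.astype(np.float64)
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2   # assumed conventional constants
    mx, mz = x.mean(), z.mean()
    cov = np.mean((x - mx) * (z - mz))
    return float(((2 * mx * mz + c1) * (2 * cov + c2))
                 / ((mx**2 + mz**2 + c1) * (x.var() + z.var() + c2)))

rng = np.random.default_rng(0)
ref = rng.integers(0, 200, size=(32, 32, 3))      # dummy reference image
shifted = ref + 16                                # constant offset, so MSE = 256
noisy = ref + rng.normal(0.0, 40.0, ref.shape)    # additive noise lowers the SSIM
psnr_shifted = psnr(ref, shifted)
ssim_noisy = ssim_global(ref, noisy)
```

A constant brightness offset lowers the PSNR but leaves the covariance term untouched, which is one illustration of how the two metrics respond to different kinds of distortion.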

4. Simulation Results

4.1. Simulation Setup

To simulate the modulation of image data into OFDM signals, MATLAB R2024b was employed. In the simulation, various modulation schemes of OFDM communication, including $M_1$-PSK ($M_1 = 2, 4, 8$) and $M_2$-QAM ($M_2 = 16, 32, 64$), were utilized to validate the proposed CDM. The simulations were conducted in an SNR range from 0 dB to 15 dB. A detailed configuration of the dataset is presented in Table 1.
In the training of the CDM, a U-Net with an input size of 32 × 32 × 3 was utilized to learn the reverse process. The hyperparameter $\beta_t$ of the variance schedule ranged from $10^{-4}$ to 0.035. The interpolation parameter $v_t$ was set to $v_t = \sqrt{(1 - \bar{\alpha}_t) / \sqrt{\bar{\alpha}_t}}$. The training was conducted with 50 diffusion steps, a learning rate of $10^{-4}$, and a dropout rate of 0.1. In the loss function $L_{CDM}$, the weight $\lambda$ for $L_{VLB}$ in the learning of $\Sigma_\theta(x_t, y, t)$ was set to $10^{-3}$ to prevent $L_{VLB}$ from overshadowing $L_{simple}$. A summary of the simulation parameters is shown in Table 2. The simulation was run on a desktop with an AMD PRO 5975WX CPU, 32 GB of DDR4 memory, and an NVIDIA GeForce RTX 4090 GPU.

4.2. Performance Evaluation

As presented in Figure 5, the performance of the proposed conditional diffusion model for OFDM-based image reconstruction was demonstrated across a range of modulation schemes.
In Figure 5a–c, the PSNR for the $M_1$-PSK schemes is presented under varying SNR conditions. In low-SNR conditions, specifically from 0 dB to 5 dB, the poor quality of the received images led to lower PSNR values. In this regime, the CDM improved the PSNR for all $M_1$-PSK schemes, by up to 5 dB at an SNR of 0 dB and by up to 8 dB at an SNR of 5 dB, demonstrating its effectiveness in restoring the quality of distorted OFDM-based images. However, in relatively high-SNR conditions around 15 dB, the received images in the BPSK and QPSK scenarios exhibited higher PSNR values than the images reconstructed by the proposed CDM. This can be attributed to the fact that the images received with BPSK and QPSK at an SNR of 15 dB were almost distortion-free, so the denoising process of the CDM unintentionally degraded their quality. Furthermore, the conventional diffusion model yielded lower PSNR values than the CDM under low-SNR conditions in BPSK and QPSK, and higher PSNR values than the CDM under high-SNR conditions. This confirms that the conditional diffusion model, which reflects the distorted signal, is effective in recovering distorted data in low-SNR environments where the received signal is highly distorted. Under high-SNR conditions, by contrast, the conventional diffusion model performs better because the received images are less distorted. Comparing learnable and fixed variances in the conditional diffusion model, the CDM-LV improved the PSNR by approximately 3 dB to 4 dB. This result highlights the effectiveness of learnable variance in the context of OFDM-based image reconstruction.
In Figure 5d–f, the PSNR for the $M_2$-QAM schemes is presented under varying SNR conditions. The proposed diffusion model demonstrated consistent improvements in PSNR across the entire SNR range, indicating its effectiveness in reconstructing the distorted OFDM-based image under $M_2$-QAM schemes. Furthermore, the CDM-LV approach improved the PSNR by around 4 dB at SNRs of 0 dB and 5 dB compared to the CDM-FV approach.
In Figure 6, the SSIM results are presented under varying SNR conditions. The CDM enhanced the SSIM of the received data, especially in low-SNR conditions. The CDM-LV approach also achieved an SSIM improvement of more than 0.1 at SNRs of 0 dB and 5 dB compared to the CDM-FV approach. This improvement implies that the proposed model can effectively preserve the structural quality of images affected by distortion, resulting in a closer alignment with the original data under low-SNR conditions.
In Figure 7, the variances corresponding to the time steps are presented for the conventional diffusion process ($\sigma_t$), the reverse process ($\tilde{\sigma}_t$), and the LV ($\Sigma_\theta(x_t, y, t)$). In the CDM for OFDM-based image restoration, the initial phase of the diffusion process adheres to the fixed variance $\sigma_t$. However, as the number of diffusion steps increases, the LV tends toward $\tilde{\sigma}_t$. This indicates that the flexible variance achieved through LV makes OFDM-based image reconstruction more robust than a fixed variance.
Furthermore, the proposed CDM reconstructed 50 images per second on the CIFAR-10 dataset, regardless of the modulation scheme. This reconstruction rate indicates the model's practical applicability in real-time scenarios with present-day computational capability, making it suitable for dynamic environments where timely image restoration is essential.
Figure 8 shows the reconstructed images for the different modulation schemes. At an SNR of 0 dB, significant distortion was observed across all modulation schemes. The images reconstructed by the proposed CDM are visibly improved over the received OFDM-based images. Overall, these results demonstrate that the CDM is highly effective in ensuring image fidelity, particularly in environments characterized by varying levels of distortion.

5. Conclusions

In this paper, a CDM was designed to improve OFDM-based image reconstruction in wireless communication systems. The proposed scheme addressed the limitations of traditional OFDM-based image reconstruction by leveraging the robustness of diffusion models and incorporating conditional factors that adapt to varying levels of OFDM-based image distortion. A variance learning mechanism further enhanced the reconstruction performance across various modulation schemes, and the simulation results demonstrated significant improvements, particularly in low-SNR scenarios.
The efficiency of the proposed CDM is expected to be particularly useful under the low-SNR conditions of LEO satellite communications, suggesting that it can be widely applied to autonomous systems for estimating communication parameters and reconstructing distorted images. However, challenges remain in addressing real-world LEO complexities such as the variable Doppler shift, atmospheric interference, and dynamic link conditions. Future research will focus on mitigating these issues and further optimizing the CDM for different communication scenarios.

Author Contributions

Conceptualization, S.K., J.K. (Jinwook Kim), and Y.S.; methodology, S.K., and J.K. (Jinwook Kim); simulation, S.K. and J.K. (Jinwook Kim); formal analysis, S.K., S.L., and J.S.; resources, S.K., B.H., and Y.S.; writing—original draft preparation, S.K., J.K. (Jinwook Kim), J.K. (Jeongho Kim); writing—review and editing, J.S., K.K., and Y.S.; visualization, S.K., J.K. (Jinwook Kim) and S.L.; supervision, Y.S. and J.K. (Jinyoung Kim); project administration, S.K. and Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available in [CIFAR-10] at https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf (accessed on 12 December 2024).

Acknowledgments

This work was partly supported by an Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Ministry of Science and ICT (MSIT) (No. 2021-0-00892-005, “Research on Advanced Physical-Layer Technologies of Low-Earth Orbit (LEO) Satellite Communication Systems for Ubiquitous Intelligence in Space”) and supported by the MSIT, Korea, under the ITRC (Information Technology Research Center) support program (IITP-2025-RS-2023-00258639) supervised by the IITP.

Conflicts of Interest

Author Youngghyu Sun was employed by the company SMARTEVER, Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Li, Y.; Cimini, L.J.; Sollenberger, N.R. Robust channel estimation for OFDM systems with rapid dispersive fading channels. IEEE Trans. Commun. 1998, 46, 902–915. [Google Scholar] [CrossRef]
  2. Wulich, D. Definition of efficient PAPR in OFDM. IEEE Commun. Lett. 2005, 9, 832–834. [Google Scholar] [CrossRef]
  3. Otsuka, H.; Tian, R.; Senda, K. Transmission performance of an OFDM-based higher-order modulation scheme in multipath fading channels. J. Sens. Actuator Netw. 2019, 8, 19. [Google Scholar] [CrossRef]
  4. Hagos, D.H.; Battle, R.; Rawat, D.B. Recent advances in generative AI and large language models: Current status, challenges, and perspectives. IEEE Trans. Artif. Intell. 2024, 5, 5873–5893. [Google Scholar] [CrossRef]
  5. Simion, A.-M.; Radu, Ș.; Florea, A.M. A review of generative adversarial networks for computer vision tasks. Electronics 2024, 13, 713. [Google Scholar] [CrossRef]
  6. Goyal, M.; Mahmoud, Q.H. A systematic review of synthetic data generation techniques using generative AI. Electronics 2024, 13, 3509. [Google Scholar] [CrossRef]
  7. Bengesi, S.; El-Sayed, H.; Sarker, M.K.; Houkpati, Y.; Irungu, J.; Oladunni, T. Advancements in generative AI: A comprehensive review of GANs, GPT, autoencoders, diffusion model, and transformers. IEEE Access 2024, 12, 69812–69837. [Google Scholar] [CrossRef]
  8. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. In Proceedings of the Advances in Neural Information Processing Systems 27 (NIPS 2014), Montreal, QC, Canada, 8–13 December 2014; pp. 2672–2680. [Google Scholar]
  9. Kingma, D.P.; Welling, M. An introduction to variational autoencoders. FNT Mach. Learn. 2019, 12, 307–392. [Google Scholar] [CrossRef]
  10. Kossale, Y.; Airaj, M.; Darouichi, A. Mode collapse in generative adversarial networks: An overview. In Proceedings of the 2022 8th International Conference on Optimization and Applications (ICOA), Genoa, Italy, 6–7 October 2022; pp. 1–6. [Google Scholar]
  11. Thanh-Tung, H.; Tran, T. Catastrophic forgetting and mode collapse in GANs. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–10. [Google Scholar]
  12. Daunhawer, I.; Sutter, T.M.; Chin-Cheong, K.; Palumbo, E.; Vogt, J.E. On the limitations of multimodal VAEs 2022. arXiv 2022, arXiv:2110.04121. [Google Scholar]
  13. Ho, J.; Jain, A.; Abbeel, P. Denoising diffusion probabilistic models. In Proceedings of the Advances in Neural Information Processing Systems 33 (NeurIPS 2020), Virtual, 6–12 December 2020; Volume 33, pp. 6840–6851. [Google Scholar]
  14. Sengupta, U.; Jao, C.; Bernacchia, A.; Vakili, S.; Shiu, D. Generative diffusion models for radio wireless channel modelling and sampling. In Proceedings of the GLOBECOM 2023—2023 IEEE Global Communications Conference, Kuala Lumpur, Malaysia, 4–8 February 2023; pp. 4779–4784. [Google Scholar]
  15. Wu, T.; Chen, Z.; He, D.; Qian, L.; Xu, Y.; Tao, M.; Zhang, W. CDDM: Channel denoising diffusion models for wireless semantic communications. IEEE Trans. Wirel. Commum. 2024, 23, 11168–11183. [Google Scholar] [CrossRef]
  16. Kim, M.; Fritschek, R.; Schaefer, R.F. Learning end-to-end channel coding with diffusion models. In Proceedings of the WSA & SCC 2023; 26th International ITG Workshop on Smart Antennas and 13th Conference on Systems, Communications, and Coding, Braunschweig, Germany, 27 February 2023; pp. 1–6. [Google Scholar]
  17. Letafati, M.; Ali, S.; Latva-Aho, M. Conditional denoising diffusion probabilistic models for data reconstruction enhancement in wireless communications. IEEE Trans. Mach. Learn. Commun. Netw. 2025, 3, 133–146. [Google Scholar] [CrossRef]
  18. Merluzzi, M.; Borsos, T.; Rajatheva, N.; Benczúr, A.A.; Farhadi, H.; Yassine, T.; Müeck, M.D.; Barmpounakis, S.; Strinati, E.C.; Dampahalage, D.; et al. The hexa-X project vision on artificial intelligence and machine learning-driven communication and computation co-design for 6G. IEEE Access 2023, 11, 65620–65648. [Google Scholar] [CrossRef]
  19. Van Huynh, N.; Wang, J.; Du, H.; Hoang, D.T.; Niyato, D.; Nguyen, D.N.; Kim, D.I.; Letaief, K.B. Generative AI for physical layer communications: A survey. IEEE Trans. Cogn. Commun. Netw. 2024, 10, 706–728. [Google Scholar] [CrossRef]
  20. Sohl-Dickstein, J.; Weiss, E.; Maheswaranathan, N.; Ganguli, S. Deep unsupervised learning using nonequilibrium thermodynamics. In Proceedings of the 32nd International Conference on Machine Learning, PMLR, Lille, France, 6–11 July 2015; pp. 2256–2265. [Google Scholar]
  21. Lu, Y.-J.; Wang, Z.-Q.; Watanabe, S.; Richard, A.; Yu, C.; Tsao, Y. Conditional diffusion probabilistic model for speech enhancement. In Proceedings of the ICASSP 2022—2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23–27 May 2022; pp. 7402–7406. [Google Scholar]
  22. Nichol, A.Q.; Dhariwal, P. Improved denoising diffusion probabilistic models. In Proceedings of the 38th International Conference on Machine Learning (ICML), Virtual, 18–24 July 2021; pp. 8162–8171. [Google Scholar]
Figure 1. Scheme for OFDM-based image reconstruction in wireless communication utilizing the CDM.
Figure 2. Comparison of the received signal $y$ under the modulation schemes $M_1$-PSK and $M_2$-QAM at different SNR levels (0 dB to 15 dB) for CIFAR-10 images.
Figure 3. Overview of wireless communication based on generative AI.
Figure 4. Conditional diffusion and reverse processes for OFDM-based image reconstruction.
Figure 5. PSNR performance of the conditional diffusion model across different modulation schemes. CDM-LV: conditional diffusion model with learnable variance, CDM-FV: conditional diffusion with fixed variance, DDPM: denoising diffusion probabilistic model, Rx Data: distorted image data at the receiver.
Figure 6. SSIM performance of the conditional diffusion model across different modulation schemes. CDM-LV: conditional diffusion model with learnable variance, CDM-FV: conditional diffusion with fixed variance, DDPM: denoising diffusion probabilistic model, Rx Data: distorted image data at the receiver.
Figure 7. Change in learnable variance with respect to the number of diffusion time steps. LV ($\Sigma_\theta(x_t, y, t)$): learnable variance, $\tilde{\sigma}_t$: variance of the conditional reverse process, $\sigma_t$: variance of the conventional diffusion process.
Figure 8. Comparison of reconstructed images for various modulation schemes under different SNR levels (0 dB to 15 dB). Each subfigure corresponds to a specific modulation scheme: (a) BPSK, (b) QPSK, (c) 8PSK, (d) 16QAM, (e) 32QAM, and (f) 64QAM. The first row of images represents the original images $x_0$, the second row depicts the received images $y$ at the receiver, and the third row illustrates the reconstructed images $\hat{x}_0$ resulting from the proposed scheme.
Table 1. Experimental setup for generating an OFDM-based received dataset.

| Experimental Parameters | Values |
|---|---|
| Image dataset | CIFAR-10 ($32 \times 32 \times 3$) |
| Training/test dataset | 50,000/10,000 images |
| Signal point ($N_{DFT}$) | 256 |
| Modulation scheme | $M_1$-PSK ($M_1 = 2, 4, 8$) & $M_2$-QAM ($M_2 = 16, 32, 64$) |
| SNR range | 0 dB–15 dB |
| Multipath taps ($M$) | 4 |
| Channel estimation | Least square (LS) |
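To make the setup in Table 1 concrete, the following is a minimal sketch of one OFDM link with 256 subcarriers, a 4-tap multipath channel, and least-squares (LS) channel estimation. Several details are assumptions that the source does not give: the cyclic prefix length, a single known pilot symbol sent before the data symbol, Rayleigh taps with equal average power, QPSK only, and hard-decision detection. All function names are hypothetical, and this is not the authors' dataset generator.

```python
import numpy as np

N_DFT = 256   # signal points per OFDM symbol (Table 1)
TAPS = 4      # multipath taps (Table 1)
CP_LEN = 16   # assumption: cyclic prefix length, not given in the source

def qpsk_mod(bits):
    """Map bit pairs to unit-power QPSK symbols (Gray mapping)."""
    pairs = bits.reshape(-1, 2)
    return ((1 - 2 * pairs[:, 0]) + 1j * (1 - 2 * pairs[:, 1])) / np.sqrt(2)

def qpsk_demod(symbols):
    """Hard-decision inverse of qpsk_mod."""
    pairs = np.stack((symbols.real < 0, symbols.imag < 0), axis=1)
    return pairs.astype(np.int64).reshape(-1)

def through_channel(freq_symbols, taps, noise_var, rng):
    """IFFT, add cyclic prefix, apply multipath and AWGN, strip prefix, FFT."""
    time = np.fft.ifft(freq_symbols) * np.sqrt(N_DFT)
    tx = np.concatenate((time[-CP_LEN:], time))
    rx = np.convolve(tx, taps)[: tx.size]
    noise = rng.standard_normal(rx.size) + 1j * rng.standard_normal(rx.size)
    rx = rx + np.sqrt(noise_var / 2) * noise
    return np.fft.fft(rx[CP_LEN:]) / np.sqrt(N_DFT)

def simulate_ber(snr_db, seed=0):
    """Bit error rate of one QPSK OFDM symbol after LS equalization."""
    rng = np.random.default_rng(seed)
    taps = rng.standard_normal(TAPS) + 1j * rng.standard_normal(TAPS)
    taps /= np.linalg.norm(taps)              # unit-energy channel
    noise_var = 10.0 ** (-snr_db / 10.0)      # signal power is 1

    pilot = qpsk_mod(rng.integers(0, 2, 2 * N_DFT))   # known at the receiver
    bits = rng.integers(0, 2, 2 * N_DFT)
    data = qpsk_mod(bits)

    # LS estimate: received pilot divided by the known pilot, per subcarrier.
    h_ls = through_channel(pilot, taps, noise_var, rng) / pilot
    equalized = through_channel(data, taps, noise_var, rng) / h_ls
    return float(np.mean(qpsk_demod(equalized) != bits))
```

Calling `simulate_ber(0.0)` and `simulate_ber(15.0)` gives the bit error rate for one channel realization at the two ends of the SNR range in Table 1; bit errors of this kind are what appear as pixel distortion in the received images.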
Table 2. Hyperparameters with model architecture for training the proposed CDM-LV.

| Hyperparameter | Values |
|---|---|
| Neural architecture | U-Net [Channel dim: 128, depth multiplier: 1, 2, 3, 4, ResNet block: 2] |
| Variance schedule ($\beta_t$) | [$10^{-4}$, 0.035] |
| Time step ($T$) | 50 |
| Batch size | 64 |
| Learning rate | $10^{-4}$ |
| Dropout | 0.1 |
| $L_{VLB}$ weight ($\lambda$) | $10^{-3}$ |
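The last row of Table 2 lists only a weight for the variational lower bound term. A plausible reading, assuming the training objective follows the hybrid loss of Nichol and Dhariwal's improved DDPM and reading the table's weight as $\lambda = 10^{-3}$, is:

$$
L_{\mathrm{hybrid}} = L_{\mathrm{simple}} + \lambda \, L_{\mathrm{VLB}}, \qquad \lambda = 10^{-3}
$$

Here $L_{\mathrm{simple}}$ denotes the noise-prediction mean squared error and $L_{\mathrm{VLB}}$ the variational lower bound, which is the term that provides a training signal for the learnable variance. The small weight keeps $L_{\mathrm{VLB}}$ from dominating $L_{\mathrm{simple}}$.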
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Kim, S.; Kim, J.; Sun, Y.; Seon, J.; Lee, S.; Hwang, B.; Kim, J.; Kim, K.; Kim, J. Enhancement of Image Reconstruction in Orthogonal Frequency-Division Multiplexing (OFDM)-Based Communication System Using Conditional Diffusion Model of Generative AI. Appl. Sci. 2025, 15, 3210. https://doi.org/10.3390/app15063210

