Channel Estimation Using Linear Regression with Bernoulli–Gaussian Noise

Chaudhary, Prerna; Manoj, B. R.; Chauhan, Isha; Bhatnagar, Manav

doi:10.3390/app151910590

Open AccessArticle

Channel Estimation Using Linear Regression with Bernoulli–Gaussian Noise

¹

Department of Electrical Engineering, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India

²

Department of Electronics and Electrical Engineering, India Institute of Technology, Guwahati, Assam 781039, India

³

Department of Electronics and Telecommunication, Politecnico di Torino, 10129 Turin, Italy

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(19), 10590; https://doi.org/10.3390/app151910590

Submission received: 14 August 2025 / Revised: 15 September 2025 / Accepted: 19 September 2025 / Published: 30 September 2025

Download

Browse Figures

Versions Notes

Abstract

This study introduces a novel mathematical framework for a machine learning algorithm tailored to address linear regression problems in the presence of non-Gaussian estimation noise. In particular, we focus on Bernoulli–Gaussian noise, which frequently occurs in practical scenarios such as wireless communication channels and signal processing systems. We apply our framework within the context of wireless systems, particularly emphasizing its utility in channel estimation tasks. This article demonstrates the efficacy of linear regression in estimating wireless channel fading coefficients under the influence of additive Bernoulli–Gaussian noise. Through comparative analysis with Gaussian noise scenarios, we underscore the indispensability of the proposed framework. Additionally, we evaluate the performance of the maximum-likelihood estimator using gradient descent, highlighting the superiority of estimators tailored to non-Gaussian noise assumptions over those relying solely on simplified Gaussian models.

Keywords:

Bernoulli–Gaussian noise; channel estimation; gradient descent; linear regression

1. Introduction

Channel estimation plays a pivotal role in wireless communication systems, facilitating accurate signal reception amidst the challenges posed by various types of noise. Reliable estimation of the channel enables more efficient data transmission by compensating for distortions caused by the communication environment [1,2,3,4]. The task becomes particularly challenging when the channel is influenced by non-Gaussian noise models, such as Bernoulli–Gaussian (BG) noise [5]. This type of noise introduces intermittent and sparse impulses that can severely degrade the performance of traditional channel estimation techniques. In most conventional approaches, such as those assuming additive white Gaussian noise (AWGN), linear regression-based estimators have been widely used due to their simplicity and effectiveness in estimating the channel coefficients. However, when the noise deviates from the Gaussian assumption, such as in scenarios with impulsive noise, these methods often fail to provide optimal performance [6,7].

The BG noise model, which combines a sparse noise process with Gaussian perturbations, is a more realistic representation for various wireless environments. It includes urban and industrial settings where bursty interference is common. Several real-world datasets and measurement campaigns have demonstrated the presence of impulsive noise that follows a BG distribution, particularly in vehicular and cognitive radio networks [8]. Studies like [9] also show that power-line noise consists of both Gaussian background noise and impulsive components, which can be modeled accurately using a BG distribution. Moreover, it has been shown that wireless channels often exhibit heavy-tailed noise characteristics beyond Gaussian assumptions, making BG and even Cauchy-based noise models highly relevant for practical deployment [10,11].

While least squares (LS) and linear minimum mean square error (LMMSE) estimators remain widely adopted for channel estimation under Gaussian noise, they fail to maintain accuracy in the presence of impulsive, heavy-tailed interference [12,13,14]. Robust regression approaches, such as M-estimators and expectation–maximization (EM) algorithms, can partially address impulsive effects, but they typically involve high computational burden and lack closed-form tractability in non-Gaussian settings. In contrast, our work introduces a regression framework that explicitly incorporates the BG distribution, leveraging the log-sum inequality (LSI) to derive a tractable negative log-likelihood bound and a gradient descent-based estimator. This enables robustness to impulsive noise while preserving interpretability and computational efficiency. Unlike deep learning-based approaches that demand large datasets and high training complexity, the proposed framework provides a lightweight alternative suitable for real-time deployment in wireless systems.

This paper proposes a novel approach to channel estimation tailored for scenarios characterized by BG noise. We explore the limitations of conventional methods and propose an enhanced linear regression approach that incorporates the characteristics of Bernoulli–Gaussian noise to improve the estimation accuracy [15,16,17]. Additionally, we compare the performance of this method with advanced techniques, such as the LSI-based estimation, which is designed to handle sparse noise distributions. Leveraging the principles of linear regression, our method aims to accurately estimate the channel response, even in the presence of significant noise distortion [18,19]. The proposed technique begins by modeling the received signal as a linear combination of the transmitted symbols convolved with channel impulse response with added noise, which is BG in nature. We then formulate the channel estimation problem as a linear regression task, where the objective is to learn the parameters of the channel response matrix from the received signal samples. Other regression-based machine learning and deep learning models comprising neural networks have been used in several research articles for channel estimation [16,20,21,22,23,24].

Furthermore, with the rapid evolution of wireless technologies towards 5G, 6G, and beyond, the role of accurate channel estimation has become even more critical. Emerging paradigms such as massive multiple-input multiple-output (MIMO), reconfigurable intelligent surface (RIS), integrated sensing and communication (ISAC), and millimeter-wave (mmWave) transmission demand highly reliable estimation methods to ensure low-latency and high-throughput performance [1,2]. In such advanced architectures, non-Gaussian noise, particularly impulsive BG interference, is becoming increasingly common due to spectrum sharing, dense deployment, and hybrid heterogeneous networks [5,9,25].

Recent studies on deep learning (DL)-based channel estimation [21,22] and unfolded iterative solvers [19,26] have demonstrated improved robustness to non-linear channel conditions. But these approaches often require large training datasets, suffer from high computational complexity, and lack interpretability [27]. In contrast, the proposed regression-based framework retains computational efficiency while explicitly modeling impulsive BG noise characteristics, striking an optimal balance between robustness and tractability.

The main contributions of this paper are summarized as follows:

We employ a robust regression framework to address the challenges posed by BG noise [28] which combines discrete and continuous characteristics. By incorporating suitable regularization techniques and optimization algorithms, our method effectively mitigates the impact of noise on the channel estimation process [29].
We derive a closed-form approximation of the maximum-likelihood estimator and design an iterative gradient descent approach optimized for sparse noise distributions.
We evaluate the performance of the proposed approach through extensive simulations and compare it with existing methods under non-Gaussian noise conditions. The results demonstrate the superior accuracy and robustness of our technique, particularly in scenarios with high noise levels and sparse channel impulse responses.
We provide insights into the applicability of the proposed framework for emerging wireless paradigms, including RIS-aided systems, massive MIMO, and ISAC-driven 6G communications. Overall, the proposed channel estimation method offers a promising solution for wireless communication systems operating in environments prone to Bernoulli–Gaussian noise, thus enhancing reliability and performance in practical deployment scenarios.

2. System Model

We consider an end-to-end multiple-input single-output communication system with additive non-Gaussian noise, as shown in Figure 1. In this system, the goal is to estimate the wireless channel by using linear regression techniques while dealing with additive Bernoulli–Gaussian noise. In emerging wireless systems, noise environments deviate significantly from Gaussian assumptions due to impulsive interference caused by device switching, electromagnetic emissions, and spectrum-sharing mechanisms. BG noise captures these dynamics effectively by combining a Gaussian background with sparse impulses, making it particularly suitable for power-line communication (PLC), RIS-assisted mmWave, and dense Internet of Things (IoT) deployment [5,9].

We consider a data stream represented by a real-valued unitary matrix given by

X = {[x_{1}, \dots, x_{M}]}^{T}

, where

{(\cdot)}^{T}

is the transpose operator and each column

x_{i}^{T} \in R^{1 \times N}

is transmitted through an N antenna system. As a result, we receive baseband signal

y = {[y_{1}, \dots, y_{M}]}^{T}

sampled at M observations, considering each instant of the time frame individually. The main focus of this work is to use parametric models, which entail choosing a function with parameters and figuring out the best parameter values

θ \in R^{N \times 1}

to describe the provided data. In general, the functional dependence of

y

on

X

is defined as:

y_{i} = x_{i}^{T} θ + ϵ_{i} \Rightarrow ϵ_{i} = y_{i} - x_{i}^{T} θ, i = {1, \dots, M},

(1)

where

ϵ_{i}

is complex BG noise. Here,

θ

indicates the unknown channel-related parameters.

We assume that the i-th sample of an independent variable,

x_{i}

, is supplied and that a dependent variable,

y_{i}

, is the noisy observation that is produced in parallel with the input. Although the framework is presented for a SISO setting for clarity, it can be extended to multi-antenna scenarios (MISO/MIMO) by generalizing the regression formulation across multiple transmit and/or receive antennas. Such extensions are reserved for future work. We use the signal-to-noise ratio (SNR) to express how strong a signal is in relation to noise levels; it is generalized as

SNR ≜ \frac{| x_{i}^{T} {θ |}^{2}}{E {| ϵ_{i} |^{2}}}, i = 1, . . ., M,

(2)

where

E {\cdot}

is the expectation operator. In order to evaluate the accuracy of the predictive models, mean square error (MSE) is used as a performance metric in machine learning [30]. It is the averaged mean over channel realization, including number of observations and training samples:

MSE = E \{\sum_{j = 1}^{N} {(θ_{j} - {\hat{θ}}_{j})}^{2}\},

(3)

where

θ_{j}

is the true value and

{\hat{θ}}_{j}

is its estimate.

2.1. Approximate Maximum-Likelihood Estimator Using Log-Sum Inequality

Let us formulate our problem from (1) by stating the probability density function (PDF) of the Bernoulli–Gaussian noise as [31]

p (ϵ_{i}) = w_{1} \frac{1}{\sqrt{2 π N_{o 1}}} \exp (- \frac{{ϵ_{i}}^{2}}{2 N_{o 1}}) + w_{2} \frac{1}{\sqrt{2 π N_{o 2}}} \exp (- \frac{{ϵ_{i}}^{2}}{2 N_{o 2}}) .

(4)

Here,

w_{1} + w_{2} = 1

, where

w_{i}, i \in {1, 2}

is the hyperparameter that gives the prior probability of the noise originating with variance

N_{o i}

. The Bernoulli–Gaussian (BG) noise model assumes that each noise sample

ϵ_{i}

is drawn from a mixture of two Gaussian distributions: with probability

w_{1}

,

ϵ_{i}

is Gaussian with variance

N_{o 1}

(background noise), and with probability

w_{2} = 1 - w_{1}

,

ϵ_{i}

is Gaussian with variance

N_{o 2}

(impulsive noise). Thus,

w_{2}

or

p_{i m p}

explicitly represents the impulse probability in the noise model. It represents the likelihood that a noise sample originates from the impulsive component. A useful measure of impulsive severity is the variance ratio,

r = \frac{N_{o 2}}{N_{o 1}},

(5)

which quantifies the relative strength of the impulsive noise with respect to the background noise. Larger values of r correspond to more dominant impulsive components, while smaller values approach the Gaussian background case. Robust performance, therefore, requires the estimator to maintain stable accuracy across a range of r and

p_{imp}

values. By denoting the Bernoulli–Gaussian distribution by BG and substituting (1) in (4), we get

p (y | X, θ) = BG (y | X θ, N_{o 1}, N_{o 2}, w_{1}, w_{2}),

(6)

where

BG (y | X θ, N_{o 1}, N_{o 2}, w_{1}, w_{2})

can be expressed by replacing

ϵ_{i}

with

y_{i} - x_{i}^{T} θ

in (4). Let us now move to parameter estimation in the following manner:

\begin{matrix} p (y | X, θ) & = & p (y_{1}, y_{2}, . . . . . . . ., y_{M} | x_{1}, x_{2}, . . . . . . . . . x_{M}, θ), \\ = & \prod_{i = 1}^{M} BG (y_{i} | x_{i}^{T} θ, N_{o 1}, N_{o 2}, w_{1}, w_{2}) . \end{matrix}

(7)

The goal of maximum-likelihood analysis is to maximize the likelihood function of the received data samples, considering the channel coefficients. The maximum-likelihood estimator (MLE) can be defined using the following equation:

θ_{ML} = \underset{θ}{arg max} p (y | X, θ) .

(8)

It is evident that maximizing the log-likelihood function is equivalent to minimizing the negative log-likelihood function. In order to minimize the obtained negative log-likelihood function, we need to use the gradient descent method. Hence, we take negative log-likelihood of the PDF as follows:

\begin{matrix} - log p (y | x_{i}, θ) & = & - log \prod_{i = 1}^{M} p (y_{i} | x_{i}, θ) = - \sum_{i = 1}^{M} log p (y_{i} | x_{i}, θ), \\ = & - \sum_{i = 1}^{M} log (\frac{w_{1}}{\sqrt{2 π N_{o 1}}} exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 1}}) \\ + \frac{w_{2}}{\sqrt{2 π N_{o 2}}} exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 2}})), \\ = & - log (\sum_{m = 1}^{2} \frac{w_{m}}{\sqrt{2 π N_{o m}}} exp (\frac{- {ϵ_{i}}^{2}}{2 N_{o m}})), \\ = & log (\frac{1}{\sum_{m = 1}^{2} \frac{w_{m}}{\sqrt{2 π N_{o m}}} exp (\frac{- {ϵ_{i}}^{2}}{2 N_{o m}})}) . \end{matrix}

(9)

It is instructive to observe that in the special case where

w_{1} = 1

and

w_{2} = 0

, the mixture distribution in (8) reduces to the Gaussian case. In this scenario, minimizing the negative log-likelihood is equivalent to minimizing

\sum_{i = 1}^{M} {(y_{i} - x_{i}^{T} θ)}^{2}

, which corresponds exactly to the ordinary least squares (OLS) criterion. This connection highlights that the log-sum inequality (LSI) provides a generalized framework, which reduces to OLS under purely Gaussian assumptions.

Furthermore, to avoid ambiguity, we explicitly note that

N_{o 1}

and

N_{o 2}

denote the variance terms of the two Gaussian components, and

w_{1}

and

w_{2}

are the corresponding mixture weights satisfying

w_{1} + w_{2} = 1

. The expansion shows explicitly how the LSI provides a tractable upper bound on the negative log-likelihood. The scaling terms

N_{o m}

and

w_{m}

are preserved throughout, which ensures that the derivation maintains consistency with the original probabilistic model.

Using the LSI

\sum a_{m} \log \frac{a_{m}}{b_{m}} \geq a \log \frac{a}{b}

, where

a = \sum a_{m}, b = \sum b_{m}

, we have

\sum a_{m} \sum \log \frac{a_{m}}{b_{m}} \geq \sum a_{m} \log \frac{\sum a_{m}}{\sum b_{m}}, where \sum a_{m} = 1 .

(10)

Considering the equi-probability of occurrence of noise with variance

N_{o m}

, we take the value of

a_{m} = 0.5

for

m \in {1, 2}

:

\sum b_{m} = \sum_{m = 1}^{2} w_{m} \frac{1}{\sqrt{2 π N_{o m}}} \exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o m}}) .

(11)

By substituting (11) into (10), we get

\begin{matrix} \sum_{m = 1}^{2} 0.5 \log (\frac{0.5}{\frac{w_{m}}{\sqrt{2 π N_{o m}}}} \exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o m}})) \geq \\ \log (\frac{1}{\sum_{m = 1}^{2} w_{m} \frac{1}{\sqrt{2 π N_{o m}}} \exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o m}})}) . \end{matrix}

(12)

We observe that the right-hand side of (12) matches that of (9). From the LSI, we can write (9) as

\begin{matrix} - \log p (y | x_{i}, θ) & \leq & \sum_{m = 1}^{2} 0.5 \log (\frac{0.5}{\frac{w_{m}}{\sqrt{2 π N_{o m}}} \exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o m}})}), \\ = & 0.5 \log (\frac{0.5}{\frac{w_{1}}{\sqrt{2 π N_{o 1}}} \exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 1}})}), \\ + 0.5 log (\frac{0.5}{\frac{w_{2}}{\sqrt{2 π N_{o 2}}} exp (- \frac{{(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 2}})}) \\ = & \log 0.5 - 0.5 (\log \frac{w_{1}}{\sqrt{2 π N_{o 1}}} + \log \frac{w_{2}}{\sqrt{2 π N_{o 2}}}) \\ + {(y_{i} - x_{i}^{T} θ)}^{2} (\frac{1}{2 N_{o 1}} + \frac{1}{2 N_{o 2}}) . \end{matrix}

(13)

Further, (13) can be written as

\begin{matrix} - \log p (y | x_{i}, θ) & \leq & \log 0.5 - 0.5 \log (\frac{w_{1} w_{2}}{2 π \sqrt{N_{o 1} N_{o 2}}}), \\ + {(y_{i} - x_{i}^{T} θ)}^{2} (\frac{1}{2 N_{o 1}} + \frac{1}{2 N_{o 2}}) . \end{matrix}

(14)

In terms of square distance between

y

and

X θ

, we can write (14) as

- \log p (y | x_{i}, θ) or L (θ) = (\frac{1}{2 N_{o 1}} + \frac{1}{2 N_{o 2}}) {(y_{i} - x_{i}^{T} θ)}^{2} + k,

(15)

L (θ) = (\frac{1}{2 N_{o 1}} + \frac{1}{2 N_{o 2}}) | | y - {X θ | |}^{2} + k,

(16)

where k is a relative constant. By computing the partial derivative of

L (θ)

with respect to

θ^{T}

, we get

\begin{matrix} \frac{\partial L (θ)}{\partial θ^{T}} & = & \frac{\partial [(\frac{1}{2 N_{o 1}} + \frac{1}{2 N_{o 2}}) {(y - X θ)}^{T} (y - X θ)]}{\partial θ^{T}}, \\ = & (\frac{1}{2 N_{o 1}} + \frac{1}{2 N_{o 2}}) (- X^{T} y + X^{T} X θ) . \end{matrix}

(17)

The expression for

θ

derived under the LSI can be further simplified to a sub-optimal, closed form from (17) as

θ = {(X^{T} X)}^{- 1} X^{T} y .

(18)

We aim to iteratively refine the estimate

θ

by utilizing the gradient of the least squares error (LSE) between the received signal and the estimated channel response until the error is minimized. Consequently, we iterate the suggested iterative gradient descent algorithm until the computed error falls below a predetermined threshold

(α)

. The variable

γ

denotes either the step size or the learning rate. Additionally, a residual threshold

(r e s)

indicates the convergence point for the estimated channel coefficients, determining when to stop further iteration.

2.2. Proposed Maximum-Likelihood Estimator (Numerical Optimization)

To perform channel estimation using the gradient descent method and linear regression, we first need to define the problem and establish the objective function. Given the PDF for the BG noise model in (4), we denote the i-th antenna’s received signal by

y_{i}

and transmitted signal by

x_{i}

.

Our aim is to perform channel estimation for BG noise (impulsive); by substituting the PDF

p (ϵ_{i})

in (4) into the negative log-likelihood function, we get

\begin{matrix} J (θ) = \sum_{i = 1}^{M} \log ( & \frac{w_{1}}{\sqrt{2 π N_{o 1}}} exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 1}}) \\ + \frac{w_{2}}{\sqrt{2 π N_{o 2}}} exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 2}})) . \end{matrix}

(19)

To obtain the gradient of

J (θ)

in (19), we differentiate each term with respect to

θ

. Because (19) contains a logarithm of a sum, we apply the chain rule. The resulting expression is shown in (20), and detailed intermediate steps are provided in Appendix A for clarity.

\begin{matrix} \nabla J (θ) = \frac{\partial J (θ)}{\partial θ} & = & - \sum_{i = 1}^{M} \frac{1}{\frac{w_{1}}{\sqrt{2 π N_{o 1}}} \exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 1}}) + \frac{w_{2}}{\sqrt{2 π N_{o 2}}} \exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 2}})} \\ \times (\frac{w_{1}}{\sqrt{2 π N_{o 1}}} \frac{\partial}{\partial θ} \exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 1}}) + \frac{w_{2}}{\sqrt{2 π N_{o 2}}} \frac{\partial}{\partial θ} \exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 2}})), \\ = & \sum_{i = 1}^{M} \frac{1}{N_{o 1} N_{o 2} (\sqrt{N_{o 2}} w_{1} e^{\frac{{(x_{i}^{T} θ - y_{i})}^{2}}{2 N_{o 2}}} + \sqrt{N_{o 1}} w_{2} e^{\frac{{(x_{i}^{T} θ - y_{i})}^{2}}{2 N_{o 1}}})} \\ (x_{i} (x_{i}^{T} θ - y_{i}) (N_{o 2}^{\frac{3}{2}} w_{1} e^{\frac{{(x_{i}^{T} θ - y_{i})}^{2}}{2 N_{o 2}}} + N_{o 1}^{\frac{3}{2}} w_{2} e^{\frac{{(x_{i}^{T} θ - y_{i})}^{2}}{2 N_{o 1}}})) . \end{matrix}

(20)

Further, we can update the parameter vector

θ

iteratively using the gradient descent algorithm:

θ^{k + 1} = θ^{k} - γ (\nabla J (θ^{k})),

(21)

where

θ^{k}

denotes the parameter vector at the k-th iteration,

γ

is the learning rate, and

\nabla J (θ^{k})

is the gradient of

J (θ)

evaluated at

θ^{k}

. Similar to Algorithm 1 we use the gradient descent method to update

θ

until the algorithm converges.

Algorithm 1 Iterative gradient descent algorithm.

Require:

\frac{\partial J (θ)}{\partial θ} = \sum_{i = 1}^{M} \frac{x_{i} (x_{i}^{T} θ - y_{i})}{(\sqrt{N_{o 2}} w_{1} e^{\frac{{(x_{i}^{T} θ - y_{i})}^{2}}{2 N_{o 2}}} + \sqrt{N_{o 1}} w_{2} e^{\frac{{(x_{i}^{T} θ - y_{i})}^{2}}{2 N_{o 1}}})}

\times \frac{N_{o 2}^{3 / 2} w_{1} e^{\frac{{(x_{i}^{T} θ - y_{i})}^{2}}{2 N_{o 2}}} + N_{o 1}^{3 / 2} w_{2} e^{\frac{{(x_{i}^{T} θ - y_{i})}^{2}}{2 N_{o 1}}}}{N_{o 1} N_{o 2}}

▹ Channel to be estimated

1:: Initialize $θ = [1, 1, \dots, 1]$
2:: Calculate initial error $= ∥ y - X^{T} {θ ∥}^{2}$
3:: $k = 0$
4:: repeat
5:: $\begin{matrix} θ^{k} = θ^{k - 1} - γ \sum_{i = 1}^{M} log (\frac{w_{1}}{\sqrt{2 π N_{o 1}}} exp (\frac{- {(y_{i} - x_{i}^{T} θ^{k - 1})}^{2}}{2 N_{o 1}}) + \frac{w_{2}}{\sqrt{2 π N_{o 2}}} exp (\frac{- {(y_{i} - x_{i}^{T} θ^{k - 1})}^{2}}{2 N_{o 2}})) \end{matrix}$
▹ Step Updation
6:: Calculate error $= ∥ y - X^{T} θ^{k} ∥^{2}$
7:: until $abs (∥ y - X^{T} θ^{k - 1} ∥^{2} - {∥ y - X^{T} θ^{k} ∥}^{2}) < α$
▹ Convergence

3. Methodology

In order to evaluate the proposed channel estimation framework under BG noise, we performed extensive Monte Carlo simulations. Each experiment consists of transmitting a sequence of pilot symbols through a wireless channel with additive BG noise. The following parameters were considered for the simulations (Table 1):

This setup (Table 1) enables a fair comparison between different channel estimation methods for identical noise environments and fading conditions. The MSE between the estimated channel coefficients and the true channel parameters was used as the primary performance metric.

LSI-based approximation is employed to handle the Bernoulli–Gaussian distribution, which consists of a mixture of Gaussian components with different variances. By applying the LSI, the negative log-likelihood function can be bounded and simplified, enabling tractable optimization.

The gradient descent algorithm is then applied to iteratively minimize the resulting loss function. Compared with closed-form MLE, gradient descent provides

Lower computational complexity in high-dimensional settings;
Robustness to impulsive noise samples;
Flexibility in adapting to varying noise variances.

Recently, machine learning and DL methods have been applied to channel estimation, showing promising results under complex fading and non-linear channel conditions [21,22,27]. However, these methods typically suffer from three limitations:

Requirement of large labeled training datasets;
High computational complexity during both training and inference;
Lack of interpretability, which restricts practical deployment.

In contrast, the proposed regression-based framework explicitly models the impulsive characteristics of BG noise while maintaining computational efficiency and interpretability. This ensures robustness in non-Gaussian noise environments without relying on data-intensive training procedures.

The least squares (LS) estimator is obtained by minimizing the quadratic loss function

J_{LS} (θ) = \frac{1}{2} {∥ y - X θ ∥}_{2}^{2},

(22)

which is optimal only under Gaussian noise assumptions [32]. The proposed estimator, BG-MLE, minimizes the negative log-likelihood of the Bernoulli–Gaussian mixture model.

J_{BG} (θ) = - \sum_{i = 1}^{M} log (w_{1} N (y_{i} - x_{i}^{T} θ; 0, N_{o 1}) + w_{2} N (y_{i} - x_{i}^{T} θ; 0, N_{o 2})),

(23)

where

w_{1} = 1 - p_{imp}

and

w_{2} = p_{imp}

denote the background and impulsive probabilities, respectively. Since the BG likelihood is non-convex, we employ an EM/IRLS approach in which the weights

γ_{1}

and

γ_{2}

act as sample-wise responsibilities, leading to iterative weighted least squares updates.

4. Results and Discussion

In this section, we illustrate the impact of Bernoulli–Gaussian noise on channel estimation for a single-antenna wireless communication system under Rayleigh fading. Figure 2 illustrates the relationship between the MSE and the SNR for four different parameter settings of the gradient descent algorithm in the context of Bernoulli–Gaussian noise using (20) for

N_{o 1} = 0.6

and

N_{o 2} = 0.2

(as in Algorithm 1). As the SNR increases, the MSE decreases, indicating improved estimation accuracy at higher SNR levels. The curve corresponding to

γ = 5 \times 10^{- 4}

and

r e s = 10^{- 4}

demonstrates a relatively unstable and high MSE across the entire SNR range, suggesting that this parameter setting fails to provide a good balance between convergence speed and stability. The curve for

γ = 10^{- 4}

and

r e s = 10^{- 5}

suggests moderate performance. The parameter setting

γ = 10^{- 3}

and

r e s = 10^{- 5}

indicates high accuracy but potentially slower convergence. The parameter setting

γ = 0.1

and

r e s = 10^{- 6}

results in significantly lower MSE values. This indicates that a smaller step size can lead to poor estimation accuracy and instability. The choice of step size

γ

and residual threshold (

r e s

) critically affects the convergence behavior of the gradient descent algorithm. A large step size (e.g.,

γ = 5 \times 10^{- 4}

) leads to oscillations and instability, yielding higher MSE across the SNR range. Conversely, a very small step size (e.g.,

γ = 10^{- 4}

) results in more stable convergence but at the cost of slower adaptation. The residual threshold

r e s

governs the stopping condition: loose thresholds may terminate prematurely, while overly strict values increase computational effort without significant accuracy gains. Hence, optimal performance requires balancing

γ

and

r e s

, as reflected in the curve corresponding to

γ = 10^{- 3}

and

r e s = 10^{- 5}

. Although

95 %

confidence intervals were also computed for these results, they are not plotted in Figure 2 in order to maintain clarity of the comparison of step size versus residual threshold. The intervals confirm that the trends observed are statistically reliable.

Figure 3 illustrates the performance of MSE versus SNR for Gaussian noise and Bernoulli–Gaussian noise with two different sets of variance. In order to use the proposed estimator given in (20) for Gaussian noise, we ignore the Bernoulli term by setting its probability of impulses to zero. The curve corresponding to a variance of

N_{0} = 0.9

consistently shows a higher MSE compared with the curve with a variance of

N_{0} = 0.6

across the entire SNR range. This indicates that Gaussian noise with a smaller variance yields better (lower MSE) at higher SNR values. The MSE for Gaussian noise follows a predictable pattern. Similarly, the MSE for Bernoulli–Gaussian noise decreases with the increase in the SNR but at a rate different from that of Gaussian noise. The curves for Bernoulli–Gaussian noise with variances of

N_{o 1} = 0.9

and

N_{o 2} = 0.5

show a higher MSE than those with variances of

N_{o 1} = 0.6

and

N_{o 2} = 0.2

at higher SNR values, indicating better performance for higher variance. This figure highlights the impact of Bernoulli–Gaussian noise on MSE performance, showing how the Bernoulli-distributed impulses affect the overall error compared with Gaussian noise.

Figure 4 illustrates the performance of two channel estimation methods, approximate MLE using the LSI and the proposed MLE using the iterative gradient descent method by seeking derivatives from (17) and (20) for Bernoulli–Gaussian noise, for two sets of noise variances. The MSE is plotted against the SNR for each estimation technique. For both variance sets

(N_{o 1} = 0.9, N_{o 2} = 0.5)

and

(N_{o 1} = 0.6, N_{o 2} = 0.2)

, the proposed MLE consistently outperforms the LSI method across the entire range of SNR values. Notably, the MSE for the proposed MLE decreases more rapidly with the increase in the SNR, particularly for negative SNR values, demonstrating better robustness under low-SNR conditions. This aligns with the expectation that the proposed MLE method offers more reliable channel estimation across all data points. In contrast, the approximate MLE using LSI curves shows a higher MSE at low SNR values, indicating a lower convergence rate as the noise becomes more pronounced. However, with the SNR

> 10

dB, the performance of both estimators begins to converge, though the proposed MLE still maintains a lower MSE. This suggests that while LSI methods might be suitable in higher-SNR regimes, they are less effective under noisier conditions, as evidenced by the flatter decay in MSE for lower SNR values. The results indicate that the proposed MLE can better adapt to varying noise conditions, making it a preferred choice for practical communication systems where maintaining accurate channel estimates across a wide range of SNR values is crucial.

Convergence is analyzed both in terms of the excess negative log-likelihood and parameter MSE. The LS baseline optimizes the quadratic loss function, while BG-MLE optimizes the Bernoulli–Gaussian log-likelihood through iterative reweighting. Figure 5 provides a detailed convergence analysis of the proposed estimator, BG-MLE, in comparison with the LS baseline. Figure 5a shows the normalized excess negative log-likelihood as a function of the iteration index. Although the LS method exhibits a steeper initial decline, this rapid cost reduction is deceptive, as it does not guarantee accurate estimation under impulsive noise. Figure 5b presents the normalized parameter MSE versus iterations. Here, the advantage of BG-MLE becomes evident: the proposed method reduces the estimation error by over two orders of magnitude within the first few iterations and remains stable thereafter. In contrast, the LS method stagnates at a high error level, failing to capture the channel parameters in the presence of Bernoulli–Gaussian noise. These results demonstrate that BG-MLE provides both reliable convergence and significantly superior estimation accuracy compared with the LS estimator, thereby validating its robustness under non-Gaussian noise conditions.

Figure 6 illustrates the robustness of the proposed estimator, BG-MLE, compared with the LS baseline under varying impulsive noise conditions. Figure 6a plots the parameter MSE versus SNR for different impulsive variance ratios

r = N_{o 2} / N_{o 1}

at a fixed impulse probability

p_{imp} = 0.2

. As expected, all curves exhibit decreasing MSE with increasing SNR. However, BG-MLE achieves consistently lower error across the full SNR range and remains resilient as r increases from 5 to 30. The LS baseline performs significantly worse, particularly in low- to mid-SNR regimes, highlighting its sensitivity to impulsive components. This demonstrates that BG-MLE maintains robust performance even as the impulsive-to-background noise power ratio varies. Figure 6b presents the MSE versus impulse probability

p_{imp}

at a fixed SNR of 10 dB and a variance ratio

r = 15

. Here, BG-MLE sustains low MSE across the full range of

p_{imp}

values, while the LS estimator shows consistently higher error and negligible adaptation to the impulsive component. The inclusion of 95% confidence intervals (shaded regions) confirms the statistical reliability of the trends observed. The results clearly demonstrate that BG-MLE not only achieves superior accuracy compared with LS, but also exhibits robustness against variations in both impulsive variance ratio and impulse probability, thereby addressing the reviewer’s concern regarding additional performance metrics beyond MSE versus SNR.

All reported results are obtained by averaging over 1000 independent Monte Carlo realizations to ensure statistical robustness. Results are averaged over 1000 independent runs with seed control (rng(42) in MATLAB). For each performance curve,

95 %

confidence intervals were also computed. The error bars are explicitly shown in Figure 5 and Figure 6.

The computational complexity of the proposed estimator is significantly lower than that of LSI-based MLE approaches. The closed-form MLE requires matrix inversion, which has complexity

O (N^{3})

, where N is the number of channel coefficients. In contrast, the proposed gradient descent estimator has complexity

O (k N^{2})

per iteration, where k is the number of iterations required for convergence. This makes the approach suitable for practical real-time implementations.

5. Conclusions

This work presented a robust channel estimation framework for wireless communication systems operating in BG noise environments. We developed an enhanced linear regression-based approach that explicitly incorporates the impulsive characteristics of BG noise and evaluated its performance against conventional Gaussian noise-based estimators and advanced maximum-likelihood methods. Through detailed analysis and extensive simulations, we demonstrated that the proposed framework achieves superior estimation accuracy and robustness, particularly in low-SNR regimes and scenarios characterized by sparse channel impulse responses. The results confirmed that the gradient descent-based estimator, derived from the log-sum inequality formulation, provides a favorable trade-off between accuracy and computational complexity. Specifically, the proposed approach achieves significant MSE improvements compared with Gaussian-based estimators while maintaining lower complexity than LSI-based closed-form MLE solutions. Owing to its computational efficiency and robustness under impulsive noise, the proposed framework is well-suited for real-time and resource-constrained wireless systems.

Beyond its methodological contributions, this study highlights the broader significance of BG noise modeling in modern communication environments. In applications such as power-line communications, vehicular ad hoc networks, and dense IoT deployment, impulsive interference frequently dominates, making Gaussian assumptions unrealistic. The proposed approach offers a practical solution that can be extended to advanced wireless paradigms, including massive MIMO, RIS, and mmWave transmission, and ISAC architectures envisioned for 6G systems. While the current work focuses on simulation-based evaluation, future research should extend this framework to experimental validation using real-world datasets. Moreover, hybrid approaches that combine machine learning models with regression-based estimation could further enhance adaptability under highly dynamic non-Gaussian environments. Another promising direction lies in extending the proposed methodology to multi-antenna and multi-user scenarios, as well as integrating it with adaptive pilot design strategies.

In summary, the proposed linear regression-based estimator provides an analytically tractable, computationally efficient, and practically relevant solution for robust channel estimation under Bernoulli–Gaussian noise. By bridging the gap between traditional statistical signal processing and emerging requirements of 5G/6G networks, this work contributes a valuable step towards reliable communication in non-Gaussian wireless environments. Future extensions of this work will explore hybrid regression-based methods and comparisons under Cauchy noise environments to further strengthen generalizability across diverse non-Gaussian noise models.

Author Contributions

Conceptualization, P.C., B.R.M., and M.B.; methodology, P.C., I.C., and M.B.; software, P.C.; validation, I.C.; formal analysis, P.C.; investigation, P.C., B.R.M., I.C., and M.B.; resources, B.R.M. and M.B.; data curation, M.B.; writing—original draft preparation, P.C.; writing—review and editing, I.C.; visualization, B.R.M.; supervision, M.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research study received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

In this appendix, we present the detailed derivation of the gradient expression for the proposed maximum-likelihood estimator under Bernoulli–Gaussian (BG) noise. Starting from the log-likelihood function in (19), we have

J (θ) = \sum_{i = 1}^{M} log (\frac{w_{1}}{\sqrt{2 π N_{o 1}}} exp (- \frac{{(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 1}}) + \frac{w_{2}}{\sqrt{2 π N_{o 2}}} exp (- \frac{{(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 2}})),

(A1)

where

$w_{1}$ and $w_{2}$ are the Bernoulli mixing probabilities, satisfying $w_{1} + w_{2} = 1$ ;
$N_{o 1}$ and $N_{o 2}$ are the noise variances;
$y_{i}$ is the received signal;
$x_{i}$ is the transmitted pilot vector;
$θ$ denotes the unknown channel coefficient vector.

We define the per-sample mixture term

S_{i} (θ) ≜ \frac{w_{1}}{\sqrt{2 π N_{o, 1}}} exp (- \frac{{(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o, 1}}) + \frac{w_{2}}{\sqrt{2 π N_{o, 2}}} exp (- \frac{{(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o, 2}}) .

Using the chain rule,

\frac{\partial}{\partial θ} exp (- \frac{{(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o, m}}) = exp (- \frac{{(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o, m}}) \frac{(y_{i} - x_{i}^{T} θ) x_{i}}{N_{o, m}} .

\begin{matrix} \nabla J (θ) = \frac{\partial J (θ)}{\partial θ} & = & - \sum_{i = 1}^{M} \frac{1}{\frac{w_{1}}{\sqrt{2 π N_{o 1}}} \exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 1}}) + \frac{w_{2}}{\sqrt{2 π N_{o 2}}} \exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 2}})} \\ \times (\frac{w_{1}}{\sqrt{2 π N_{o 1}}} \frac{\partial}{\partial θ} \exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 1}}) + \frac{w_{2}}{\sqrt{2 π N_{o 2}}} \frac{\partial}{\partial θ} \exp (\frac{- {(y_{i} - x_{i}^{T} θ)}^{2}}{2 N_{o 2}})), \\ = & \sum_{i = 1}^{M} \frac{1}{N_{o 1} N_{o 2} (\sqrt{N_{o 2}} w_{1} e^{\frac{{(x_{i}^{T} θ - y_{i})}^{2}}{2 N_{o 2}}} + \sqrt{N_{o 1}} w_{2} e^{\frac{{(x_{i}^{T} θ - y_{i})}^{2}}{2 N_{o 1}}})} \\ (x_{i} (x_{i}^{T} θ - y_{i}) (N_{o 2}^{\frac{3}{2}} w_{1} e^{\frac{{(x_{i}^{T} θ - y_{i})}^{2}}{2 N_{o 2}}} + N_{o 1}^{\frac{3}{2}} w_{2} e^{\frac{{(x_{i}^{T} θ - y_{i})}^{2}}{2 N_{o 1}}})) . \end{matrix}

(A2)

Final Result (A2) matches the expression given in (20) of the main manuscript, but here we have shown the intermediate derivation steps for clarity.

Appendix A.1. Equivalence (Weighted-LS Form)

Starting from the objective in Equation (19), we have

γ_{i, m} ≜ \frac{\frac{w_{m}}{\sqrt{2 π N_{o, m}}} exp (- r_{i} {(θ)}^{2} / (2 N_{o, m}))}{\sum_{ℓ = 1}^{2} \frac{w_{ℓ}}{\sqrt{2 π N_{o, ℓ}}} exp (- r_{i} {(θ)}^{2} / (2 N_{o, ℓ}))}, r_{i} (θ) = x_{i}^{T} θ - y_{i} .

Let

r_{i} (θ) = x_{i}^{T} θ - y_{i}

and

w_{i} (θ) = γ_{i, 1} / N_{o, 1} + γ_{i, 2} / N_{o, 2}

. Then

\nabla J (θ) = \sum_{i = 1}^{M} r_{i} (θ) x_{i} w_{i} (θ) = X^{T} W (θ) (X θ - y),

with

W (θ) = diag (w_{i} (θ))

.

Appendix A.2. Sanity Checks

AWGN limit: IIf

w_{2} \to 0

(or

N_{o, 1} = N_{o, 2} = N_{o}

), then

γ_{i, 1} \to 1

and

\nabla J (θ) = \frac{1}{N_{0}} X^{T} (X θ - y)

. Symmetry: Swapping

(w_{1}, N_{o, 1})

with

(w_{2}, N_{o, 2})

leaves

\nabla J (θ)

unchanged since it depends on

γ_{i, 1} / N_{o, 1} + γ_{i, 2} / N_{o, 2}

.

Appendix A.3. Numerical Stability (Log-Sum-Exp)

For

S_{i} (θ) = \sum_{m = 1}^{2} \frac{w_{m}}{\sqrt{2 π N_{o, m}}} exp (- r_{i} {(θ)}^{2} / (2 N_{o, m}))

, let

a_{i} = {max}_{m} {- r_{i}^{2} / (2 N_{o, m}) - \frac{1}{2} log (2 π N_{o, m}) + log w_{m}}

, and compute

S_{i} = e^{a_{i}} \sum_{m = 1}^{2} exp (- r_{i}^{2} / (2 N_{o, m}) - \frac{1}{2} log (2 π N_{o, m}) + log w_{m} - a_{i}),

using the same shift in

γ_{i, m}

. This is algebraically equivalent and avoids underflow.

References

Wang, Z.; Liu, L.; Cui, S. Channel Estimation for Intelligent Reflecting Surface Assisted Multiuser Communications: Framework, Algorithms, and Analysis. IEEE Trans. Wirel. Commun. 2020, 19, 6607–6620. [Google Scholar] [CrossRef]
Chen, J.; Liang, Y.; Cheng, H.V.; Yu, W. Channel Estimation for Reconfigurable Intelligent Surface Aided Multi-User mmWave MIMO Systems. IEEE Trans. Wirel. Commun. 2023, 22, 6853–6869. [Google Scholar] [CrossRef]
Chatelier, B.; Le Magoarou, L.; Redieteab, G. Efficient Deep Unfolding for SISO-OFDM Channel Estimation. In Proceedings of the IEEE International Conference on Communications (ICC), Rome, Italy, 28 May–1 June 2023; pp. 3450–3455. [Google Scholar] [CrossRef]
Cuc, A.; Morgoş, F.L.; Grava, A.; Grava, C. Estimation of the Impulse Response of the AWGN Channel with ISI within an Iterative Equalization and Decoding System That Uses LDPC Codes. Entropy 2024, 26, 720. [Google Scholar] [CrossRef]
Yu, Y.; Yang, L.; Shen, Y. Bernoulli–Gaussian Model with Model Parameter Estimation. J. Surv. Eng. 2024, 150, 1518–1535. [Google Scholar] [CrossRef]
Wilson, A.M.; Panigrahi, T. Robust SIMO Channel Estimation Under Impulsive Noise with “Fair” Cost Function. In Proceedings of the Recent Trends in Intelligent Systems and Next Generation Wireless Communication (IIWCS 2024), Rourkela, India, 16–17 February 2024; Lecture Notes in Networks and Systems. Volume 1329, pp. 265–275. [Google Scholar] [CrossRef]
Zhang, M.; Zhou, X.; Wang, C. A Novel Noise Suppression Channel Estimation Method Based on Adaptive Weighted Averaging for OFDM Systems. Symmetry 2019, 11, 997. [Google Scholar] [CrossRef]
Zhao, X.; Li, F. Sparse Bayesian Compressed Spectrum Sensing Under Gaussian Mixture Noise. IEEE Trans. Veh. Technol. 2018, 67, 6087–6097. [Google Scholar] [CrossRef]
Mohan, V.; Mathur, A. Pulse Jamming in PLC Over Log-Normal Channel Gain with Bernoulli–Gaussian Additive Noise. IEEE Commun. Lett. 2023, 27, 2603–2607. [Google Scholar] [CrossRef]
Gülgün, Z.; Larsson, E.G. Massive MIMO with Cauchy Noise: Channel Estimation, Achievable Rate and Data Decoding. arXiv 2023, arXiv:2307.02724. [Google Scholar] [CrossRef]
Lev, O.; Khina, A. Energy-Limited Joint Source–Channel Coding of Gaussian Sources over Gaussian Channels with Unknown Noise Level. Entropy 2023, 25, 1522. [Google Scholar] [CrossRef]
Tao, C.; Liu, B. Online Autonomous Motion Control of Communication-Relay UAV with Channel Prediction in Dynamic Urban Environments. Drones 2024, 8, 771. [Google Scholar] [CrossRef]
Zhou, Z.; Chen, Z.; Wang, B.; Zhao, Y.; Lou, Y. Adaptive Tracking Method for Time-Varying Underwater Acoustic Channel Based on Dynamic Gaussian Window. J. Mar. Sci. Eng. 2024, 12, 2185. [Google Scholar] [CrossRef]
Tian, T.; Yang, K.; Wu, F.Y.; Zhang, Y. Channel Estimation for Underwater Acoustic Communications in Impulsive Noise Environments: A Sparse, Robust, and Efficient Alternating Direction Method of Multipliers-Based Approach. Remote Sens. 2024, 16, 1380. [Google Scholar] [CrossRef]
Mathur, A.; Bhatnagar, M.R.; Panigrahi, B.K. Maximum Likelihood Decoding of QPSK Signal in Power Line Communications Over Nakagami-m Additive Noise. In Proceedings of the 2015 IEEE International Symposium on Power Line Communications and Its Applications (ISPLC), Austin, TX, USA, 29 March–1 April 2015; pp. 7–12. [Google Scholar] [CrossRef]
Chaudhary, P.; Chauhan, I.; Manoj, B.R.; Bhatnagar, M.R. Linear Regression-Based Channel Estimation for Non-Gaussian Noise. In Proceedings of the 2024 IEEE 99th Vehicular Technology Conference (VTC2024-Spring), Singapore, 24–27 June 2024; pp. 1–6. [Google Scholar] [CrossRef]
Rong, J.; Zhang, J.; Duan, H. Robust Sparse Bayesian Learning Based on the Bernoulli–Gaussian Model. Digit. Signal Process. 2023, 136, 104013. [Google Scholar] [CrossRef]
Ohtsuki, T. Machine Learning in 6G Wireless Communications. IEICE Trans. Commun. 2023, E106-B, 75–83. [Google Scholar] [CrossRef]
Weththasinghe, K.; Jayawickrama, B.; He, Y. Machine Learning-Based Channel Estimation for 5G New Radio. IEEE Wirel. Commun. Lett. 2024, 13, 1133–1137. [Google Scholar] [CrossRef]
Balevi, E.; Doshi, A.; Andrews, J.G. Massive MIMO Channel Estimation with an Untrained Deep Neural Network. IEEE Trans. Wirel. Commun. 2020, 19, 2079–2090. [Google Scholar] [CrossRef]
Gao, J.; Zhong, C.; Li, G.Y.; Soriaga, J.B.; Behboodi, A. Deep Learning-Based Channel Estimation for Wideband Hybrid mmWave Massive MIMO. IEEE Trans. Commun. 2023, 71, 3679–3693. [Google Scholar] [CrossRef]
Kim, W.; Ahn, Y.; Kim, J.; Shim, B. Towards Deep Learning-Aided Wireless Channel Estimation and Channel State Information Feedback for 6G. J. Commun. Netw. 2023, 25, 61–75. [Google Scholar] [CrossRef]
Manoj, B.R.; Sadeghi, M.; Larsson, E.G. Adversarial Attacks on Deep Learning Based Power Allocation in a Massive MIMO Network. In Proceedings of the IEEE International Conference on Communications (ICC), Montreal, QC, Canada, 14–23 June 2021; pp. 1–6. [Google Scholar] [CrossRef]
Hu, Z.; Xue, J.; Li, F.; Zhao, Q.; Meng, D.; Xu, Z. Robust Channel Estimation Based on the Maximum Entropy Principle. Sci. China Inf. Sci. 2023, 66, 222304. [Google Scholar] [CrossRef]
Park, H.; Choi, J. Binary Hypothesis Testing-Based Low-Complexity Beamspace Channel Estimation for mmWave Massive MIMO Systems. arXiv 2025, arXiv:2508.01007. [Google Scholar] [CrossRef]
Khichar, S.; Santipach, W.; Wuttisittikulkij, L.; Parnianifard, A.; Chaudhary, S. Efficient Channel Estimation in OFDM Systems Using a Fast Super-Resolution CNN Model. J. Sens. Actuator Netw. 2024, 13, 55. [Google Scholar] [CrossRef]
Luan, D.; Thompson, J. Achieving Robust Channel Estimation Neural Networks. arXiv 2025, arXiv:2507.12630. [Google Scholar] [CrossRef]
Finamore, W.A.; da Silva Pinho, M.; Sharma, M.; Ribeiro, M.V. Modeling Noise as a Bernoulli–Gaussian Process. J. Commun. Inf. Syst. 2023, 38, 174–186. [Google Scholar] [CrossRef]
Le Saux, B.; Helard, M. Iterative Channel Estimation Based on Linear Regression for a MIMO-OFDM System. In Proceedings of the IEEE International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), Montreal, QC, Canada, 19–21 June 2006; pp. 356–361. [Google Scholar] [CrossRef]
Hodson, T.O.; Over, T.M.; Foks, S.S. Mean Squared Error, Deconstructed. J. Adv. Model. Earth Syst. 2021, 13, e2021MS002681. [Google Scholar] [CrossRef]
Deisenroth, M.P.; Faisal, A.A.; Ong, C.S. Mathematics for Machine Learning; Cambridge University Press: Cambridge, UK, 2020. [Google Scholar]
Hussein, W.; Audah, K.; Noordin, N.K.; Kraiem, H.; Flah, A.; Fadlee, M.; Ismail, A. Least Square Estimation-Based Different Fast Fading Channel Models in MIMO-OFDM Systems. Int. Trans. Electr. Energy Syst. 2023, 2023, 5547634. [Google Scholar] [CrossRef]

Figure 1. Block diagram for linear regression-based channel estimation.

Figure 2. MSE versus SNR curves for the proposed linear regression-based channel estimation with varying values of step size and residual threshold under Bernoulli-Gaussian noise.

Figure 3. Comparison of MSE versus SNR for Gaussian noise and Bernoulli-Gaussian noise for different variances.

Figure 4. Comparison of MSE versus SNR by using iterative gradient descent method by seeking derivatives from (16) and (19) for Bernoulli-Gaussian noise as a function of variances.

Figure 5. Convergence behavior of the proposed estimator, BG-MLE, compared with the LS baseline under Bernoulli–Gaussian noise. (a) Normalized excess negative log-likelihood versus iterations. (b) Normalized parameter MSE versus iterations.

Figure 6. Robustness of the proposed estimator, BG-MLE, compared with the LS baseline under Bernoulli–Gaussian noise. (a) Parameter MSE versus SNR for different impulsive variance ratios

r = N_{o 2} / N_{o 1}

at fixed impulse probability

p_{imp} = 0.2

. (b) Parameter MSE versus impulse probability

p_{imp}

at 10 dB SNR with

r = 15

.

Figure 6. Robustness of the proposed estimator, BG-MLE, compared with the LS baseline under Bernoulli–Gaussian noise. (a) Parameter MSE versus SNR for different impulsive variance ratios

r = N_{o 2} / N_{o 1}

at fixed impulse probability

p_{imp} = 0.2

. (b) Parameter MSE versus impulse probability

p_{imp}

at 10 dB SNR with

r = 15

.

Table 1. Simulation parameters for channel estimation.

Parameter	Symbol	Value/Range
Number of trials	M	1000 realizations
Noise variances	$N_{o 1}, N_{o 2}$	0.2, 0.6, 0.9
Step size (learning rate)	$γ$	$10^{- 4}$ to $10^{- 1}$
Residual threshold	$α$	$10^{- 4}$ to $10^{- 6}$
Signal-to-Noise Ratio (SNR)	—	$- 10$ dB to 30 dB
Performance metric	MSE	Averaged over trials

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chaudhary, P.; Manoj, B.R.; Chauhan, I.; Bhatnagar, M. Channel Estimation Using Linear Regression with Bernoulli–Gaussian Noise. Appl. Sci. 2025, 15, 10590. https://doi.org/10.3390/app151910590

AMA Style

Chaudhary P, Manoj BR, Chauhan I, Bhatnagar M. Channel Estimation Using Linear Regression with Bernoulli–Gaussian Noise. Applied Sciences. 2025; 15(19):10590. https://doi.org/10.3390/app151910590

Chicago/Turabian Style

Chaudhary, Prerna, B. R. Manoj, Isha Chauhan, and Manav Bhatnagar. 2025. "Channel Estimation Using Linear Regression with Bernoulli–Gaussian Noise" Applied Sciences 15, no. 19: 10590. https://doi.org/10.3390/app151910590

APA Style

Chaudhary, P., Manoj, B. R., Chauhan, I., & Bhatnagar, M. (2025). Channel Estimation Using Linear Regression with Bernoulli–Gaussian Noise. Applied Sciences, 15(19), 10590. https://doi.org/10.3390/app151910590

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Channel Estimation Using Linear Regression with Bernoulli–Gaussian Noise

Abstract

1. Introduction

2. System Model

2.1. Approximate Maximum-Likelihood Estimator Using Log-Sum Inequality

2.2. Proposed Maximum-Likelihood Estimator (Numerical Optimization)

3. Methodology

4. Results and Discussion

5. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Conflicts of Interest

Appendix A

Appendix A.1. Equivalence (Weighted-LS Form)

Appendix A.2. Sanity Checks

Appendix A.3. Numerical Stability (Log-Sum-Exp)

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI