1. Introduction
Time–frequency distributions (TFDs) are essential tools for capturing the dynamic behavior of non-stationary signals across the time–frequency (TF) domain. While quadratic TFDs (QTFDs) are commonly used, they often introduce unwanted oscillatory artifacts, known as cross-terms, especially in signals with multiple components [1]. Although suppressing these cross-terms in the ambiguity function (AF) can be effective, it often compromises the quality of auto-terms, which are crucial components of the signal [1,2].
To address the drawbacks of conventional TF methods, various advanced and nonlinear methods have been developed, including the synchrosqueezing transform [3,4], the synchroextracting transform [5], and sparse TF representations leveraging compressive sensing (CS) [6,7]. Among these, sparse TF representation is a commonly used method [8], with numerous approaches available for solving sparse representation problems [9,10]. This work focuses on TFD reconstruction using the AF, which provides a suitable framework for compressively sampling non-stationary signals [7,11,12]. Recent studies have demonstrated the effectiveness of such sparse TFDs in achieving high resolution of the auto-terms while mitigating cross-terms and noise across diverse signals [6,7,13,14,15,16,17].
However, selecting the optimal regularization parameter in CS-based TFD reconstruction is a delicate and computationally expensive task [7,11,15]. The regularization parameter should clear the TFD of unwanted artifacts while leaving only the signal’s auto-terms: too low a value can result in unwanted reconstruction of interference, while too high a value can lead to the loss of auto-terms, i.e., overly sparse auto-term structures [12,14,15,18,19,20,21,22]. In practice, the exact positions of, and distinction between, auto-terms and cross-terms are often unknown, particularly in real-world signals with multiple, unpredictable components and noise. For many years, the selection of this parameter has been handled manually or experimentally, a process that is subjective, imprecise, and time-consuming [11,15,23,24].
Recent research has sought to address this issue by developing measures to optimize sparse TFDs [12,18,21]. It has been shown that the loss of signal auto-terms complicates the use of global concentration and sparsity measures for evaluating and optimizing reconstructed TFDs, as these measures may artificially favor oversparse TFDs [12,18,21]. More precisely, global measures consider the whole TFD (often vectorized) and fail to account for the local structure of auto-terms, often treating them the same as cross-terms. This limitation of the global approach has been addressed in [18,19] by means of the localized Rényi entropy (LRE) method [18,19,25,26]. In this method, auto-term reconstruction is evaluated by estimating the local number of components (i.e., auto-terms) before reconstruction and comparing it with estimates from the reconstructed TFD. Moreover, coupling the LRE-based measure with the global energy concentration measure [27] has led to the formulation of objective functions for multi-objective meta-heuristic optimization algorithms, such as particle swarm optimization (PSO) and the genetic algorithm (GA) [12,16,18,19]. This has offered a solution for the numerical assessment of reconstruction quality and the automatic optimization of sparse TFD reconstruction algorithms.
Despite these advancements, LRE and meta-heuristic methods have limitations. Firstly, the LRE may be inaccurate for signals with significantly different component amplitudes, intersection points, or deviations from the reference component, potentially leading to the absence of auto-terms or to the reconstruction of cross-terms and noise. Secondly, meta-heuristic optimizations are stochastic in nature, do not guarantee finding the global solution, may get stuck in local optima, and often require numerous algorithm executions and evaluations.
To overcome these limitations, this paper investigates the potential of modern deep neural networks (DNNs) for more efficient prediction of the regularization parameter. Although DNNs have not yet been applied to this specific problem, supervised learning methods have shown considerable potential in other signal processing applications. Bilevel learning approaches have been successful in obtaining optimal regularizers across various problems [28,29,30], as they aim to achieve optimal performance on average with respect to the training set. However, since this paper is not focused on a specific application, such approaches are less suitable because a single regularization parameter may not be adequate for diverse signals. Another class of supervised learning methods exploits DNNs such as convolutional neural networks (CNNs) and residual neural networks (ResNets) [31,32], which were initially used for post-processing improvements or image classification tasks [33]. Additionally, deep learning methods have been employed to learn entire regularization functions [34,35,36] or to train a CNN to map observations to regularization parameters [37]. One recent approach in [8] uses machine learning to predict the true quality metric from approximated metrics obtained from images reconstructed using pre-selected regularization parameters. While this approach is suitable when limited experimental data are available, pre-selecting a regularization parameter may be inaccurate when the research is not application-specific. The development of deep learning techniques has also led to several applications in TF signal analysis [38,39,40]. For instance, Jiang et al. [38] utilized a data-driven U-Net framework to reconstruct TF representations. An et al. [41] utilized DL for the reconstruction of structural vibration responses. Miao et al. [42] utilized sparse TF representations and DL for the classification of underwater acoustic signals. The works in [43,44], which use sparse TF analysis of seismic data, indicated the need for DL to improve the efficiency of determining regularization parameters.
To position this work within the current research landscape, this paper presents a novel application of DNNs in CS-based sparse TFD reconstruction. Unlike previous works, which focus on specific applications, the proposed DNN is trained with a comprehensive set of synthetic signals composed of linear frequency-modulated (LFM) and quadratic FM (QFM) components, exhibiting randomly generated positions, amplitudes, and noise contamination. The choice of LFM and QFM components is motivated by studies indicating that most real-world signals exhibit such modulations [1,45,46]. This allows the regularization parameter to be predicted for a wide range of signals, especially those with crossing components of different amplitudes, which pose challenges for existing meta-heuristic optimization approaches.
The main contribution of this paper is the development of a DNN-based technique with the following key properties:
The developed DNN predicts regularization parameters for a wide range of signals, both synthetic and real world, exhibiting LFM and QFM components.
For a new signal, the DNN-based regularization parameters can be obtained efficiently in an online phase, requiring only forward propagation through the DNN. This is significantly faster than the existing optimization approach, which may require multiple algorithm executions.
Numerical results show that in synthetic and real-world signal examples, DNN-computed regularization parameters lead to reconstructed TFDs with enhanced auto-term resolution and fewer cross-terms and noise samples compared to the existing optimization approach.
Training the DNN with noise improved the performance of the reconstructed TFDs in noisy conditions.
Note that the proposed DNN requires only the Wigner–Ville distribution (WVD) as input, eliminating the need for meta-heuristic optimization for each signal and addressing the limitations associated with LRE-based and concentration measures as objective functions in optimization. Once trained, the DNN can efficiently obtain regularization parameters through forward propagation.
It is acknowledged that providing the same regularization parameter to different reconstruction algorithms can yield varying quality of reconstructed TFDs [11,12]. This variation arises from differences between the algorithms, which may also include additional parameters that influence the results. This study focuses on the augmented Lagrangian algorithm for ℓ1-minimization (YALL1) [47] due to its slow execution time, which becomes even more pronounced in meta-heuristic optimizations, and its tendency to produce TFDs with discontinuous auto-terms, leading to more cross-terms when optimized using the existing LRE-based approach. Although this research primarily addresses the YALL1 algorithm, it also lays the groundwork for designing DNNs for other reconstruction algorithms and CS-based approaches.
Therefore, this paper includes a comprehensive comparison between several DNNs with different architectures and complexities. The selected DNN model is then used for a performance comparison of the TFDs reconstructed using the YALL1 algorithm with the proposed DNN-based regularization parameter versus those obtained using the LRE-based approach described in [12,18]. The evaluation covers both synthetic and real-world gravitational and electroencephalogram (EEG) seizure signals embedded in additive white Gaussian noise (AWGN).
The subsequent sections of this paper are organized as follows. Section 2 provides an overview of sparse TFD reconstruction and existing LRE-based measures, while Section 3 introduces the proposed methodology based on the DNN. Section 4 details the experimental simulation results, followed by the conclusion in Section 5.
2. Sparse Time-Frequency Distributions
Consider a non-stationary signal, denoted as $z(t)$, which represents the analytic form of a real signal $s(t)$, mathematically expressed as [1]:

$$z(t) = \sum_{m=1}^{M} z_m(t) = \sum_{m=1}^{M} A_m(t)\, e^{j\phi_m(t)},$$

where $M$ is the number of signal components, $z_m(t)$ is the $m$-th signal component, while $A_m(t)$ and $\phi_m(t)$ denote the $m$-th signal component’s instantaneous amplitude and phase, respectively.
The WVD serves as a fundamental TFD and is defined as [1]:

$$W_z(t,f) = \int_{-\infty}^{\infty} z\!\left(t+\frac{\tau}{2}\right) z^{*}\!\left(t-\frac{\tau}{2}\right) e^{-j 2\pi f \tau}\, d\tau,$$

where $z^{*}$ represents the complex conjugate of $z$, and $f$ is the frequency variable.
Although the WVD provides precise instantaneous frequency estimates for signals with a single LFM component, it suffers from several limitations. These include the presence of negative values, which complicate energy density interpretation, and the occurrence of interference, referred to as cross-terms. Cross-terms, which emerge as bilinear byproducts in multi-component signals, are positioned between any two components in the TF plane [1]. Interference can also be caused by noise, which varies depending on its source and type. This study specifically focuses on white noise, which is uniformly distributed across the TF plane. Additionally, noise usually appears as a background interference with no specific structure. Interference suppression is typically handled in the AF, whose variables are the Doppler and lag coordinates and which is given as the 2D Fourier transform of the WVD [1].
This leads to the definition of QTFDs, obtained by multiplying the AF with a low-pass filter kernel and transforming the result back to the TF domain. The operation in (5) performs a multiplication of the original AF with the filter kernel, which can have various shapes tailored to specific signal types. The rationale behind the low-pass filtering in (5) is the typical positioning of cross-terms in the AF: they are generally located away from the AF origin due to their highly oscillatory behavior in the TF plane, whereas auto-terms are positioned along trajectories that pass through the AF origin [1,46]. While traditional cross-term suppression through filtering, as seen in (5), helps eliminate cross-terms, it also removes some auto-term samples, often resulting in a reduction in auto-term resolution, which is crucial for accurate signal representation [1].
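To make the preceding definitions concrete, the following minimal NumPy sketch (an illustration, not the implementation used in this paper) computes a discrete WVD from the instantaneous autocorrelation and obtains an AF-like representation via a 2D FFT; scaling factors and frequency-axis conventions are simplified.

```python
import numpy as np

def wvd(z):
    """Discrete Wigner-Ville distribution of an analytic signal z of length N.

    Returns an N x N real matrix: rows are time instants, columns frequency bins.
    """
    N = len(z)
    K = np.zeros((N, N), dtype=complex)          # instantaneous autocorrelation
    for n in range(N):
        max_m = min(n, N - 1 - n)                # symmetric lag support around n
        for m in range(-max_m, max_m + 1):
            K[n, m % N] = z[n + m] * np.conj(z[n - m])
    return np.real(np.fft.fft(K, axis=1))        # FFT over the lag variable

def ambiguity_function(W):
    """AF-like representation as a 2D FFT of the WVD (conventions simplified)."""
    return np.fft.fft2(W)

# Example: a two-component signal; its WVD exhibits cross-terms between the auto-terms.
N = 128
t = np.arange(N)
z = np.exp(1j * 2 * np.pi * 0.1 * t) + np.exp(1j * 2 * np.pi * (0.2 * t + 0.001 * t**2))
W = wvd(z)
A = ambiguity_function(W)
```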
To overcome this inherent trade-off, sparsity constraints are introduced using CS, resulting in a sparse TFD [7,11]. The CS-based approach in this study compressively samples the AF representation of the signal, focusing on a small subset of samples, namely the CS-AF area, while the remainder of the AF is estimated during reconstruction in order to obtain a high-performing sparse TFD. Among QTFDs, the WVD is chosen as the starting point due to its superior auto-term resolution, computational efficiency (requiring no additional tuning parameters), and an AF that includes all signal samples crucial for defining the CS-AF area [11,13,15,16]. The CS-AF area is usually centered around the AF origin and carefully selected to encompass only signal auto-terms. Ensuring that this area is correctly defined is crucial; otherwise, interference artifacts may reappear in the reconstructed TFD [14,15]. In this research, an adaptive rectangular CS-AF area is employed, whose size is defined by the numbers of lag and Doppler bins it spans [11,14]. This area adaptively positions its boundaries near the first set of cross-terms, maximizing the inclusion of auto-term samples while minimizing computational requirements.
Given the smaller cardinality of the CS-AF area compared to the full TFD grid, defined by the numbers of time instants and frequency bins, the system is under-determined, allowing for multiple solutions of the sparse TFD. The reconstruction algorithm seeks the optimal solution as described in [11,12].
In this formulation, the relationship between the measured CS-AF samples and the sought TFD is expressed through the Hermitian transpose of a domain transformation matrix, analogous to the 2D Fourier transform in (3). To steer the solution toward the desired attributes, a regularization function is introduced, leading to an unconstrained optimization problem in which the regularization function emphasizes the desired structure of the solution while the regularization parameter controls its influence [14]. To promote sparsity, the ℓ1 norm is commonly used as the regularization function [2,7,11,13,14], which leads to the ℓ1-regularized unconstrained problem in (8) [14]. Its iterative solution is governed by a user-defined solution tolerance, also referred to as the energy threshold, which controls the reconstruction accuracy by comparing the reconstructed TFDs from the current and previous iterations. The closed-form solution of (8) can be expressed through a soft-threshold function [14]:

$$\mathrm{soft}(x, \lambda) = \mathrm{sgn}(x)\, \max\left(|x| - \lambda,\, 0\right),$$

where $\lambda$ denotes the regularization parameter. The primary purpose of this parameter is to suppress samples in the filtered TFD associated with interference while enhancing the prominence of the auto-terms. Note that this approach applies a single, uniform parameter consistently across the entire TFD.
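As a concrete illustration of where the regularization parameter acts during reconstruction, the following minimal NumPy sketch implements the element-wise soft-threshold operation defined above (the variable name `lam` stands for the regularization parameter and the sample values are arbitrary):

```python
import numpy as np

def soft_threshold(x, lam):
    """Element-wise soft-thresholding: shrink magnitudes by lam and zero out small values."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

# A larger lam suppresses more interference-related samples but risks removing weak
# auto-term samples (oversparse TFD); a smaller lam keeps more auto-term energy but
# may retain cross-terms and noise.
tfd_samples = np.array([0.9, 0.15, -0.05, 0.4])
print(soft_threshold(tfd_samples, 0.1))  # -> [0.8, 0.05, -0., 0.3]
```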
Due to the popularity and effectiveness of ℓ1-based minimization, a number of algorithms have emerged for solving (8) [7,11]. Considering its wide applicability and effectiveness in various optimization problems, this paper focuses on the YALL1 algorithm, whose k-th iteration updates the TFD estimate using a penalty parameter and auxiliary variables recomputed at each step [11,47]. Since the CS-AF filtering tends to reduce the resolution of auto-terms, the primary goal of the YALL1 reconstruction algorithm is to iteratively enhance this resolution while maintaining the suppression of interference components. The final reconstructed TFD is obtained when the solution tolerance is met or the maximum number of iterations has been reached [11,14].
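For illustration, the sketch below shows how the regularization parameter, the solution tolerance, and the maximum number of iterations interact in an iterative ℓ1-based reconstruction. It is a simplified iterative-shrinkage loop with a projection-style data-consistency step, not the actual YALL1 algorithm (which is an augmented Lagrangian method with additional dual variables); the function name and default values are assumptions made for this sketch.

```python
import numpy as np

def reconstruct_tfd(af_cs, lam, eps=1e-3, k_max=100):
    """Illustrative ISTA-like l1 reconstruction of a TFD from a masked AF.

    af_cs : 2D array holding the AF samples retained inside the CS-AF area (zeros elsewhere).
    lam   : regularization parameter controlling the sparsity of the solution.
    """
    mask = af_cs != 0                              # CS-AF support
    tfd = np.real(np.fft.ifft2(af_cs))             # initial (low-resolution) estimate
    for k in range(k_max):
        # data-consistency step: re-impose the measured AF samples
        af_est = np.fft.fft2(tfd)
        af_est[mask] = af_cs[mask]
        updated = np.real(np.fft.ifft2(af_est))
        # sparsity step: soft-threshold the TF samples
        updated = np.sign(updated) * np.maximum(np.abs(updated) - lam, 0.0)
        # stop when the relative change drops below the solution tolerance
        if np.linalg.norm(updated - tfd) / (np.linalg.norm(tfd) + 1e-12) < eps:
            tfd = updated
            break
        tfd = updated
    return tfd
```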
2.1. Measuring Sparse TFDs: Existing Approaches
The optimal regularization parameter in sparse TFD reconstruction has usually been determined by a quantitative assessment of reconstructed TFD quality. Traditionally, global concentration and entropy measures, which produce a single value representing the entire TFD, have been used since they are computationally efficient [1]. One representative is the energy concentration measure, computed from the magnitudes of all TFD samples raised to a fractional power, with the recommended exponent given in [1,27]; a lower value of this measure indicates better TFD performance. Given that TFDs can be interpreted as two-dimensional probability density functions, the generalized Rényi entropy, denoted as $R_{\alpha}$, is also used. It is defined as follows [48,49]:

$$R_{\alpha} = \frac{1}{1-\alpha} \log_2 \sum_{t} \sum_{f} \rho_z^{\alpha}(t,f),$$

where $\alpha$ is typically an odd integer to mitigate cross-term effects. Lower $R_{\alpha}$ values also indicate better TFD quality. Note that in (14) and (15), QTFDs, defined by (4), are normalized with respect to their total energy [26,48].
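The two global measures can be sketched as follows, assuming an ℓ_{1/p}-type concentration functional and the Rényi entropy of an energy-normalized TFD; the exponent `p` and the entropy order `alpha` are set to illustrative values, with the recommended choices discussed in [1,27,48,49].

```python
import numpy as np

def concentration_measure(tfd, p=2):
    """Global energy concentration measure: lower values indicate a more concentrated TFD."""
    return np.sum(np.abs(tfd) ** (1.0 / p)) ** p

def renyi_entropy(tfd, alpha=3):
    """Generalized Renyi entropy of an energy-normalized TFD (alpha typically an odd integer)."""
    rho = np.abs(tfd) / np.sum(np.abs(tfd))   # normalization with respect to total energy
    return np.log2(np.sum(rho ** alpha)) / (1.0 - alpha)
```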
However, these global measures can be misleading for sparse TFDs because they may favor TFDs with fewer samples, falsely suggesting better performance by omitting important signal components [12,18,19,50]. To address this, the LRE method is used to recognize auto-terms, providing information about the local number of signal components. This involves comparing the Rényi entropies of the signal and of reference components within each time and frequency slice, giving the short-term Rényi entropy (STRE) [26] and the narrow-band Rényi entropy (NBRE) [18,19], where localization is performed through time or frequency slices, respectively, relative to a reference QTFD. In these localized measures, windowing operators set to zero all TFD samples except those near the observed time or frequency slice [18,26], with user-defined parameters controlling the window lengths [18,26]. Note that the reference signal chosen in the NBRE is perfectly time-localized and covers all frequencies, in contrast to the STRE reference signal [18,19,26].
The authors in [18] combined mean squared errors (MSEs) between the original and reconstructed TFDs, computed by means of the LRE method [18], and formulated a state-of-the-art measure that can indicate whether the TFD is overly sparse and lacks consistent auto-terms. The optimal regularization parameter is then determined via a multi-objective optimization problem in which the LRE-based MSE and the global concentration measure are minimized jointly [18]. However, in multi-objective optimization, improving one objective often degrades another [51,52,53,54]. This also holds for this use case: the best value of one objective is artificially associated with the worst value of the other. Thus, the multi-objective algorithm constructs all feasible solutions in the Pareto front [51,52,53].
Meta-heuristic optimization algorithms are effective for non-linear optimization problems, handling non-differentiable, non-continuous, and non-convex objective functions. As in [18], the multi-objective particle swarm optimization method (MOPSO) [51,52,53] is utilized, which updates particle velocities and positions based on individual and global best positions [51]:

$$v_i(k+1) = w\, v_i(k) + c_1 r_1 \left(p_i - x_i(k)\right) + c_2 r_2 \left(g - x_i(k)\right), \qquad x_i(k+1) = x_i(k) + v_i(k+1),$$

where $w$ is the inertia coefficient, $r_1$ and $r_2$ are random numbers in $[0,1]$, $c_1$ and $c_2$ are the cognitive and social coefficients, and $p_i$ and $g$ denote the individual and global best positions, respectively [51,52]. MOPSO incorporates Pareto dominance to identify optimal solutions, which are stored in a repository. Furthermore, MOPSO applies an additional mutation operator in each iteration, controlled by a mutation rate [51,52]. The final solution is selected using the fuzzy satisfying method [18]. This approach represents the current state-of-the-art technique for automatically optimizing the regularization parameter, applied in various algorithms within CS-based sparse TFD reconstruction [16,18,21].
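A single particle update of this kind can be sketched as follows; the inertia and acceleration values, the parameter bounds, and the single-objective form are illustrative assumptions (MOPSO additionally maintains a Pareto repository and applies the mutation operator).

```python
import numpy as np

def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, bounds=(1e-4, 1.0)):
    """One velocity/position update for particles encoding candidate regularization parameters."""
    r1 = np.random.rand(*x.shape)
    r2 = np.random.rand(*x.shape)
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # cognitive + social pull
    x = np.clip(x + v, *bounds)                                  # keep candidates in range
    return x, v
```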
3. The Proposed Deep Neural Network-Based Approach for Determining the Regularization Parameter
In this study, a novel approach is proposed for determining the regularization parameter by training a neural network to learn a mapping from observations to regularization parameters, i.e., a function that maps an input vector of observations to a vector of regularization parameters. The goal is to estimate this observation-to-regularization mapping by approximating it with a neural network and learning its parameters. DNNs are used for their universal approximation capabilities [37].

In this work, the DNN takes as input the WVD of the signal, represented in matrix form. Note that the WVD should be presented in its original form, without any post-processing. This choice offers several advantages. Firstly, the WVD can be computed efficiently for any signal and involves no tuning parameters, making it a straightforward input. Secondly, the WVD clearly delineates the spatial positions and amplitude discrepancies of auto-terms and cross-terms in the TF plane. Thirdly, using the WVD aligns well with the CS-based reconstruction method used in this work, which originates from its AF. A simplified block diagram of the proposed idea is illustrated in Figure 2.
The output of the DNN is the predicted regularization parameter for the given WVD. For training the DNN, the optimal regularization parameter value is determined for each provided WVD by minimizing the norm of the difference between the ideal TFD and the TFD reconstructed using the YALL1 algorithm, as expressed in (26).
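In practice, such a training target can be obtained, for example, by sweeping candidate regularization parameters and keeping the one whose reconstruction is closest to the ideal TFD. The sketch below illustrates this with a simple grid search; the candidate grid and the `reconstruct_tfd` routine (from the earlier illustrative sketch) are assumptions rather than the exact procedure of the paper.

```python
import numpy as np

def optimal_lambda(af_cs, ideal_tfd, candidates):
    """Return the candidate regularization parameter minimizing the distance
    between the reconstructed and ideal TFDs (cf. the criterion in (26))."""
    errors = [np.linalg.norm(reconstruct_tfd(af_cs, lam) - ideal_tfd) for lam in candidates]
    return candidates[int(np.argmin(errors))]

# Example usage with an illustrative grid of candidates:
# lam_opt = optimal_lambda(af_cs, ideal_tfd, candidates=np.linspace(0.01, 1.0, 50))
```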
The training data comprise 10,000 synthetic signal examples, both single- and multi-component, expressed as a summation of M finite-duration signals, where each component is non-zero only between its starting and ending times, as defined by a rectangular window of the corresponding duration. Considering the common presence of LFM or QFM behavior in real-world signals [1,46,55], the m-th component embodies either an LFM or a QFM behavior, parameterized by its starting normalized frequency, amplitude, and frequency modulation rates. These parameters were randomly generated to encapsulate diverse variations of signal components across the entirety of the TFD, including instances of multiple intersections with varying amplitudes.
To account for real-world conditions, randomly selected signals were embedded in AWGN with a signal-to-noise ratio (SNR) as low as 0 dB, covering both noise-free and noisy scenarios. Additionally, the training signals were normalized relative to their maximum amplitude, necessitating the same normalization when computing reconstructed TFDs for new signals using the regularization parameter predicted by the proposed DNN. As an example, Figure 3 displays the WVDs, ideal TFDs, and optimally reconstructed TFDs obtained using the YALL1 with the regularization parameter optimized by (26) for two synthetic signals from the training set. Note that the WVDs serve as inputs to the DNN, while the corresponding optimal regularization parameters are stored in the output vector.
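A sketch of how such training signals could be generated is given below; the parameter ranges, component counts, and noise levels are illustrative assumptions rather than the exact values used to build the training set.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_component(N):
    """One finite-duration LFM or QFM component with random parameters (illustrative ranges)."""
    t = np.arange(N)
    t0, t1 = sorted(rng.integers(0, N, size=2))             # random start/end times
    f0 = rng.uniform(0.05, 0.45)                             # starting normalized frequency
    a = rng.uniform(0.3, 1.0)                                # amplitude
    alpha = rng.uniform(-0.3, 0.3) / N                       # linear FM rate
    beta = rng.uniform(-0.3, 0.3) / N**2 if rng.random() < 0.5 else 0.0  # quadratic FM rate
    phase = 2 * np.pi * (f0 * t + alpha * t**2 / 2 + beta * t**3 / 3)
    rect = ((t >= t0) & (t <= t1)).astype(float)             # finite duration
    return a * np.exp(1j * phase) * rect

def training_signal(N=128, snr_db=None):
    """Sum of one to three random components, optionally embedded in AWGN,
    normalized to its maximum amplitude as done for the training set."""
    z = sum(random_component(N) for _ in range(rng.integers(1, 4)))
    if snr_db is not None:
        p_sig = np.mean(np.abs(z) ** 2)
        noise = np.sqrt(p_sig / (2 * 10 ** (snr_db / 10))) * (
            rng.standard_normal(N) + 1j * rng.standard_normal(N))
        z = z + noise
    return z / np.max(np.abs(z))
```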
3.1. DNN Architectures
In this study, four DNN architectures are considered: a fully connected neural network (FCNN), ResNet, DenseNet, and a CNN with attention mechanisms. The FCNN uses a flattened, i.e., vectorized, WVD as input, which simplifies the data representation and reduces computational complexity. However, this flattening leads to a loss of the spatial relationships of signal auto-terms and cross-terms within the TF plane, potentially hindering the model’s ability to capture patterns that are useful for accurate predictions. ResNets, with their skip connections, offer a solution by preserving spatial hierarchies and preventing the vanishing-gradient problem in deep networks, making them highly effective at maintaining performance as the network depth increases. DenseNets build on this further by ensuring that each layer receives direct input from all preceding layers, promoting feature reuse and reducing the risk of overfitting. This architecture is particularly advantageous for tasks where capturing fine-grained spatial details is crucial. CNNs with attention mechanisms, such as Squeeze-and-Excitation (SE) blocks, enhance the model’s ability to focus on the most relevant regions of the input matrix, dynamically adjusting to emphasize the most informative features. This makes attention-based CNNs particularly powerful for tasks requiring precise localization and analysis of patterns within large matrices. Overall, while FCNNs offer simplicity and computational efficiency, ResNets, DenseNets, and attention-based CNNs provide more sophisticated approaches that are better suited to capturing the complex spatial dependencies inherent in the TF plane, making them highly advantageous for tasks involving large matrix inputs.
Table 1 outlines the architectures of each DNN used in this study.
The output layer (i.e., the predicted regularization parameter) comprises a single neuron with the ReLU activation function for regression. To estimate the network parameters, the ADAM optimizer with a dynamic learning rate is used together with the mean squared error loss function. The DNN models are trained on the input–output pairs for a maximum of 1000 epochs with a batch size of 32 and a validation split of 20%. To mitigate the risk of overfitting and ensure that the proposed models generalize well to unseen data, an early stopping criterion is employed during training. Specifically, the early stopping mechanism monitors the validation loss and stops the training process if there is no improvement in the validation loss for 10 epochs. Additionally, it is ensured that the model reverts to the state in which it achieved the best validation performance, rather than continuing to train and potentially overfitting to the training data.
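A minimal Keras sketch of this training setup is given below. The small CNN is only a placeholder for the architectures of Table 1, and the dynamic learning rate is illustrated with a reduce-on-plateau schedule as an assumption; the early stopping settings follow the description above.

```python
import tensorflow as tf

# X: (J, N, N, 1) array of WVDs, y: (J,) array of optimal regularization parameters.
def build_model(n=128):
    inp = tf.keras.Input(shape=(n, n, 1))
    x = tf.keras.layers.Conv2D(16, 3, activation='relu', padding='same')(inp)
    x = tf.keras.layers.MaxPooling2D(4)(x)
    x = tf.keras.layers.Conv2D(32, 3, activation='relu', padding='same')(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    out = tf.keras.layers.Dense(1, activation='relu')(x)   # single ReLU neuron for regression
    model = tf.keras.Model(inp, out)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss='mse')
    return model

callbacks = [
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10,
                                     restore_best_weights=True),          # early stopping
    tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                                         patience=5),                     # dynamic learning rate
]
# model = build_model()
# model.fit(X, y, epochs=1000, batch_size=32, validation_split=0.2, callbacks=callbacks)
```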
3.2. Summary of the Proposed DNN-Based Approach
A general overview of the proposed approach is outlined in Algorithm 1. In the offline phase, the training data are used to learn the DNN parameters. This involves computing multiple WVDs and reconstructed TFDs to find the optimal regularization parameter for each synthetic signal. Once the DNN is trained, however, forward propagation of a new signal through the network requires only the signal's WVD. The regularization parameter obtained with the DNN is finally used in the YALL1 algorithm to compute the reconstructed TFD.
Algorithm 1 Reconstructed TFDs obtained via DNN.
Offline phase: generate J training signals, followed by their WVDs and ideal TFDs; for each, obtain the optimal regularization parameter using (26); use the training data to compute the DNN's optimal parameters.
Online phase: compute the WVD of the considered signal; propagate it through the learned network to obtain the regularization parameter; compute the reconstructed TFD using YALL1.
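The online phase can then be sketched as follows, reusing the illustrative `wvd`, `ambiguity_function`, and `reconstruct_tfd` routines from the earlier sketches; the `cs_af_area` helper standing in for the adaptive rectangular CS-AF selection is hypothetical and not reproduced here.

```python
import numpy as np

def online_phase(z, model):
    """Online use of the trained DNN: only the WVD and one forward pass are required."""
    z = z / np.max(np.abs(z))                     # same normalization as in training
    W = wvd(z)                                    # input to the DNN
    lam = float(model.predict(W[np.newaxis, ..., np.newaxis], verbose=0)[0, 0])
    A = ambiguity_function(W)
    mask = cs_af_area(A)                          # adaptive rectangular CS-AF mask (not shown)
    return reconstruct_tfd(A * mask, lam)         # reconstruction with the predicted parameter
```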
4. Experimental Results and Discussion
4.1. Experiment Setup
The performance of the proposed DNN-based approach was assessed on three synthetic signals. Namely, the first is composed of three LFM components with different amplitudes; the second of two crossing LFM components embedded in AWGN; and the third of one LFM and one sinusoidal component embedded in AWGN. Furthermore, the experiment was extended with a real-world gravitational-wave signal (this research has made use of data, software, and/or web tools obtained from the LIGO Open Science Center (https://losc.ligo.org), a service of LIGO Laboratory and the LIGO Scientific Collaboration; LIGO is funded by the U.S. National Science Foundation) [12,16,56,57] and a real-world EEG seizure signal [12,50,58,59,60,61,62,63]. The gravitational-wave signal, consisting of 3441 samples, was downsampled by a factor of 14, corresponding to a duration of 0.25 to 0.45 s. For the EEG seizure signal, a differentiator filter was applied to enhance its spike signatures, as in [60,61,64,65].
Figure 4 illustrates the WVDs of the analyzed signals, highlighting the presence of cross-terms and noise contamination.
The CS-AF areas for the five considered signals were determined adaptively for each signal, as described in Section 2. For the MOPSO, the parameters were configured as follows: a maximum of 50 iterations, a population size of 25, and 25 particles in the repository, with the remaining coefficients set as in [18]. The reconstruction algorithm parameters (solution tolerance and maximum number of iterations) were set following the recommendations in [11,12,14,16,18,19], and the LRE computation used the window parameters recommended in [12,21,26,66]. The simulations of the algorithms' execution times, averaged over 1000 independent runs, were performed on a PC with a Ryzen 7 3700X processor (3.60 GHz base clock) and 32 GB of DDR4 RAM.
The reconstruction performance has been evaluated using machine learning evaluation metrics [67]. The use of these metrics has proven effective in image and signal processing, as demonstrated in previous studies [55,68,69,70]. To apply these metrics, the reconstructed and ideal TFDs were converted into a binary classification framework. Here, the positive class (P) denotes the presence of signal auto-term samples in the TFD, whereas the negative class (N) indicates the absence of signal. The ideal TFD contains true samples that should precisely match their positions and classifications in the reconstructed TFD. Consequently, the following metrics were employed: true positives (TPs), samples correctly identified as the positive class; true negatives (TNs), samples correctly identified as the negative class; false positives (FPs), samples incorrectly categorized in the positive class; and false negatives (FNs), samples incorrectly categorized in the negative class [67]. A high FP value suggests the inclusion of noise- and/or cross-term-related samples in the reconstruction, while a high FN value indicates missing true signal samples in the reconstructed TFD.
These metrics are visually summarized in a confusion matrix [67,71]. Based on the confusion matrix and its components, several statistical measures are calculated [67]:

$$\mathrm{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad \mathrm{precision} = \frac{TP}{TP + FP}, \quad \mathrm{recall} = \frac{TP}{TP + FN}, \quad F_1 = \frac{2\cdot \mathrm{precision}\cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}.$$

All these statistical metrics range from 0 to 1, with higher values indicating better performance.
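For reference, these metrics can be computed from a pair of TFDs cast into the binary framework described above; the sketch below treats any nonzero sample as belonging to the positive class, which is an assumption about the binarization step.

```python
import numpy as np

def tfd_classification_metrics(ideal_tfd, rec_tfd):
    """Confusion-matrix metrics for a reconstructed TFD against the ideal TFD."""
    p_true = ideal_tfd != 0                 # positive class: signal auto-term samples
    p_pred = rec_tfd != 0
    tp = np.sum(p_true & p_pred)
    tn = np.sum(~p_true & ~p_pred)
    fp = np.sum(~p_true & p_pred)
    fn = np.sum(p_true & ~p_pred)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return dict(accuracy=accuracy, precision=precision, recall=recall, f1=f1)
```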
4.2. Performance of DNN Architectures
Considering the MAE metric provided in Table 2, the CNN with attention achieves the lowest validation MAE of 0.0956, reflecting its superior ability to generalize to unseen data. In comparison, DenseNet and ResNet follow with validation MAEs of 0.1026 and 0.1011, respectively, demonstrating competitive generalization capabilities. The FCNN, with the highest validation MAE of 0.1318, underperforms due to the loss of spatial information resulting from flattening the input WVD. Furthermore, the FCNN reached overfitting more quickly, completing training in 90 epochs, while ResNet, DenseNet, and the CNN with attention required 105, 155, and 170 epochs, respectively. Training times for the FCNN, ResNet, DenseNet, and CNN with attention are 2.85, 3.53, 4.56, and 26.73 h, respectively. For tasks where prediction accuracy is critical and resources are available, the CNN with attention is the best choice due to its lowest validation MAE; otherwise, ResNet and DenseNet provide a better balance between architecture complexity and performance. Consequently, ResNet (referred to in the following as the proposed DNN) is selected in this study for further analysis due to its lower training time and competitive training and validation MAEs.
4.3. Comparison with the Current Meta-Heuristic Optimization Approach
The performance of the DNN in predicting regularization parameters, compared to those obtained using the existing MOPSO approach, is visualized in Figure 5. Specifically, the scatter plot for the DNN (see Figure 5b) indicates a better correlation between the optimal and predicted regularization parameters compared to the MOPSO predictions (see Figure 5a). This improvement is further illustrated in the density plots of the discrepancy between the optimal and predicted values. The density plot for the DNN, shown in Figure 5d, reveals that most discrepancies between the optimal and DNN-predicted regularization parameters are small. In contrast, the density plot for the MOPSO, shown in Figure 5c, exhibits roughly equal probabilities for discrepancies over a wide range, with only a slight decrease for larger discrepancies. This suggests that MOPSO optimization using the LRE is less reliable, potentially becoming trapped in local optima or converging to a suboptimal TFD reconstruction due to LRE limitations.
Figure 6 and Figure 7 present the reconstructed TFDs obtained using the YALL1 with the DNN-based versus the LRE-based MOPSO regularization parameter. Across all considered synthetic and real-world signal examples, the DNN-based reconstructed TFDs demonstrate performance competitive with those obtained via MOPSO. Notably, the DNN-based reconstructed TFDs maintain high-resolution auto-terms while effectively suppressing cross-terms and noise, visually outperforming the MOPSO-based reconstructed TFDs. Given that auto-terms reconstructed by the YALL1 often appear discontinuous, the LRE-based MOPSO approach [18] tends to mitigate this discontinuity, but at a cost. To maintain the computed number of components in each time and frequency slice, the MOPSO optimization (i) reduces the resolution of auto-terms and (ii) leaves cross-terms or noise samples to fill in the gaps, as illustrated in Figure 6b,c. Furthermore, when the LRE fails to accurately detect components with low amplitude, the reconstructed TFD does not preserve these components well, as seen in Figure 6a.
Additionally, the auto-terms of the real-world signals are not fully preserved when using MOPSO, as depicted in Figure 7. This occurs for two reasons. First, the LRE does not detect auto-terms with lower amplitude. Second, additional samples, even those related to auto-terms, artificially degrade the global objective measure, consequently favoring oversparse reconstructed TFDs. Visual inspection of the reconstructed TFDs obtained with the DNN-predicted regularization parameter in Figure 7 shows better preservation of auto-terms, indicating that the training data used in this study can be applied to analyzing unseen real-world signals featuring LFM and/or QFM components, such as the considered gravitational-wave and EEG seizure signals.
The numerical performance comparison of the DNN-based and LRE-based MOPSO reconstructed TFDs (shown in Figure 6) against the ideal TFDs is presented in Table 3. Overall, the results validate the visual observations mentioned above. For the first synthetic signal, the DNN-based reconstructed TFD achieves better preservation of the auto-terms with lower amplitude than the MOPSO-based one. This is indicated by higher TP and lower FN indices in the confusion matrix (and consequently higher recall), as more auto-term samples are correctly predicted and fewer samples are falsely set to zero. On the other hand, given a higher regularization parameter, the MOPSO-based reconstructed TFDs exhibit higher TN and lower FP, achieving higher accuracy and precision metrics.
The opposite behavior can be observed for the noisy signals. Reconstruction of cross-term- and noise-related samples using the MOPSO is indicated by a significant increase in FP and a decrease in TN indices compared with reconstruction using the DNN-predicted regularization parameter, which is also evident from lower accuracy and precision metrics. Consequently, in this case, the lower regularization parameter obtained using MOPSO preserves more auto-term-related samples, as indicated by higher TP and lower FN indices and a higher recall metric. For all three synthetic signal examples, the reconstructed TFDs obtained with the DNN-predicted regularization parameter exhibit superior F1 scores.
Some of the reported precision, recall, and F1 scores are relatively low, which can be attributed to several factors. In an ideal TFD that contains only auto-term maxima, positive samples represent a very small proportion of the total, resulting in a highly imbalanced dataset where the majority of samples are zero. This imbalance diminishes the reliability of accuracy as a standalone evaluation metric, highlighting the importance of using precision, recall, and F1 score metrics for a more comprehensive and reliable assessment. Several factors influence the reported metrics. While the YALL1 algorithm significantly enhances auto-term resolution and effectively suppresses interference, the reconstructed auto-term resolution is still less sharp than the ideal, and its trajectory may exhibit discontinuities. These discontinuities, while not affecting the extraction of useful information, can impact classification metrics. Additionally, the reconstruction process and initial interference may introduce slight biases in the positions of auto-term maxima within the reconstructed TFD, leading to deviations from their ideal locations. Another contributing factor is the presence of random TF samples with negligible amplitude in the TFD. If not thresholded, these insignificant samples can artificially lower the evaluation metrics. In this study, the focus has been on evaluating reconstructed TFDs without any post-processing, such as thresholding or over/under-sampling of the data. Future research will explore suitable post-processing methods and metrics to improve the performance and robustness of these classification metrics for this use case.
4.4. Computational Complexity
The FCNN, ResNet, DenseNet, and CNN with attention consist of 67,544,065, 4,908,673, 206,261, and 34,876,001 parameters, of which 67,540,481, 4,902,913, 201,853, and 34,872,545 are trainable, respectively. Calculating the full computational complexity of a neural network involves considering the number of operations required in each layer. These complexities are typically expressed in terms of the number of floating-point operations (FLOPs), which can be calculated based on the architecture of each network. The FCNN is the least intensive at approximately 67.8 million FLOPs due to its simpler architecture. In contrast, ResNet and DenseNet offer a balance between complexity and performance, with complexities of around 849.3 million and 471.8 million FLOPs, respectively, making DenseNet slightly more efficient. The CNN with attention has the highest complexity at approximately 1.227 billion FLOPs due to the additional attention mechanisms.
The main motivation for using the DNN is to reduce the execution time for obtaining the regularization parameter in the online phase, which is significantly lower than performing a full meta-heuristic optimization. As reported in Table 3, obtaining the optimal parameter can take up to 7911 s using MOPSO, whereas the trained DNN gives its prediction in approximately 0.06 s. Admittedly, the execution time of the MOPSO can be reduced by decreasing the numbers of particles and iterations; however, this may lead to less accurate parameter values and, consequently, reconstructed TFDs with worse performance. As intended in [12,18,50], meta-heuristic optimization should be used for offline signal analysis, while the proposed DNN approach shows promise even for online analysis.
4.5. Noise Sensitivity Analysis
To evaluate the impact of noise on the prediction of the regularization parameter by the proposed DNN, a comprehensive comparative analysis was conducted using the precision, recall, and F1 score metrics. These metrics were derived from the comparison between the ideal TFDs and the noisy reconstructed TFDs obtained using the DNN- or MOPSO-predicted regularization parameter. The synthetic signals were subjected to AWGN across four SNR levels ranging from 9 dB to 0 dB. The results, based on 1000 independent noise realizations, are summarized in Table 4. The findings underscore the superiority of the DNN-based reconstructions over the MOPSO-based reconstructions for all synthetic signal examples. This superiority is discerned through higher F1 scores and precision metrics, particularly at lower SNR values.
4.6. Limitations of the Proposed Approach
While the proposed DNN approach exhibits strong predictive capabilities for estimating the regularization parameter in sparse time–frequency distributions of non-stationary signals, it does have certain limitations. The model’s effectiveness is sensitive to the type of noise present in the signals. For instance, in scenarios involving specific noise types such as impulsive or colored noise, retraining the network may be necessary to achieve optimal performance, as the current model is primarily designed for signals corrupted by AWGN. Additionally, the model may experience degraded performance when analyzing real-world signals that do not exhibit LFM or QFM components.
Moreover, although this network has been specifically trained using the YALL1 algorithm for sparse reconstruction, its deep learning architecture is sufficiently flexible to allow for straightforward retraining with other reconstruction algorithms. This adaptability enables the model to incorporate additional parameters beyond the regularization parameter, potentially enhancing its utility across a broader range of applications.
5. Conclusions
This study introduces a novel approach utilizing DNNs for predicting the regularization parameter in CS-based reconstruction of TFDs. By training the DNNs on synthetic signals composed of LFM and QFM components, the proposed method aims to eliminate the need for manual parameter selection or complex optimization procedures, which are often time-consuming, require specialist knowledge, and depend on appropriate objective functions.
The results demonstrate the efficacy of the DNN-based approach in automatically determining the regularization parameter, showing competitive performance compared to existing optimization methods in terms of reconstruction quality. Specifically, the DNN-based reconstructions better preserve auto-terms with low amplitude, exhibit high-resolution auto-terms, and effectively suppress cross-terms and noise. This has also been validated by real-world gravitational signal examples for which the proposed DNN was not specifically trained. Additionally, the DNN-based reconstructions are less sensitive to noise.
Two additional advantages of using DNNs over meta-heuristic optimization are emphasized. First, the end-user only needs to provide the signal's WVD as input to the network, circumventing the limitations and parameter tuning of additional methods such as the LRE and MOPSO. Second, obtaining the regularization parameter using a trained DNN is significantly faster than performing meta-heuristic optimization, suggesting that DNNs can be used for online applications.
Overall, this study highlights the potential of DNNs in automating the determination of regularization parameters for TFD reconstruction, paving the way for enhanced signal processing applications in diverse environments. Future research could explore constructing the training data with application-specific signal types and noise conditions, as well as applying DNNs across different reconstruction algorithms and approaches. Furthermore, future work will include the development of adaptive regularization parameters using DNNs.