Now that SpectraMelt is producing accurate compressed samples, the next step is to show how it can be leveraged to improve A2I designs. One way to do this is to produce a new CS recovery method within the simulator that exceeds the current state-of-the-practice iterative CS recovery methods in both speed and recovery accuracy. Machine learning (ML) is a family of signal processing techniques that has exploded in popularity in recent years. The most successful ML technique has been the Artificial Neural Network (ANN), originally developed by psychologist Frank Rosenblatt in 1958 [14]. While ANNs have been applied to virtually every problem space thanks to the rise of Large Language Models (LLMs), only recently have they been applied to A2I. In 2023, a team from the University of Electronic Science and Technology of China used LLMs to try to classify the compressed output of the NYFR against a group of radar signal types [15].
The ANN chosen for implementation within SpectraMelt to decompress NYFR outputs is the Multilayer Perceptron (MLP). An MLP is a simple feedforward network that maps fixed-size inputs to fixed-size outputs, whereas an LLM is built on the transformer architecture with self-attention, enabling it to capture long-range dependencies and contextual relationships in variable-length sequences. The architecture of the MLP implemented here is modeled after a more advanced architecture inspired by sparse Bayesian learning (SBL) called Learned-SBL [16]. The network contains an input layer, one hidden layer, and one output layer, and each layer is fully connected to the previous layer. The input layer has as many inputs as there are samples captured at the simulated output of the NYFR. The hidden and output layers each have the same number of outputs as the uncompressed input signal. A toy example with four nodes per layer is shown in Figure 6.
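The layer sizes described above can be sketched as a small NumPy forward pass. This is only an illustration: the function names `init_mlp` and `forward`, the ReLU hidden activation, the linear output layer, the real-valued data, and the random initialization are all assumptions, not details taken from SpectraMelt or Learned-SBL [16].

```python
import numpy as np

def init_mlp(n_compressed, n_uncompressed, seed=0):
    """Create weights for input -> hidden -> output (all names hypothetical)."""
    rng = np.random.default_rng(seed)
    return {
        # Input layer size matches the number of compressed NYFR samples.
        "W1": rng.normal(scale=0.1, size=(n_compressed, n_uncompressed)),
        "b1": np.zeros(n_uncompressed),
        # Hidden and output layers both match the uncompressed signal length.
        "W2": rng.normal(scale=0.1, size=(n_uncompressed, n_uncompressed)),
        "b2": np.zeros(n_uncompressed),
    }

def forward(params, y):
    """Map a batch of compressed samples (rows of y) to uncompressed-size outputs."""
    hidden = np.maximum(0.0, y @ params["W1"] + params["b1"])  # ReLU (assumed)
    return hidden @ params["W2"] + params["b2"]                # linear output (assumed)

# Toy case mirroring the four-nodes-per-layer example.
params = init_mlp(n_compressed=4, n_uncompressed=4)
batch = np.ones((2, 4))
out = forward(params, batch)
print(out.shape)
```

In a compressive setting the first argument of `init_mlp` would be smaller than the second, since the network expands the compressed samples back to the uncompressed length.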
Another consideration is the type of loss function used to train the network during back-propagation. Initially, a mean absolute error (MAE) loss function was selected, as it calculates the average of the absolute differences between predictions and true values. This seemed in line with the traditional CS principle from Equation (2), where the recovery problem is relaxed to a convex $\ell_1$-minimization problem by setting $p = 1$. In practice, this spread the training error over every output node, resulting in no signal recovery, only output noise. A custom root-mean-squared error (RMSE) function
given by
was created for the network to obtain the desired results.
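To make the difference between the two loss types concrete, the sketch below compares the textbook MAE and RMSE on a toy sparse target. It does not reproduce the custom RMSE function used in SpectraMelt; the function names and the example vectors are assumptions chosen for illustration.

```python
import math

def mae(pred, true):
    """Mean absolute error: average of |pred - true|."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)

def rmse(pred, true):
    """Root-mean-squared error: square root of the average squared difference."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

true = [0.0, 0.0, 1.0, 0.0]              # one tone in a four-bin spectrum
missed_tone = [0.0, 0.0, 0.0, 0.0]       # a single large error of 1.0
spread_error = [0.25, 0.25, 0.75, 0.25]  # four small errors of 0.25

print(mae(missed_tone, true), mae(spread_error, true))    # 0.25 0.25
print(rmse(missed_tone, true), rmse(spread_error, true))  # 0.5 0.25
```

In this toy case MAE scores the two predictions identically, while RMSE penalizes the single large miss twice as heavily as the evenly spread error.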
3.1. Dataset Creation
With a working MLP implementation in place, it was time to create datasets from the NYFR digital twin. These datasets would serve three purposes:
Training data for the new MLP;
Signals used for recovery to assess MLP performance versus other CS recovery algorithms;
Data for assessing the impact of varying NYFR LO parameters on data recovery.
The datasets were created as an extension of the simulator settings found in [7], with
= 1 MHz,
= 400 Hz,
= 100 Hz,
= 100 Hz,
= 50 Hz,
2 s, and
= 2 s. This sets the number of positive Nyquist Zones
.
Each input dataset was generated from a list of all the possible integer input frequencies that can exist within the wideband filter range. In this case, the possible frequencies , where , , …, , giving the total number of possible frequencies . Every comprises input signals with tones drawn from and taking n at a time with , resulting in a total of four sets containing 79,800 signals per set. The individual tones within each input signal had random amplitudes between 0.5 and 1 taken from a uniform random variable. The input signals created in these initial datasets are idealized to show the functionality of SpectraMelt and quickly compare implemented recovery algorithms. This means that no noise is added to them and signal tones are expressed as linear measures of magnitudes instead of the typical dB.
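The generation procedure described above can be sketched with the Python standard library. The pool of six frequencies, the function name `make_tone_signals`, and the representation of a signal as a list of (frequency, amplitude) pairs are assumptions for illustration. Whether SpectraMelt enumerates every combination or subsamples them is not stated here, so the sketch simply enumerates all of them.

```python
import itertools
import math
import random

def make_tone_signals(frequencies, n_tones, seed=0):
    """Build one signal per n-tone combination of the candidate frequencies."""
    rng = random.Random(seed)
    signals = []
    for combo in itertools.combinations(frequencies, n_tones):
        # Each tone gets an independent linear amplitude drawn uniformly from [0.5, 1].
        signals.append([(f, rng.uniform(0.5, 1.0)) for f in combo])
    return signals

pool = list(range(1, 7))  # hypothetical pool of integer frequencies (Hz)
two_tone = make_tone_signals(pool, n_tones=2)

# The number of signals equals the binomial coefficient C(len(pool), n_tones).
print(len(two_tone), math.comb(len(pool), 2))  # 15 15
```

No noise is added and amplitudes stay linear, in keeping with the idealized datasets described above.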
Recovery algorithms for the NYFR are directly influenced by how the LO parameters are tuned, since these parameters govern the modulation index and resulting frequency zone definitions. Specifically, the modulation index
M uniquely determines the mapping of signals into frequency zones, but it is only valid under the condition that the clock modulation
remains narrowband. This requires the maximum rate of change in
given by
to be much smaller than
. In practice, this translates to at least two orders of magnitude of separation between the modulation deviation
and the local oscillator frequency
. Furthermore, the system is typically constrained such that
, tightly linking the oscillator design to the sampling process and imposing additional structure on the recovery algorithms.
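The two-orders-of-magnitude rule above lends itself to a simple screening check before any Monte Carlo search. The helper below is a minimal sketch: the name `is_narrowband`, its arguments, and the numeric values are hypothetical, and it covers only the separation between modulation deviation and LO frequency, not the other constraints discussed in this section.

```python
def is_narrowband(mod_deviation_hz, lo_frequency_hz, min_ratio=100.0):
    """Return True when the LO frequency is at least min_ratio times the
    modulation deviation (two orders of magnitude by default)."""
    return lo_frequency_hz >= min_ratio * mod_deviation_hz

# Hypothetical candidate deviations screened against a hypothetical 100 Hz LO.
lo_frequency = 100.0
candidates = [0.5, 1.0, 5.0]
valid = [d for d in candidates if is_narrowband(d, lo_frequency)]
print(valid)  # [0.5, 1.0]
```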
In the current literature, specific LO parameters are selected using Monte Carlo simulations within the constraints given above [11]. This search was performed by first developing a qualitative analysis of LO parameter selection and then using recovery results from a single CS recovery algorithm. Understanding how to select appropriate LO parameters also requires understanding how the LO signal modulation affects the shape of the modulated pulse train
in both the time and frequency domains, using the analysis performed in [7]. The modulation frequency
must be tuned so that the size of the zones created by the NYFR matches some integer multiple of the width of the LO frequency-domain lobes. Matching the lobe widths reduces mutual coherence between zones, increasing recovery accuracy.
Beyond the analysis and constraints to the LO parameters just described, there is currently no prescription on which specific values to select for
and
given
and
, meaning that some Monte Carlo simulations are still required. On top of that, there has been little analysis to determine if these values are specific only to one selected CS recovery algorithm. SpectraMelt helps address these issues by generating
datasets for the NYFR with varying LO parameters of the digital twin. For the system parameters used to create each
, the following list of LO parameters was used to create 16 unique
datasets:
,
,
,
,
. These LO parameters were also used to create CS recovery dictionaries needed for reconstruction based on
Figure 3.
3.2. Compressed Data Recovery
Improvements to
based on comparisons between the actual output of the NYFR Digital Twin
and the approximation given by
were discussed in [7]. However, this approach is flawed. What is really needed are adjustments based on the initial guess
, where
is the pseudoinverse of the NYFR recovery dictionary. Graphical examination of
applied to Digital Twin outputs shows that orientation mismatches are corrected using the pseudoinverse. The magnitude discrepancies are also corrected within iterative CS algorithms by normalizing the output
. To correct for magnitude mismatches within the MLP training data and improve reconstruction speed,
is created by pre-multiplying the measurement matrix with a magnitude correction factor to give
. An example signal is shown in
Figure 7.
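The pseudoinverse-based initial guess mentioned above can be illustrated with NumPy. This is a minimal sketch under stated assumptions: the function name `initial_guess`, the tiny 2x3 complex dictionary, and the test vector are hypothetical, and the empirical magnitude correction factor is not modeled.

```python
import numpy as np

def initial_guess(dictionary, measurements):
    """Apply the Moore-Penrose pseudoinverse of the dictionary to the measurements."""
    return np.linalg.pinv(dictionary) @ measurements

# Hypothetical dictionary with orthonormal rows, so its pseudoinverse is its
# conjugate transpose.
A = np.array([[1, 0, 0],
              [0, 1j, 0]], dtype=complex)
x_true = np.array([2, 3, 5], dtype=complex)
y = A @ x_true  # compressed measurements

x0 = initial_guess(A, y)
print(np.round(x0, 6))
```

The pseudoinverse returns the minimum-norm solution consistent with the measurements, so the third component of `x_true`, which lies in the null space of this toy dictionary, is returned as zero. This is why a recovery stage is still needed after the initial guess.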
In this initial implementation, the magnitude correction factor was selected empirically as a quick, practical means of aligning the pre-multiplication signal with the original input waveform and ensuring that the resulting frequency-domain magnitudes remained within a small percentage of their true values. This ad hoc choice provided inputs of an appropriate scale for effective MLP training, but it is not a principled or generalizable solution. The correct approach is to normalize all input and output signals prior to training so that the network operates on consistent, statistically comparable ranges. Full-signal normalization eliminates the need for manual correction factors and supports reproducibility across datasets, hardware configurations, and noise environments. Implementing a rigorous normalization pipeline is a priority for the next major revision of SpectraMelt, where this temporary workaround will be replaced with a theoretically grounded, systematically optimized preprocessing stage. Normalization will also enable the use of activation functions such as Softmax or Sigmoid, whose outputs are confined to a well-defined numerical range (e.g., [0, 1]) and which therefore require training targets scaled to that range, which should in turn support stable gradients and more reliable convergence during training.
MLPs are trained using supervised learning, where the network weights are optimized over many passes through the dataset to minimize a loss function that measures the mismatch between the network outputs and the desired outputs. In the present work, training begins by splitting the dataset into training and test partitions. The network is then trained for up to a set maximum number of epochs, with each epoch processing the training data in batches of a specified number of training signals. Batch-based optimization significantly reduces memory requirements and typically accelerates convergence compared to processing the entire dataset at once. During each epoch, the training samples are shuffled so that gradient updates are not biased by sample ordering, which also helps reduce overfitting. These design choices favor stable optimization and permit efficient use of compute hardware, allowing MLPs to learn complex nonlinear mappings that iterative sparse-recovery algorithms cannot represent.
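The split, epoch, shuffle, and batch structure described above can be sketched without any ML framework. The helper names `split_dataset` and `iterate_batches`, the 20% test fraction, and the batch size are assumptions for illustration; the actual weight update for each batch is omitted.

```python
import random

def split_dataset(samples, test_fraction=0.2, seed=0):
    """Shuffle once, then split into (train, test) partitions."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def iterate_batches(train, batch_size, max_epochs, seed=0):
    """Yield (epoch, batch) pairs, reshuffling the training data every epoch."""
    rng = random.Random(seed)
    for epoch in range(max_epochs):
        order = train[:]
        rng.shuffle(order)  # a new sample ordering for each epoch
        for start in range(0, len(order), batch_size):
            yield epoch, order[start:start + batch_size]

train, test = split_dataset(list(range(10)))
batches = list(iterate_batches(train, batch_size=3, max_epochs=2))
print(len(train), len(test), len(batches))  # 8 2 6
```

Only one batch is held in memory at a time, which is where the memory saving mentioned above comes from.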
To further control training cost and prevent unnecessary computation, early stopping is employed through a set of configurable parameters. The training process monitors a chosen metric (the monitor parameter, typically “val_loss”) and halts if improvements fall below a minimum threshold delta for a specified number of epochs. In typical training workflows, “val_loss” is the model’s error on a held-out validation set; it estimates how well the network generalizes to unseen data and serves as the primary metric for guiding early stopping. Early-stopping checks are delayed until enough epochs have passed to establish a meaningful performance trend. Together, these mechanisms provide a principled way to limit training duration and computational expenditure, in contrast to iterative methods, which must solve a new optimization problem for each input instance. Once an MLP is trained, inference requires only a single forward pass, making the up-front training cost a one-time expense that yields substantial runtime speed advantages.
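The early-stopping rule described above can be written as a few lines of plain Python. This is a sketch of the general technique rather than SpectraMelt's code: the function name `early_stop_epoch` and the parameter names `min_delta`, `patience`, and `start_from_epoch` are assumptions modeled on common early-stopping interfaces, and the loss values are invented.

```python
def early_stop_epoch(val_losses, min_delta=0.0, patience=3, start_from_epoch=0):
    """Return the epoch index at which training would halt, or None if it never does."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if epoch < start_from_epoch:
            continue  # checks are delayed until a trend can be established
        if best - loss > min_delta:
            best = loss  # meaningful improvement: reset the counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

losses = [1.0, 0.8, 0.79, 0.79, 0.79, 0.5]  # invented validation losses
print(early_stop_epoch(losses, min_delta=0.05, patience=2))  # 3
```

With these invented numbers training halts at epoch 3, before the later drop to 0.5 is ever seen, which illustrates the trade-off in choosing the patience value.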
There is a need to systematically compare the performance of this new MLP network against the existing state-of-the-practice iterative CS recovery methods. Orthogonal Matching Pursuit (OMP), Spectral Projected Gradient for L1 minimization (SPGL1), and Iterative Hard Thresholding (IHT) were added to SpectraMelt as the state-of-the-practice methods because of their relative successes in CS recovery. The OMP algorithm used for reconstruction was modified from the original implementation in the scikit-learn Python library to allow for complex-valued dictionaries. A custom complex version of IHT was also created based on [17]. SPGL1 reconstruction was implemented using the SPGL1 Python library available from the PyPI repository.
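As a point of reference for the iterative baselines, the following is a minimal sketch of textbook IHT with a complex-valued dictionary. It is not the custom implementation based on [17]: the function names, the default step size (the reciprocal of the squared spectral norm of the dictionary), and the fixed iteration count are assumptions.

```python
import numpy as np

def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero the rest (assumes k >= 1)."""
    out = np.zeros_like(x)
    keep = np.argsort(np.abs(x))[-k:]
    out[keep] = x[keep]
    return out

def iht(A, y, k, step=None, n_iter=100):
    """Iterate x <- H_k(x + step * A^H (y - A x)) starting from zero."""
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2  # assumed conservative step size
    x = np.zeros(A.shape[1], dtype=complex)
    for _ in range(n_iter):
        residual = y - A @ x
        x = hard_threshold(x + step * (A.conj().T @ residual), k)
    return x

# Sanity check with an identity dictionary, where recovery is exact.
A = np.eye(4, dtype=complex)
x_true = np.array([0, 2 + 1j, 0, -3], dtype=complex)
x_hat = iht(A, A @ x_true, k=2)
print(np.allclose(x_hat, x_true))  # True
```

Note that `k`, the number of tones, must be supplied by the caller; this is the a priori knowledge that IHT requires.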
The input and output datasets from the previous section were recovered using four separate CS recovery algorithms incorporated into SpectraMelt: IHT, OMP, SPGL1, and the newly trained MLP network. Two assumptions should be mentioned before discussing the recovery results. First, the IHT algorithm was given the correct number of unknown tones a priori. This was carried out for baseline results, as IHT is the oldest of the four algorithms. No other algorithm had this knowledge. Second, an individual MLP network was trained per signal set, for a total of four networks. Each network only recovered signals with the corresponding number of tones per signal. This was carried out because training the same MLP on different datasets caused catastrophic forgetting, the well-known tendency of neural networks to lose previously learned information when trained on new tasks sequentially. Various strategies, such as regularization or memory-based methods, have been proposed to mitigate this effect [18].
For each recovery set, several metrics were used to assess the quality of the recovered signals. For a signal tone to be considered recovered, its frequency must exactly match one of the input tone frequencies and its magnitude must exceed half of the original tone’s magnitude. Any recovered component that exceeds this threshold but does not match an input tone frequency is considered a spur. The average magnitude error of the recovered signals is considered along with the average magnitude of the spurs. The recovery accuracy is found by dividing the number of recovered signals by the total number of input signals. The spur rate divides the total number of spurs by the total number of input signals. An example of signal recovery from each recovery method in SpectraMelt is shown in Figure 8.
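The scoring rules above can be expressed as a short function. The sketch below makes several assumptions that are not specified in the text: spectra are represented as dictionaries mapping frequency to magnitude, the spur threshold is taken as half of the smallest input tone magnitude, and the rates are normalized by the total number of input tones. The names `score_signal` and `summarize` are hypothetical.

```python
def score_signal(true_tones, recovered_tones):
    """Count recovered tones and spurs for one signal.

    Both arguments map frequency -> linear magnitude.
    """
    n_recovered = sum(
        1 for freq, mag in true_tones.items()
        if recovered_tones.get(freq, 0.0) > 0.5 * mag
    )
    # Assumption: spurs are judged against half the smallest input tone magnitude.
    spur_threshold = 0.5 * min(true_tones.values())
    n_spurs = sum(
        1 for freq, mag in recovered_tones.items()
        if freq not in true_tones and mag > spur_threshold
    )
    return n_recovered, n_spurs

def summarize(pairs):
    """Return (recovery accuracy, spur rate) over (true, recovered) pairs."""
    total = recovered = spurs = 0
    for true_tones, recovered_tones in pairs:
        r, s = score_signal(true_tones, recovered_tones)
        recovered += r
        spurs += s
        total += len(true_tones)
    return recovered / total, spurs / total

true = {10: 1.0, 20: 0.6}
rec = {10: 0.9, 20: 0.2, 35: 0.4, 50: 0.1}
print(score_signal(true, rec))  # (1, 1)
print(summarize([(true, rec)]))  # (0.5, 0.5)
```

Here the 20 Hz tone falls below half of its true magnitude and is not counted, while the 35 Hz component exceeds the assumed spur threshold of 0.3.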
One of the broad questions that this work is trying to answer is “What are the specific NYFR LO settings that maximize reconstruction?” For the three iterative algorithms, varying the LO parameters had no observable effect on signal recovery. However, the MLP recovery results showed variations which seemed to favor
. No accompanying variations were seen by different
values with the previous frequency across all datasets. It should be noted that the standard deviation in recovery accuracy between different datasets and LO parameters never exceeded 0.009 in the MLP recovery results. Together, these results suggest that there is no preferable LO parameter setting as long as the previously stated constraints are satisfied. This matches the previous analysis, which showed that
and
jointly constrain the system’s dynamic reconstruction range, balancing signal recovery, zone identification, pulse resolution, and phase-noise limitations [
11].
Table 1 shows the compiled results from the NYFR digital twin simulations using the datasets described in the previous two sections. All reported values are averages over 100 recovered signals. This table shows a clear distinction between the IHT and OMP algorithms. The a priori knowledge given to IHT improved the average spur rate produced by the algorithm. It did not, however, improve the reconstruction accuracy, which is the main measure of a selected CS algorithm’s performance. This shows that OMP is a better CS algorithm for the NYFR than IHT. There is also a clear distinction in recovery accuracy between the two older algorithms, IHT and OMP, and the newer two, SPGL1 and MLP. Finally, the MLP network’s average recovery accuracy is roughly 1.7 times that of SPGL1 (63.8% vs. 38.3%, as computed from the values in Table 1), while almost completely eliminating spurs.