Simpliﬁed Welch Algorithm for Spectrum Monitoring

: Power Spectral Density (PSD) is an essential representation of the signal spectrum that depicts the power measurement content versus frequency. PSD is typically used to characterize broadband random signals and has a variety of usages in many ﬁelds like physics, engineering, biomedical, etc. This paper proposes a simple and practical method to estimate the PSD based on the Welch algorithm for spectrum monitoring. The proposed method can be easily implemented in most of software-based systems or low-level Field-Programmable Gate Arrays (FPGAs) and yields a smooth overview of the spectrum. The original Welch method utilizes the average of the amplitude squared of the previous Fast Fourier Transform (FFT) samples for better estimation of frequency components and noise reduction. Replacing the simple moving average with a weighted moving average can signiﬁcantly reduce the complexity of the Welch’s method. In this way, the amount of required Random Access Memory (RAM) is reduced from K (where K is the number of FFT packets in averaging) to one. This new method allows users to adjust the dependency of the PSD on the previous observed FFTs and its smoothness by setting only one feedback parameter without any hardware change. The obtained results show that the algorithm gives a clear spectrum, even in the noisy situation because of the signiﬁcant Signal to Noise Ratio (SNR) enhancement. The trade-off between spectrum accuracy and time convergence of the modiﬁed algorithm is also fully analysed. In addition, a simple solution based on Xilinx Intellectual Property (IP), which converts the proposed method to a practical spectrum analyzer device, is presented. This modiﬁed algorithm is validated by comparing it with two standard and reliable spectrum analyzers, Rohde & Schwarz (R&S) and Tektronix RSA600. The modiﬁed design can track any signal type as the other spectrum analyzers, and it has better performance in situations where the power of the desired signal is weak or where the signal is mixed with the background noise. It can display the spectrum when the input signal power is 5 dB lower than the visible threshold level of R&S and Tektronix. In both narrowband and wideband scenarios, the new implemented design can still display frequency components 5 dB higher than the noise, while the output spectrum of other analyzers is completely covered by noise.


Introduction
The development of intelligent adaptive technology for spectrum monitoring is an emerging topic for signal processing, especially for satellite broadcasting and radio communication signal analysis [1][2][3]. Frequency power distribution and phase represent the system response to changes in bandwidth, gain, phase shift, harmonics, etc. The recent preferred interest of these characteristics is to locate and reduce the interference caused by adjacent signal communication bands or co-channel emissions [4].
A problem exists regarding the accessibility to a clear power spectrum for smaller and more accessible devices. Most of the available devices use complex algorithms and circuitry to present the clearest overview of the spectrum for a specified application [5]. It mostly Discrete time signals result from sampling of continuous-time analog signals that are converted into digital values for future analysis. The interval between consecutive measurements is specified by the sampling period. Thus, discrete time signals are not defined between sampling intervals, which means that the highest possible observable frequency cannot exceed the Nyquist limit [16] that is equal to half of the sampling frequency. Even with this known trade-off, the DFT is a powerful tool for analyzing these types of signals since it can reveal the periodicity in the input signal as well as the relative power for each frequency component. The DFT is defined mathematically as: where N is the number of input samples, k is an integer index, 2π/N is the fundamental frequency resolution, x = {x[0], x [1], . . . x[n]} is the window of input samples, and  [2] . . .
many mathematical operations such as 4N 2 real multiplications and N (4N − 2) real additions. The FFT is a computational implementation of the DFT and it reduces the complexity of computing the DFT from O N 2 to O Nlog N that is more numerically efficient than the DFT [17].

Welch's Method
The main usage of the Fourier transform is to get the frequency samples of an input signal. For a discrete signal, the FFT can be applied to a certain number of samples or, depending on the sampling frequency, a certain amount of time of the signal. Represented spectrum in DFT is limited to half of the sampling frequency. This method is widely used, but the problem is the ambient noise and limitation due to the finite number of samples in the input window. The situation of signals that travel in communication channels is even worse. They experience channel with additive white Gaussian noise (AWGN). Depending on the travelling method and the distance between transmitter and receiver, the noise level may cover the signal and make it harder for the receiver to extract the PSD with acceptable details. In noisy situations, spectrum monitoring requires more and more data time samples for the correct estimation of the frequency components.
The Welch algorithm [18,19] is a nonparametric method to estimate the PSD and it makes the frequency spectrum smoother than the raw FFT output. Let us consider that the FFT is calculated over the entire duration of the signal. In the Welch algorithm, instead of processing the FFT over the entire time domain, the signal is separated in windows with the same size. Window size affects the clarity of the result by cutting frequencies with periods larger than the window. Windowing is taking a sample of a larger dataset and tapering the signal at the edges of each interval. This makes the signal smoother without sharp transitions that can disturb the frequency spectrum representation [11]. There are different types of windowing sequence like Hamming and Blackman, which affect the result differently. Each window overlaps the adjacent windows by a certain factor, which can be as much as 50% of the total window size. Overlapping helps to reduce the loss of information because of the tapering and display a more reliable periodogram. The Fourier transform is calculated on each interval, and the value is squared. This results in a modified periodogram value that is the base of the PSD calculation. Finally, the average of all periodograms are calculated as PSD estimation. Averaging enhances the SNR and it is optimal for spectrum monitoring. The issue with the Welch algorithm is that the frequency resolution is reduced compared to the FFT. Let us consider a discrete time signal s with N samples: Appl. Sci. 2021, 11, 86 4 of 23 The signal can be separated in K smaller intervals with M length and V overlapping: 1 : s 1 = x [1], x [2], . . . , x[M] 2 : . . .
where s i = {s i [1], s i [2], . . . s i [M]} represents the ith window and K is number of the involved window in the PSD calculation. The DFT is calculated for each window: where C is normalization factor: Calculated periodogram values from different windows are averaged and the Power Spectral Density estimate is obtained: The number of packets involved in averaging directly affects the estimated PSD in different ways. First, as more packets are involved, estimation uses more past data and a smoother spectrum is obtained. In other words, average order K is controlling the dependency on the past. Larger K values help the system to estimate each frequency component based on more observations. Second, if there are fast variations in a frequency component (like a carrier with a hopping center frequency), large averaging can have some drawback on estimation and stop the system from monitoring the fast variations in spectrum. Thus, a robust monitoring system needs to have a variable K to adjust itself from time to time based on the situation. For example, if the target is to detect a weak frequency component under noisy conditions, a K with a larger value can help the monitoring system to recognize it more clearly. In other situations, when the system is trying to track a fasthopping carrier frequency, smaller K values are more useful because the averaging over time can filter fast variations.
Implementing this moving average requires K number of Random Access Memory (RAM) blocks with N F size that save observed FFT vectors for averaging. Thus, the complexity and required space on FPGA depends directly on the considered number of FFT packets for averaging. Another drawback is that to support a device with variable average order K, one has to provide K max RAMs in FPGA and use as much as is required depending on the situation.

Proposed Algorithm
As presented before, the Welch samples are an estimation of the power spectral density based on the arithmetic average of multiple FFT [20] scans. The modified version replaces the running average by weighted averaging to simplify and allow for real-time calculation. To better explain this concept, let us consider the moving average over an input data stream by a window with the size of N p . The result can be expressed by the following Equation (9): where y i is the moving average at sample time i and x i is the input data stream. The Welch algorithm uses averaging (or arithmetic averaging) in order to achieve two important goals: • SNR improvement: averaging over more samples can reduce the noise power level and enhance the SNR of desired signal. • Precise estimation: by utilizing FFT samples from the past, each frequency component can be estimated with more precision.
However, arithmetic averages of past samples have the following consequences [21]: • Implementing arithmetic averaging of N order, requires N RAM components and massively increases the complexity for real time implementation. • All the previous samples are weighted by the same coefficient. For example, the sample with distance of 10 time intervals from the current sample is weighted by the same gain as the sample with the distance of 100 time intervals. However, in many systems, it is suitable to weight the samples based on their distance from current samples.
Weighted averaging is widely used in the digital signal processing domain, especially in image processing for noise reduction and signal enhancement [22,23]. In the image processing area, weighted averaging helps greatly with the noise reduction while preserving the fast variations in image edge. In such filtering scenarios, pixels in filtering windows are weighted based on their distance from the central pixel.
This paper tries to replace the arithmetic averaging by the weighted averaging to target two important points:

•
Weighting each FFT packet based on its distance from the current packet, this idea helps to preserve the fast variable frequency component in spectrum.

•
A recursive and simple algorithm can replace the required memories for the arithmetic averaging.
It means that the system should weight samples based on their distance from the current time instance: where G = ∑ N p −1 m=0 a m is normalization gain, y i is calculated weighted average in ith interval, and α 0 , α 1 , α 2 . . . α N p −1 are utilized weighting coefficients. The dependency degree for each input sample is determined by selected weighting coefficient α m . Now, the dependency of y i is extended to the past by replacing all the independent weights by exponential weights.
To make the averaging simple and recursive, we first extend the dependency to all previous input samples: where G = ∞ ∑ m=0 α m and x i = 0 for i < 0 and y i is the weighted averaging of all the past input samples. Now, a recursive algorithm can be extracted: allows us to calculate an averaging with a single multiplication and addition. Figure 1    The displayed design in Figure 1 explains the basic idea, but its implementation is still complex because of several reasons:


This design needs each FFT sample to implement weighted averages separately, thus the amount of multiplications and summations will increase significantly.  It is not flexible enough to support different FFT sizes. For example, if the FFT size changes from 1 K to 2 K, all of the design must change completely to support it.  Certain types of FFT blocks, e.g., in the Xilinx DSP Generator library, work in a series manner that makes the design more complex.
Given these reasons, a simple solution with the help of the RAM is presented to calculate Welch samples based on the weighted averaging. Figure 2 symbolically explains the modified design of Welch implementation. This design reduces the required memory to single RAM by NFFT address. Each averaged FFT samples are added to the previous averaged samples that are multiplied by the factor. The only consideration here is the use of a FFT computation in series instead of a parallel version.
For each segment of input data stream with the length of M, a windowed point FFT is calculated. After the M clock, M samples of a segment enter this block, then it calculates the point FFT of this sample. After calculation, FFT samples are ready at the output, sample by sample and synchronized with the clock. Each time that FFT samples of an input segment are ready, the FFT block generates a "Data Ready" pulse that determines the start point of the sample series. The "Data Ready" pulse is used to reset a coun- The displayed design in Figure 1 explains the basic idea, but its implementation is still complex because of several reasons:

•
This design needs each FFT sample to implement weighted averages separately, thus the amount of multiplications and summations will increase significantly. • It is not flexible enough to support different FFT sizes. For example, if the FFT size changes from 1 K to 2 K, all of the design must change completely to support it. • Certain types of FFT blocks, e.g., in the Xilinx DSP Generator library, work in a series manner that makes the design more complex.
Given these reasons, a simple solution with the help of the RAM is presented to calculate Welch samples based on the weighted averaging. Figure 2 symbolically explains the modified design of Welch implementation. This design reduces the required memory to single RAM by NFFT address. Each averaged FFT samples are added to the previous averaged samples that are multiplied by the α factor. The only consideration here is the use of a FFT computation in series instead of a parallel version.
ter. The output number of the counter will read the previous value and save the FFT sample into the RAM at the correct address where each address is a specific frequency PSD estimation.

Theoretical Analysis
The following analysis explains the equivalency between weighted averaging with the α parameter and arithmetic averaging with N order based on the Signal to Noise Ratio (SNR) enhancement. SNR is defined as: where and are the input signal and the white Gaussian noise power respectively. Reference [24] explains the effect of signal averaging on a steady signal S with an associated noise variance of 2 . It is proven that if observed signals in averaging iteration are correlated, after N time averaging, is improved by factor of N: = .
[dB] = 10 10 + [dB] Following the same way as [24], it can be proven that for weighted averaging ℎ is improved by factor For better understanding, the same scenario has been simulated. First, a discrete signal with 1000 samples is generated and white Gaussian noise with SNR = 15 dB is added to the desired signal. Then, the system tries to increase the SNR with the help of two different averaging types. Figure 3c displays versus order N. Clearly, more improvement is obtained by increasing the N order. In another figure, the For each segment of input data stream with the length of M, a windowed N F point FFT is calculated. After the M clock, M samples of a segment enter this block, then it calculates the N F point FFT of this sample. After calculation, FFT samples are ready at the output, sample by sample and synchronized with the clock. Each time that FFT samples of an input segment are ready, the FFT block generates a "Data Ready" pulse that determines the start point of the sample series. The "Data Ready" pulse is used to reset a counter. The output number of the counter will read the previous value and save the FFT sample into the RAM at the correct address where each address is a specific frequency PSD estimation.

Theoretical Analysis
The following analysis explains the equivalency between weighted averaging with the α parameter and arithmetic averaging with N order based on the Signal to Noise Ratio (SNR) enhancement. SNR is defined as: where P S and P Z are the input signal and the white Gaussian noise power respectively. Reference [24] explains the effect of signal averaging on a steady signal S with an associated noise variance of σ 2 . It is proven that if observed signals in averaging iteration are correlated, after N time averaging, SNR Averaging is improved by factor of N: Following the same way as [24], it can be proven that for weighted averaging SNR weighted Averaging is improved by factor 1 1−a : For better understanding, the same scenario has been simulated. First, a discrete signal with 1000 samples is generated and white Gaussian noise with SNR = 15 dB is added to the desired signal. Then, the system tries to increase the SNR with the help of two different averaging types. Figure 3c displays SNR Averaging versus order N. Clearly, more SNR Averaging improvement is obtained by increasing the N order. In another figure, the output SNR Weighted Averaging after weighted averaging with different α parameter is displayed. It can be seen weighted averaging with α = 0.99 improve SNR by the gain around 20 dB. These two figures allow the user to see which α parameter in weighted averaging is equivalent to which N order in normal averaging. Based on this comparison, α = 0.99 is equivalent to normal averaging with order N = 100. This comparison allows the user to select the α parameter based on two new criteria: output ℎ after weighted averaging with different α parameter is displayed. It can be seen weighted averaging with α = 0.99 improve SNR by the gain around 20 dB. These two figures allow the user to see which α parameter in weighted averaging is equivalent to which N order in normal averaging. Based on this comparison, α = 0.99 is equivalent to normal averaging with order N = 100. This comparison allows the user to select the α parameter based on two new criteria: The parameters selected for the weighted averaging directly affects the behavior of the system in two ways: 1. Convergence time: At first, it seems that the sorted values inside the RAM will grow to an unlimited value. However, because α is smaller than one, the sorted values (Welch spectrum) grow to a certain value and after that the system will converge to a final spectrum. A larger α needs more convergence time and generates larger sorted values in the RAM. Thus, if the system is monitoring data with fast variations, smaller values should be used to enable faster tracking. Another important point here is the The parameters selected for the weighted averaging directly affects the behavior of the system in two ways:

1.
Convergence time: At first, it seems that the sorted values inside the RAM will grow to an unlimited value. However, because α is smaller than one, the sorted values (Welch spectrum) grow to a certain value and after that the system will converge to a final spectrum. A larger α needs more convergence time and generates larger sorted values in the RAM. Thus, if the system is monitoring data with fast variations, smaller values should be used to enable faster tracking. Another important point here is the hardware limitation for saving the large values inside the RAM. If the saved values are larger than the computational capability of system, a saturation error will happen. 2.
Obtained spectrum shape: The α determines the dependency of the obtained spectrum to the past. A larger α means more averaging over the past FFT samples. As a result, the spectrum obtained from a larger α is smoother with the less noise.
To better understand the difference between the original Welch method [7] and this modified method, the frequency response of a running average is compared to the weighted averaging. Running average has the following frequency response: Figure 4 presents frequency response of the arithmetic average with different order N. The frequency response is generally attenuated and for averaging with larger order, larger attenuation is observed.
Sci. 2021, 11, x FOR PEER REVIEW hardware limitation for saving the large values inside the R are larger than the computational capability of system, a satu 2. Obtained spectrum shape: The α determines the dependen trum to the past. A larger α means more averaging over the result, the spectrum obtained from a larger α is smoother wi To better understand the difference between the original W modified method, the frequency response of a running avera weighted averaging. Running average has the following frequen Figure 4 presents frequency response of the arithmetic aver N. The frequency response is generally attenuated and for avera larger attenuation is observed.  Figure 5 represents the frequency response of the modified m order discrete low-pass filter, the curve is significantly smoother seen below, a higher α value creates a greater initial gain. The transfer function of the modified version can be expressed as the following: Figure 5 represents the frequency response of the modified method. Since it is a first order discrete low-pass filter, the curve is significantly smoother for each frequency. As seen below, a higher α value creates a greater initial gain.
( ) = 1 1 − −1 Figure 5 represents the frequency response of the modified method. Since it i order discrete low-pass filter, the curve is significantly smoother for each freque seen below, a higher α value creates a greater initial gain.  As it can be seen in Figure 6, the convergence time and gain are different for each different alpha (α) (feedback factor) value. By analyzing the system response, a normalization factor can be added after the modified Welch algorithm to cancel the gain based on the α parameter.
Appl. Sci. 2021, 11, x FOR PEER REVIEW 10 of 23 As it can be seen in Figure 6, the convergence time and gain are different for each different alpha (α) (feedback factor) value. By analyzing the system response, a normalization factor can be added after the modified Welch algorithm to cancel the gain based on the α parameter. Final gain G for the selected parameter can be calculated based on the infinite geometric series: Then the normalization factor (NF) is: The correction can be added after the algorithm to retrieve a normalized PSD. Figure  7 presents the corrected transfer function of the modified Welch algorithm after the normalization. Comparing Figure 4 with Figure 7 reveals that both systems attenuate frequency components with fast variation, but the modified version has a smoother and more uniform response than the original one and removes the big gaps in frequency response of moving average. Final gain G for the selected parameter can be calculated based on the infinite geometric series: Then the normalization factor (NF) is: The correction can be added after the algorithm to retrieve a normalized PSD. Figure 7 presents the corrected transfer function of the modified Welch algorithm after the normalization. Comparing Figure 4 with Figure 7 reveals that both systems attenuate frequency components with fast variation, but the modified version has a smoother and more uniform response than the original one and removes the big gaps in frequency response of moving average.
The correction can be added after the algorithm to retrieve a normalized PSD. Figure  7 presents the corrected transfer function of the modified Welch algorithm after the normalization. Comparing Figure 4 with Figure 7 reveals that both systems attenuate frequency components with fast variation, but the modified version has a smoother and more uniform response than the original one and removes the big gaps in frequency response of moving average.  The final point to address is the saturation time based on the α factor. Since the formula is exponential, the following expression can be used to estimate the number of FFT scans until a certain precision of the saturation for a noiseless signal: where ε is the precision of the saturation. The number of scans is an approximation of the total required FFT packets until partial saturation.

Simulation Setup
This section explains how the modified algorithm can be implemented simply in MATLAB Simulink as a development environment. Figure 8 displays the block diagram of the design in Simulink. First, the signals are generated on the left, then they are combined and passed through the FFT block mentioned earlier to finally be processed by the updated Welch algorithm. A random data source generates a proper integer numbers for the Quadrature Amplitude Modulation (QAM) modulator. Then, a general telecommunication signal is generated by up sampling and pulse shaping based on raised-cosine filtering. The numerical oscillator generates a single tone at the desired center frequency. The idea here is to show how dependency on past data (a parameter) can obtain a smoother spectrum and help to clarify the Carrier Wave Interference (CWI) [25] as a frequency component. Numerical control oscillator generates CWI based on the selected center frequency f 0 : Appl. Sci. 2021, 11, x FOR PEER REVIEW 11 of 23 The final point to address is the saturation time based on the α factor. Since the formula is exponential, the following expression can be used to estimate the number of FFT scans until a certain precision of the saturation for a noiseless signal: where ε is the precision of the saturation. The number of scans is an approximation of the total required FFT packets until partial saturation.

Simulation Setup
This section explains how the modified algorithm can be implemented simply in MATLAB Simulink as a development environment. Figure 8 displays the block diagram of the design in Simulink. First, the signals are generated on the left, then they are combined and passed through the FFT block mentioned earlier to finally be processed by the updated Welch algorithm. A random data source generates a proper integer numbers for the Quadrature Amplitude Modulation (QAM) modulator. Then, a general telecommunication signal is generated by up sampling and pulse shaping based on raised-cosine filtering. The numerical oscillator generates a single tone at the desired center frequency. The idea here is to show how dependency on past data ( parameter) can obtain a smoother spectrum and help to clarify the Carrier Wave Interference (CWI) [25] as a frequency component. Numerical control oscillator generates CWI based on the selected center frequency 0 : For example, in Figure 8, a single tone with a peak at the center frequency 0 = 0.015(× ) is generated. The schematic of the implanted FFT and modified algorithm can be seen in Figure  9a,b. The simulation design tries to have the similar behavior of the Xilinx DSP Generator blocks. Samples are first stored into a buffer of FFT size (NF). After buffering NF samples, the FFT block transfers the samples to the frequency domain and generates the FFT vector. For example, in Figure 8, a single tone with a peak at the center frequency f 0 = 0.015 ×π radian samp is generated. The schematic of the implanted FFT and modified algorithm can be seen in Figure 9a,b. The simulation design tries to have the similar behavior of the Xilinx DSP Generator blocks. Samples are first stored into a buffer of FFT size (N F ). After buffering N F samples, the FFT block transfers the samples to the frequency domain and generates the FFT vector. The FFT vector is then converted to a series of data with the help of the un-buffer block. A free running counter generates the index of FFT samples. Because this counter is limited to count to N F , after reaching N F , it will reset to one and start to count up again. In this way, for each packet of FFT samples, the counter generates the correspondence index from one to N F . The relational block simply compares the counter output with N F and generates a pulse if the output is equal to N F as the data ready pulse. This pulse determines the boundaries of each calculated packet in the data stream. It should be mentioned that in the Xilinx DSP Generator, all sub-blocks of the buffer, FFT calculator, un-buffer, counter and data-ready are integrated as one component. The general implementation idea is explained in Figure 9b, where the output of the RAM is multiplied with a in feedback and then added to the new FFT samples to generate the exponential weighted averaging.

Simulation Results
As it can be seen in Figure 10, FFT samples are generated by 2 × delays. In first clocks, the input data of the first segment is arriving and buffered for FFT calculation. In the second clocks, the FFT of the first input is calculated. While the FFT of the first segment is calculating, the input data of the second segment is buffering. This se-

Simulation Results
As it can be seen in Figure 10, FFT samples are generated by 2 × N F delays. In first N F clocks, the N F input data of the first segment is arriving and buffered for FFT calculation. In the second N F clocks, the FFT of the first input is calculated. While the FFT of the first segment is calculating, the input data of the second segment is buffering. This sequence continues and calculates the FFT of input segments in real time by 2 × N F delays for all segments. Another important note here is that the generated FFT samples are displayed from 0 to 2π instead of −π to π. To solve this problem, the first half of the FFT samples must be shifted after the second half to recover the standard spectrum centered at the desired frequency.
After the periodogram values are calculated, they are saved inside the RAM. Each sample is saved in a memory address based on its number in the FFT vector. For example, the 100th periodogram sample is stored in the register with an address equal to 100 in RAM. The idea for calculating a Welch sample is to perform weighted averaging before saving data in each RAM register. Remember that the counter resets with the "Data Ready" pulse, thus, at each moment, the counter value is displaying the corresponding FFT sample number. The output value of the counter is also used to determine the address of this sample inside the RAM. Before overwriting new samples on its related address, the design fetches the previous sample, weighs it with the parameter and then adds it to the new generated sample. This idea calculates the Welch value based on the weighted averaging. As time goes on, the samples saved inside the RAM converge to Welch samples. Figure 11 displays the convergence of Welch samples to the final spectrum for α = 0.99 from the simulation. As it can be seen, two spectra generated during the convergence time are displayed. The first spectrum generated is noisier and displays a spectrum of input data with fewer details. After convergence, the spectrum is completely smooth with considerably less noise. Another important note here is that the generated FFT samples are displayed from 0 to 2π instead of −π to π. To solve this problem, the first half of the FFT samples must be shifted after the second half to recover the standard spectrum centered at the desired frequency.
After the periodogram values are calculated, they are saved inside the RAM. Each sample is saved in a memory address based on its number in the FFT vector. For example, the 100th periodogram sample is stored in the register with an address equal to 100 in RAM. The idea for calculating a Welch sample is to perform weighted averaging before saving data in each RAM register. Remember that the counter resets with the "Data Ready" pulse, thus, at each moment, the counter value is displaying the corresponding FFT sample number. The output value of the counter is also used to determine the address of this sample inside the RAM. Before overwriting new samples on its related address, the design fetches the previous sample, weighs it with the α parameter and then adds it to the new generated sample. This idea calculates the Welch value based on the weighted averaging. As time goes on, the samples saved inside the RAM converge to Welch samples. Figure 11 displays the convergence of Welch samples to the final spectrum for α = 0.99 from the simulation. As it can be seen, two spectra generated during the convergence time are displayed. The first spectrum generated is noisier and displays a spectrum of input data with fewer details. After convergence, the spectrum is completely smooth with considerably less noise. To better understand the performance obtained over the FFT and the effect of the α parameter on the final spectrum, Figure 12 displays the computed spectrum with a different α. During this test, a CWI with Signal to Interference Ratio SI =R12 dB is added to the transmitted signal. For FFT samples, the peak position of CWI fades in the noise and other frequency components make it barely visible. For the Welch spectrum obtained with α = 0.25, the peak is detectable; however, the distance between the maximum peak and the other frequency components is not sufficiently visible and can be misleading for a detector. As the α parameter increases, the spectrum gets smoother and clearer. For α = 0.99, the spectrum is completely smooth, without most of the noise, and the peak appears clearly with a noticeable distance on top of the original signal.  To better understand the performance obtained over the FFT and the effect of the α parameter on the final spectrum, Figure 12 displays the computed spectrum with a different α. During this test, a CWI with Signal to Interference Ratio SI = R12 dB is added to the transmitted signal. For FFT samples, the peak position of CWI fades in the noise and other frequency components make it barely visible. For the Welch spectrum obtained with α = 0.25, the peak is detectable; however, the distance between the maximum peak and the other frequency components is not sufficiently visible and can be misleading for a detector. As the α parameter increases, the spectrum gets smoother and clearer. For α = 0.99, the spectrum is completely smooth, without most of the noise, and the peak appears clearly with a noticeable distance on top of the original signal. To better understand the performance obtained over the FFT and the effect of the α parameter on the final spectrum, Figure 12 displays the computed spectrum with a different α. During this test, a CWI with Signal to Interference Ratio SI =R12 dB is added to the transmitted signal. For FFT samples, the peak position of CWI fades in the noise and other frequency components make it barely visible. For the Welch spectrum obtained with α = 0.25, the peak is detectable; however, the distance between the maximum peak and the other frequency components is not sufficiently visible and can be misleading for a detector. As the α parameter increases, the spectrum gets smoother and clearer. For α = 0.99, the spectrum is completely smooth, without most of the noise, and the peak appears clearly with a noticeable distance on top of the original signal.  As is explained here, in the modified version, the α parameter plays the same role of the averaging number in the original Welch algorithm. The main advantage of using the modified version is the simplicity and flexibility of the averaging control. The α parameter is like a flexible control volume in the user's hand and allows the PSD parameters to be adjusted based on the current situation.

Design in Xilinx DSP Generator Library
The practical part of this research benefits from the digital signal-processing library provided in the Xilinx System Generator integrated in the Simulink environment. The following methodology contains two main steps: system performance and mathematical discovery based on the simulation that followed by transferring all simulation blocks to the Xilinx DSP Generator. The FFT blocks provided in this library are flexible for multiple FFT sample size adjustments even after synthesis while FPGA is running. The idea is that the method proposed must be practical and easily implementable in the existing technology as much as possible. Following this idea in the simulation design step and benefiting from the Xilinx DSP library, the modified design can easily be transferred from simulation to a worthy FPGA design. Thus, one can assume that the real response of the FPGA design will be reflected accurately in the simulation. Figure 13a,b represent the most important blocks used in the Xilinx DSP library. The Xilinx Fast Fourier Transform block core computes N-point forward DFT or inverse DFT based on the Cooley-Tukey FFT algorithm [26]. It buffers a vector of N complex values provided on real and imaginary input ports during the N system clocks and generates N F frequency samples in a series manner on real and imaginary output ports. For synchronization, this block also provides a synchronization signal, the "Data Ready pulse". This signal simply represents the start and end of each calculated FFT packet. Connecting this pulse to the reset port of a free counter recreates the behavior similar to the simulation step. The second block is the RAM, which is one of the main components of the modified Welch algorithm presented in this paper. The RAM size can also be adjusted to increase or decrease the resolution of the spectral representation. The desired frequency resolution can be easily adjusted with the FTT and RAM size. The smoothness and past value dependency can be set up simply by adjusting the α parameter. These two options give the user unique flexibility for dynamic system adjustment based on the current situation.
Warning: as mentioned before, sorted values inside the RAM will grow to a certain value, then, the system will converge to a final spectrum. The important point here, is the hardware limitation for saving large values inside the RAM. If the averaging value goes higher than the saturation limit of the user computational capability, then an error will be generated. Xilinx DSP generator allows the designer to adjust the precision and "fixed point" output type for each block. For the blocks that are involved in Figure 13, "number of bits" and "binary point" must be selected carefully to avoid the saturation error. During the real time test, hardware limitation must also be considered for selecting the maximum value for the a parameter.

BEEcube SDR
The NanoBEE device by National Instruments is used as a SDR receiver. This SDR is equipped with a programmable FPGA to implement the algorithm as well as a register bank module to store samples and result data [27].
Once the data is processed, the results are stored inside the register bank, then sent via Ethernet cable to a computer equipped with MATLAB for spectral representation. Warning: as mentioned before, sorted values inside the RAM will grow to a certa value, then, the system will converge to a final spectrum. The important point here, is th hardware limitation for saving large values inside the RAM. If the averaging value go higher than the saturation limit of the user computational capability, then an error will b generated. Xilinx DSP generator allows the designer to adjust the precision and "fixe point" output type for each block. For the blocks that are involved in Figure 13, "numb of bits" and "binary point" must be selected carefully to avoid the saturation error. Durin the real time test, hardware limitation must also be considered for selecting the maximu value for the parameter.

BEEcube SDR
The NanoBEE device by National Instruments is used as a SDR receiver. This SDR equipped with a programmable FPGA to implement the algorithm as well as a regist Warning: as mentioned before, sorted values inside the RAM will grow to a certain value, then, the system will converge to a final spectrum. The important point here, is the hardware limitation for saving large values inside the RAM. If the averaging value goes higher than the saturation limit of the user computational capability, then an error will be generated. Xilinx DSP generator allows the designer to adjust the precision and "fixed point" output type for each block. For the blocks that are involved in Figure 13, "number of bits" and "binary point" must be selected carefully to avoid the saturation error. During the real time test, hardware limitation must also be considered for selecting the maximum value for the parameter.

BEEcube SDR
The NanoBEE device by National Instruments is used as a SDR receiver. This SDR is equipped with a programmable FPGA to implement the algorithm as well as a register bank module to store samples and result data [27].
Once the data is processed, the results are stored inside the register bank, then sent via Ethernet cable to a computer equipped with MATLAB for spectral representation.

Practical Setup and Real-Time Test Bench
This section tries to validate the proposed algorithm by representing the results obtained in a real-time test bench. The data acquisition and setup are similar to the ones used in [27,28] and with a simulation-based debugging before FPGA synthesis. Use of a software-defined radio (SDR) allowed us to compare the results with reliable standardized spectrum monitoring devices and make a real-time comparison with them. The diagram in Figures 15 and 16 presents the laboratory setup used to validate and compare the algorithm to two standard spectrum monitoring devices. For comparison, two commercially available spectrum monitoring devices from Rohde & Schwarz [29] and Tektronix [30] are used to validate the obtained results. In the case of the Tektronix analyzer, the Long-Term Evolution (LTE) without averaging spectrum analyzer was used to display the signal even more clearly.

Practical Setup and Real-Time Test Bench
This section tries to validate the proposed algorithm by representing the results obtained in a real-time test bench. The data acquisition and setup are similar to the ones used in [27,28] and with a simulation-based debugging before FPGA synthesis. Use of a software-defined radio (SDR) allowed us to compare the results with reliable standardized spectrum monitoring devices and make a real-time comparison with them. The diagram in Figures 15 and 16 presents the laboratory setup used to validate and compare the algorithm to two standard spectrum monitoring devices. For comparison, two commercially available spectrum monitoring devices from Rohde & Schwarz [29] and Tektronix [30] are used to validate the obtained results. In the case of the Tektronix analyzer, the Long-Term Evolution (LTE) without averaging spectrum analyzer was used to display the signal even more clearly.

Final Results
The modified design was tested in the laboratory using the setup presented in Figure  16. The first result tries to show the capacity to detect every type of narrow band and wideband signals in different frequencies, like any other spectrum analyzer. A single tone at a frequency of 1.572 GHz and a modulated QAM signal with pulse shaping are selected as narrowband, and wideband signals in different center frequencies are then passed through the commercial spectrum analyzer and SDR equipped with the modified Welch algorithm. Table 1 briefly explains the parameters selected for this test.

Practical Setup and Real-Time Test Bench
This section tries to validate the proposed algorithm by representing the results obtained in a real-time test bench. The data acquisition and setup are similar to the ones used in [27,28] and with a simulation-based debugging before FPGA synthesis. Use of a software-defined radio (SDR) allowed us to compare the results with reliable standardized spectrum monitoring devices and make a real-time comparison with them. The diagram in Figures 15 and 16 presents the laboratory setup used to validate and compare the algorithm to two standard spectrum monitoring devices. For comparison, two commercially available spectrum monitoring devices from Rohde & Schwarz [29] and Tektronix [30] are used to validate the obtained results. In the case of the Tektronix analyzer, the Long-Term Evolution (LTE) without averaging spectrum analyzer was used to display the signal even more clearly.

Final Results
The modified design was tested in the laboratory using the setup presented in Figure  16. The first result tries to show the capacity to detect every type of narrow band and wideband signals in different frequencies, like any other spectrum analyzer. A single tone at a frequency of 1.572 GHz and a modulated QAM signal with pulse shaping are selected as narrowband, and wideband signals in different center frequencies are then passed through the commercial spectrum analyzer and SDR equipped with the modified Welch algorithm. Table 1 briefly explains the parameters selected for this test.

Final Results
The modified design was tested in the laboratory using the setup presented in Figure 16. The first result tries to show the capacity to detect every type of narrow band and wideband signals in different frequencies, like any other spectrum analyzer. A single tone at a frequency of 1.572 GHz and a modulated QAM signal with pulse shaping are selected as narrowband, and wideband signals in different center frequencies are then passed through the commercial spectrum analyzer and SDR equipped with the modified Welch algorithm. Table 1 briefly explains the parameters selected for this test. In regards to the fairness of competition, we considered the following points carefully: • It should be mentioned that the power transferred by the signal generator is equally divided between all devices with the help of a splitter. Thus, all devices (SDR and two other spectrum analyzers) are competing with similar input power. • Selected span for commercial devices is 10 MHz. The SDR that run the new design, has a sampling rate of 40 MHz, thus, the span of the SDR is equal to 20 MHz. Selected span for the two other devices is half of SDR, that allows them to have two times better resolution.

•
The attenuation gain for all three devices is equal to 0 dB. It allows the two other spectrum analyzers to have maximum sensitivity to the input signal. • Selected cables have the same attenuation gain.

•
During the test, we also changed the input signal of different devices to be sure that the obtained results were real. Table 2 provide more details about the commercial device settings. As it can be seen in Figure 17, the modified design can represent any spectral formation with accuracy for signal monitoring and interference detection. For both narrowband and wideband signals, the center frequencies detected match the transferred signals completely, and the bandwidth detected is equal to the bandwidth of the modulated QAM.

MHz
−20 dB 0 dB 1.57542 GHz 1 kHz As it can be seen in Figure 17, the modified design can represent any spectral formation with accuracy for signal monitoring and interference detection. For both narrowband and wideband signals, the center frequencies detected match the transferred signals completely, and the bandwidth detected is equal to the bandwidth of the modulated QAM. The next part of the laboratory results represents the effect of α on the spectrum smoothness and accuracy in frequency component recognition. Figure 18a-c represent a modulated QAM signal at 1.57542 GHz mixed with a CWI at 1.575 GHz compared to the two other spectrum analyzers. In this test, a very low-power CWI is used to make it harder to detect. In this situation, both of the commercial spectrum analyzers could not see the CWI and the modified design was able to detect the spike caused by the interference. Another important point observed in Figure 18 is that the spectrum displayed by the modified design is similar in shape to the two other spectrum analyzers with less noise, and the CWI can appear 4 dB higher than the modulated signal.  Figure 18 is that the spectrum displayed by the modified design is similar in shape to the two other spectrum analyzers with less noise, and the CWI can appear 4 dB higher than the modulated signal.
(c) Figure 18. (a) CWI and modulated QAM signal representation on the R&S spectrum analyzer; (b) CWI and modulated QAM signal representation with the modified Welch algorithm with α = 0.99 (c) CWI and modulated QAM signal representation with the Tektronix spectrum analyzer. Figure 19 represents the results obtained from SDR by repeating the second tests using lower parameters.  Figure 19 represents the results obtained from SDR by repeating the second tests using lower a parameters.
(c) Figure 18. (a) CWI and modulated QAM signal representation on the R&S spectrum analyzer; (b) CWI and modulated QAM signal representation with the modified Welch algorithm with α = 0.99 (c) CWI and modulated QAM signal representation with the Tektronix spectrum analyzer. Figure 19 represents the results obtained from SDR by repeating the second tests using lower parameters.
As it can be seen, smother spectrum is obtained for the larger a parameter. The lower the α parameter, the harder it gets to detect the CWI mixed with the modulated QAM until for α = 0.70 in Figure 19b, the CWI peak is covered by noise and is not visible.
The last test tries to compare the ability of two commercial spectrum analyzers with the modified design in weak signal monitoring. In this test, only a wideband signal with low power is used. The signal power transmitted is decreased until the spectrum analyzers are unable to monitor it. As it can be seen in the Figure 20, with a transmitted power of −87 dBm, both spectrum analyzers are unable to monitor the input power, and the transmitted spectrum is covered by noise; however, the SDR equipped with the modified design still clearly depicts signal with correct and reliable details. As it can be seen, smother spectrum is obtained for the larger a parameter. The lower the α parameter, the harder it gets to detect the CWI mixed with the modulated QAM until for α = 0.70 in Figure 19b, the CWI peak is covered by noise and is not visible.
The last test tries to compare the ability of two commercial spectrum analyzers with the modified design in weak signal monitoring. In this test, only a wideband signal with low power is used. The signal power transmitted is decreased until the spectrum analyzers are unable to monitor it. As it can be seen in the Figure 20, with a transmitted power of −87 dBm, both spectrum analyzers are unable to monitor the input power, and the transmitted spectrum is covered by noise; however, the SDR equipped with the modified design still clearly depicts signal with correct and reliable details. For both cases, narrow band and wide band, when other spectrum analyzers could not display the spectrum, the modified design could still display frequency component 5 dB higher than the noise level. With a narrow band signal such as a CWI, when its power is decreased until the point where other devices cannot see the CWI, the new design can see it 5 dB stronger than the QAM signal.
As it has been explained before, α = 0.99 can result in SNR improvement equivalent to averaging by N order of 100. Averaging reduces the noise level and improves the SNR by 20 dB gain. Thus, the frequency components can be visible clearly on top of the noise. Superiority over other spectrum analyzers is demonstrated with the SNR improvement. By decreasing the signal power, it will be fading in the background noise and commercial For both cases, narrow band and wide band, when other spectrum analyzers could not display the spectrum, the modified design could still display frequency component 5 dB higher than the noise level. With a narrow band signal such as a CWI, when its power is decreased until the point where other devices cannot see the CWI, the new design can see it 5 dB stronger than the QAM signal.
As it has been explained before, α = 0.99 can result in SNR improvement equivalent to averaging by N order of 100. Averaging reduces the noise level and improves the SNR by 20 dB gain. Thus, the frequency components can be visible clearly on top of the noise. Superiority over other spectrum analyzers is demonstrated with the SNR improvement. By decreasing the signal power, it will be fading in the background noise and commercial devices are not able to track it in the surrounding noise. However, the new design, benefiting from the SNR improvement, can display the desired signal in lower SNRs.

Discussion
The purpose of this paper was to present an easily implementable spectrum monitoring design based on a modified version of the Welch algorithm. The proposed design can accurately represent a low-power signal with a less noisy spectrum reliably when standard spectrum analyzers are unable to track the input signal. Future spectrum monitoring devices such as interference mitigation and detection systems could easily implement this rugged, low-cost and effective design. The unique flexibility allows a lot of different usages such as high-precision spectrum monitoring or high responsiveness for frequency hopping interference tracking. The α parameter helps to have a balance between the required accuracy and the desired convergence in different situations. For instance, the design cannot accurately display a noise-free spectrum while remaining very responsive to frequency hopping. Therefore, there is a trade-off between speed and accuracy when choosing the α value. For future research, the study of this design paired with an interference detection and mitigation device could yield some interesting results. The ability to clearly display low-power interference on top of a noisy signal is very promising for interference mitigation.
Another interesting research would be to adjust the α parameter based on the input signal to noise ratio. The receiver is able to have an approximation of the channel situation and input SNR. Once the receiver is in a good position, selecting a lower α value allows the system to track variable frequency components. While the system is suffering from noisy channel and low SNR, a clearer spectrum curve is available by selecting a larger α parameter.

Conclusions
In this research, a modified version of the Welch algorithm for spectrum monitoring is proposed. The characteristics of this design were thoroughly analyzed and studied. Spectrum monitoring is used in multiple applications, from interference mitigation to research and development in the scientific community. After the mathematical explanation and the presentation of the design itself, the effects of the α parameter were presented. It can be concluded that with a smaller α parameter, the system is more responsive and more appropriate for fast hopping frequency signals. On the other hand, with a larger α, the system is more accurate and is more appropriate for precise spectrum monitoring. It is safe to assume that the modified design can reliably and accurately represents any signal type as other commercial devices. From the last results we see that this design can represent low-power signal 5 dB lower than the other standard spectrum analyzers. The proposed algorithm decreases the required space on FPGA and benefiting from its practical design, it can be embedded in SDR or other FPGA core.
The main consideration for this system is the convergence time based on the selected feedback parameter, which can cause loss of information for fast variant frequency components and large values in the RAM component that can generate the saturation error.