Parallel Frequency Function-Deep Neural Network for Efficient Approximation of Complex Broadband Signals

In recent years, with the growing popularity of complex signal approximation via deep neural networks, close attention has been paid to the spectral bias of neural networks, a problem that occurs when a neural network is used to fit broadband signals. An important direction for overcoming this problem is the use of frequency selection-based fitting techniques, of which the representative work is the PhaseDNN method, whose core idea is the use of bandpass filters to extract frequency bands with high energy concentration and fit them with different neural networks. Despite the method's high accuracy, we found in a large number of experiments that it is less efficient for fitting broadband signals with smooth spectra. In order to substantially improve its efficiency, a novel candidate, the parallel frequency function-deep neural network (PFF-DNN), is proposed by utilizing frequency domain analysis of broadband signals and the spectral bias nature of neural networks. A substantial improvement in efficiency was observed in extensive numerical experiments. Thus, the PFF-DNN method is expected to become an alternative solution for broadband signal fitting.


Introduction
An artificial neural network, also known simply as a neural network, refers to a mathematical model that imitates animal neural network behavior [1]; it is essentially a high-dimensional complex mapping model in which adjusting network weights allows for feature fitting. A neural network's basic building block is the neuron model, a parameterized model nesting a scalar linear function inside a monotonic nonlinear function (the activation function), where the coefficients of the linear function are the connection weights between neurons. Neurons connected according to a specific topology form a neural network. One of the primary networks is the single-layer network composed of multiple neurons in parallel. Multiple single-layer networks can be stacked to obtain a multi-layer network and further expanded into a deep neural network that contains multiple multi-layer networks. Some advanced neural network models are designed to meet the needs of engineering practice [2][3][4][5]. For example, convolutional neural networks (CNNs) and their derived models have achieved unprecedented success in computer vision due to their powerful feature extraction capabilities and are widely used in security monitoring, autonomous driving, human-computer interaction, augmented reality, and many other fields. Additionally, recurrent neural networks, especially the long short-term memory (LSTM) network, have become mainstream tools in the fields of automatic translation, text generation, and video generation in just a few years.
The related research on universal approximation [6][7][8][9][10] indicates that, assuming sufficient neurons and suitable weights, a neural network can approximate any continuous signal on a compact subset of ℝⁿ with arbitrary precision. However, it is not easy to obtain these appropriate weights via training for a complex network with too many neurons. The convergence speed of the neural network is related to the frequency spectrum of the fitted signal [11]. As shown in Figure 1, the neural network first learns the low-frequency components during the training process. The relationship between the convergence speed and the frequency of the fitted signal has been quantitatively analyzed [12]. When a network is applied to fit a signal, the required training time increases exponentially as the component's central frequency increases. The spectral bias of the convergence speed in network training leads to unbearable training times for fitting the high-frequency components in broadband signals. In recent years, differential equation solving based on physical knowledge constraints has been successful in many applications, but multi-scale and multi-physics problems need further development. One of the core problems is that it is difficult for fully connected neural networks to learn high-frequency functions, that is, the spectral bias problem [13]. Solving this problem has become one of the core bottlenecks for the further development of A.I. technology.
To this end, Cai et al. proposed the PhaseDNN method for fitting complex signals with high-frequency components via the combination of parallel frequency band extraction and frequency shifting techniques [14]. Numerous numerical experimental results have indicated that the PhaseDNN method successfully avoids the computational cost disaster when fitting signals with high-frequency components. However, although the method has good approximation efficiency for broadband signals, we found in our numerical experiments that for a large number of typical signals, especially those with smooth frequency domain distributions, the operation of fitting the inverse transformed signals can be further optimized, thus greatly improving the neural network fitting efficiency. Therefore, our paper is dedicated to investigating a more efficient candidate method for fitting complex signals, i.e., signals with smooth frequency domain distributions. To reach this goal, a parallel frequency function-deep neural network (PFF-DNN) is proposed by utilizing the fast Fourier analysis of broadband signals and the spectral bias nature of neural networks. The effectiveness and efficiency of the proposed method are verified based on detailed experiments for six typical broadband signals. The discussion shows how the neural network adaptively constructs low-frequency smooth curves to interpolate discrete signals. This adaptive low-frequency approximation makes it possible to fit discrete frequency domain signals accurately.
The paper is organized as follows. Related works and the proposed method are introduced in Section 2. Extensive numerical experiments are presented in Section 3. Finally, the paper is concluded in Section 4.

Recent PhaseDNN
The research mentioned above indicates that it is difficult for neural networks to directly and accurately fit signals with high-frequency components. However, if we can transform the high-frequency components in the signal into smooth low-frequency signals convenient for neural network fitting, by some means such as the application of frequency shift technology, we can approximate the original signal via component-by-component fitting. Following this idea, in 2020, Cai et al. proposed the PhaseDNN method to implement a fitting scheme for signals with high-frequency components [14]. As shown in Figure 2, the PhaseDNN method involves four steps: (1) all high-frequency components in the original objective signal are extracted; (2) each high-frequency component is converted into a low-frequency signal using frequency shift technology; (3) different neural networks with identical structure are used to fit these low-frequency signals in parallel; and (4) inverse frequency shift operations are performed for all network predictions to obtain approximated high-frequency components, which are further summed up to recover the original signal.

Figure 3 shows a comparison between the fitted results from the PhaseDNN method and those obtained via vanilla fitting. As shown in Figure 3a, vanilla fitting cannot recover high-frequency components well. If the signal oscillates even faster, the vanilla fitting becomes entirely ineffective. On the contrary, as shown in Figure 3c,d, the recent PhaseDNN method has shown obvious advantages in being able to completely characterize the information of all frequency bands of the signal.
Sensors 2022, 22, x FOR PEER REVIEW

It should be noted that the extracted frequency bandwidth needs to be minimized to ensure fitting accuracy. On the one hand, the smaller the extracted frequency band's width ∆ω, the higher the fitting accuracy. On the other hand, a smaller ∆ω means more frequency bands are extracted, which directly leads to an increase in computational overhead. When using PhaseDNN for broadband signal fitting, it is often necessary to extract all frequency bands and fit them separately. Therefore, there is a balance between accuracy and computational overhead. For example, consider the task of performing neural network fitting on a signal with a bandwidth of 300 Hz. When ∆ω = 10, considering the existence of both real and imaginary parts in the spectrum and the conjugate symmetry, 30 × 2 groups of neural networks need to be trained, which is an acceptable computational overhead. However, for the task of neural network fitting on a signal with a bandwidth of 3000 Hz, 300 × 2 groups of neural networks need to be trained, which significantly increases the computational overhead.
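The band-count arithmetic above can be made explicit with a small helper (the function name and interface are ours, purely for illustration):

```python
import math

def num_networks(bandwidth_hz, delta_w):
    """Number of networks PhaseDNN must train for a given bandwidth.

    Real and imaginary spectra each need their own group of networks,
    hence the factor of 2; conjugate symmetry means only half the
    spectrum (one side) has to be covered by bands of width delta_w.
    """
    return 2 * math.ceil(bandwidth_hz / delta_w)

# the two examples from the text: 300 Hz and 3000 Hz, with delta_w = 10
small_task = num_networks(300, 10)    # 30 x 2 groups
large_task = num_networks(3000, 10)   # 300 x 2 groups
```

The linear growth in `large_task` relative to `small_task` is exactly the overhead the trade-off discussion refers to.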

The Proposed PFF-DNN Method
It is preferable to determine a method that can ensure fitting accuracy without requiring a large number of extracted frequency signals to be selected. Using the fast Fourier transform (FFT) of broadband signals [15], the digital spectrum of the signal can be obtained much more efficiently. It is conceivable that if we perform a piecewise fitting on a signal's digital spectrum, no computational overhead is required for frequency selection or frequency shifting. This avoids several redundant operations in the PhaseDNN method. In addition, when the frequency spectrum of the signal is not overcomplicated, the efficiency of fitting in the frequency domain improves significantly. The method proposed here is abbreviated as the PFF-DNN method, and the details of its construction are as follows.

The objective signal is denoted as a real-valued f(x) in the domain [x_0, x_{n−1}]. From a digital sampling system, one can obtain the discrete values of the signal f(x) at the sampling points {x_0, x_1, …, x_{n−1}}. Here, the sampling points are assumed to be uniformly distributed on the interval [x_0, x_{n−1}], and the discrete values of the signal f(x) at these points are denoted as f_0, f_1, …, f_{n−1}. One can calculate the frequency spectrum F(ω) via the discrete Fourier transform

F_k = F(ω_k) = ∑_{j=0}^{n−1} f_j e^{−2πi jk/n},

where k = 0, 1, …, n − 1, and the ω_k are evenly distributed; the adjacent interval between the ω_k depends on the sampling interval. F(ω) is conjugate symmetric when the signal is real-valued. Here, {F_k} is divided in order into m segments of length ∆ω, in which n = m∆ω. In the following, we define S_i as the ith segment of F(ω):

S_i = (F_{(i−1)∆ω}, F_{(i−1)∆ω+1}, …, F_{i∆ω−1}), i = 1, 2, …, m.

Considering the conjugate symmetry, only half of {F_k} needs fitting. Further considering that the sampling frequency is much larger than the bandwidth of the signal (ω_b = b∆ω), only the first b segments S_1, …, S_b, together with their conjugate mirror images given by F_{n−k} = F_k^* (where * stands for the conjugate operation), carry significant energy, and the remaining segments can be neglected. For each slice S_i, a neural network T_i is used to approximate the information it contains:

T_i(ω_k) ≈ F_k for ω_k ∈ S_i, i = 1, 2, …, b.

Different from PhaseDNN [14], in the proposed method each neural network T_i is used to "memorize" the discrete data in the frequency domain. After obtaining these trained neural networks and their predictions {T_i}, the following concatenation operation is used to obtain the approximation of F(ω):

F̃ = (T_1, T_2, …, T_b, 0, …, 0, T_b^*, …, T_1^*).

Finally, each sampled value f_j can be approximated using the inverse FFT (IFFT):

f_j ≈ (1/n) ∑_{k=0}^{n−1} F̃(ω_k) e^{2πi jk/n}.

Let us compare the recent PhaseDNN method and the proposed PFF-DNN method in terms of ease of operation. In PhaseDNN, 2m convolutions are required to acquire all frequency-selected signals; then, 2m frequency shifts and 2m inverse frequency shifts are also involved. In comparison, the proposed method is more straightforward: only one FFT and one IFFT are required, and no frequency selection, frequency shifts, or inverse shifts are needed.
In the PhaseDNN method, DNNs are used to fit continuous signals in the time domain. On the contrary, in the proposed method, DNNs are used to "memorize" discrete data points in the frequency domain. For many commonly used signals (including the signals recommended in [14] to demonstrate the effectiveness of the PhaseDNN method), detailed numerical comparisons are conducted to show that the neural network "memorizes" such discrete frequency values much faster.
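As a concrete illustration of the pipeline described above, the following NumPy sketch performs one FFT, fits each significant slice, restores the conjugate-symmetric bins, and applies one IFFT. The function names are ours, and a perfect "memorizer" stands in for a trained per-slice network:

```python
import numpy as np

def pff_dnn_fit(f_samples, delta_w, b, fit_segment):
    """Sketch of the PFF-DNN pipeline: one FFT, per-slice fitting, one IFFT.

    f_samples   : real-valued samples f_0 .. f_{n-1}, n a multiple of delta_w
    b           : number of low-frequency slices carrying significant energy
    fit_segment : stand-in for one small DNN that "memorizes" delta_w
                  discrete spectrum values (real or imaginary part)
    """
    n = len(f_samples)
    F = np.fft.fft(f_samples)               # digital spectrum, no band-pass filters
    F_approx = np.zeros(n, dtype=complex)
    k_local = np.arange(delta_w, dtype=float)
    for i in range(b):                      # fit only the b significant slices
        sl = slice(i * delta_w, (i + 1) * delta_w)
        # real and imaginary parts are fitted by separate "networks"
        F_approx[sl] = (fit_segment(k_local, F[sl].real)
                        + 1j * fit_segment(k_local, F[sl].imag))
    for k in range(1, b * delta_w):         # conjugate symmetry fills the mirror bins
        F_approx[n - k] = np.conj(F_approx[k])
    return np.fft.ifft(F_approx).real       # single IFFT recovers the signal

# toy band-limited signal; an identity "memorizer" models a fully trained network
t = np.arange(256) / 256
signal = np.cos(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 10 * t)
recovered = pff_dnn_fit(signal, delta_w=16, b=2, fit_segment=lambda k, y: y.copy())
```

With the identity memorizer and b∆ω covering the signal's bandwidth, the reconstruction is exact up to floating-point error; in the actual method, the quality of `fit_segment` is what training the per-slice networks determines.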

Experimental Setting
Six typical signal analyses were performed to evaluate the fitting performance of our PFF-DNN method, accompanied by a comparison with the existing PhaseDNN method. The signals to be analyzed included periodic and non-periodic analog signals in the form of explicit functions, as well as complex signals described by the classical Mackey-Glass differential dynamic system. For each signal, detailed comparative evaluations were conducted to describe the effectiveness and efficiency of the different methods for fitting broadband signals with high-frequency components. The detailed comparisons included (1) convergence curves (the change of the loss R with respect to the number of updates N of the neural network weights); (2) the convergence process of different frequency bands; (3) the influence of different ∆ω on the convergence speed; and (4) the influence of different numbers of updates N on the fitting accuracy.
Before showing the experimental results, we first introduce the neural network model and training-related parameters used for all subsequent experiments. The neural network structure used in this study was consistent with the network used in [14] (i.e., the 1-40-40-40-40-1 fully connected neural network). Each layer (except the output layer) used the ELU activation function [16]. The neural network training adopted the Adam optimizer [17] with a learning rate of 0.0002. The length of each training batch was 100. For the method in [14], 5000 training data and 5000 testing data were evenly distributed across the domain. For the method proposed in this paper, the number of training samples for each slice depended on the slice's length in the frequency domain. For both methods, we recorded the approximation error as the root mean square error (RMSE) when ∆ω of the slice in the frequency domain was 11, 21, 31, 41, or 51, and when the total number of updates ranged from 1 to 10,000. The ∆ω used in the literature [14] was 11; therefore, we gradually increased ∆ω from this value. As ∆ω increases to infinity, both methods degenerate into the vanilla fitting method, so we set the upper limit of ∆ω to 51. The numerical experiments were completed using a PC with an Intel Core i9-9900X CPU @ 3.50 GHz, 128 GB of RAM, and a Titan X GPU.
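The 1-40-40-40-40-1 architecture above is small enough to write out directly. The following NumPy forward pass is a minimal sketch of that network; the weight initialization scheme is our assumption, as the text does not specify one:

```python
import numpy as np

def elu(x, alpha=1.0):
    # ELU activation: identity for x > 0, alpha * (exp(x) - 1) otherwise
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def init_mlp(sizes, rng):
    # sizes = [1, 40, 40, 40, 40, 1] mirrors the network used in the experiments;
    # He-style scaling here is an assumption, not taken from the paper
    return [(rng.standard_normal((m, k)) * np.sqrt(2.0 / m), np.zeros(k))
            for m, k in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    h = x
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:   # ELU on every layer except the output
            h = elu(h)
    return h

rng = np.random.default_rng(0)
params = init_mlp([1, 40, 40, 40, 40, 1], rng)
n_params = sum(W.size + b.size for W, b in params)     # 5041 trainable weights
y = forward(params, np.linspace(-1, 1, 100).reshape(-1, 1))
```

The parameter count (5041) makes clear how lightweight each per-slice network is; training such a model with Adam at a learning rate of 0.0002, as described above, is inexpensive.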

Sine on Polynomial Signal
Here, the signal of sine on polynomial was used to test the neural network's approximation accuracy; it is similar to the function used in the literature [18]. It is described by Equation (10), and its shape and corresponding frequency spectrum are shown in Figure 4. The signal consists of a high-frequency periodic component and a low-frequency non-periodic component. In contrast to vanilla fitting, if the PhaseDNN or PFF-DNN method was used, the signal was fitted very well. As mentioned earlier, a larger ∆ω means fewer neural networks are used for fitting, which is desirable. For the current example, the bandwidth of the signal is 50 Hz. When ∆ω = 11, to approximate the signal precisely, one needs at least five networks for fitting. In comparison, when ∆ω = 51, one needs only two networks, which saves 80% of the computational resources. However, an increase in ∆ω makes the fitting harder. Figure 6 shows how the convergence process of the PhaseDNN and PFF-DNN methods changed with ∆ω. Both algorithms converged quickly when ∆ω was small. However, when ∆ω increased, the convergence speed of both methods slowed down. Note that, compared with the PhaseDNN method, the PFF-DNN algorithm was less sensitive to the increase in ∆ω. In detail, for the PhaseDNN method, when ∆ω = 11, the network needed 10,000 updates to reach convergence; when ∆ω = 41, it was difficult to converge even after 100,000 updates. In contrast, for the PFF-DNN method, when ∆ω = 41, it converged within only 2000 updates. That is to say, when ∆ω roughly tripled, the convergence time only doubled, which is desirable in practical applications.

Figure 7 shows the fitting results of different frequency bands. Since the frequency spectrum for the neural network to approximate is not overcomplicated, the performance of the PFF-DNN method was better than that of the PhaseDNN method. Figure 8 shows the influence of ∆ω on the algorithm performance. As shown in Figure 8c,f, when ∆ω = 11, both methods fit the signal well. However, when ∆ω increased to 31, the PhaseDNN method began to show fitting errors, and when ∆ω = 51, it was hardly able to fit the signal effectively. In comparison, the PFF-DNN method was still able to fit the signal accurately.

Figure 9 shows how the number of updates N influences the algorithm performance. In the beginning, both algorithms were unable to fit the high-frequency components very well. The comparison between Figure 9b,e shows that the PFF-DNN method converged faster than the PhaseDNN method. The red curves in Figure 6 also confirm this point.

ENSO Signal
The signal determined by the following equation is often used to approximate the ENSO data set [19]:

f(x) = 4.7 cos(2πx/12) + 1.1 sin(2πx/12) + 0.2 cos(2πx/1.7) + 2.7 sin(2πx/1.7) + 2.1 cos(2πx/0.7) + 2.1 sin(2πx/0.7) − 0.5. (11)

This signal is more complicated than the one described by Equation (10). The shape of the signal and its corresponding frequency spectrum are shown in Figure 10. According to the trend of the solid red line and the red dashed line in Figure 11, the convergence speed of the PFF-DNN method was about an order of magnitude larger. With the increase in ∆ω, the convergence time of the PFF-DNN method only doubled. Thus, the larger ∆ω, the more training time the PFF-DNN method saves as compared to the PhaseDNN method. Figures 12 and 13 visually show the convergence process of the two networks.
Figure 11. The influence of ∆ω on the convergence process.
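For reference, the ENSO test signal of Equation (11) and its digital spectrum can be reproduced directly; the sampling window of [0, 24) with 2048 points is our choice, not specified in the text:

```python
import numpy as np

def enso_signal(x):
    # Equation (11): three sinusoid pairs with periods 12, 1.7, and 0.7, plus an offset
    return (4.7 * np.cos(2 * np.pi * x / 12) + 1.1 * np.sin(2 * np.pi * x / 12)
            + 0.2 * np.cos(2 * np.pi * x / 1.7) + 2.7 * np.sin(2 * np.pi * x / 1.7)
            + 2.1 * np.cos(2 * np.pi * x / 0.7) + 2.1 * np.sin(2 * np.pi * x / 0.7)
            - 0.5)

# sample the signal and inspect the digital spectrum, as PFF-DNN would before slicing
x = np.linspace(0, 24, 2048, endpoint=False)
spectrum = np.fft.rfft(enso_signal(x))
```

Because the signal is a short sum of sinusoids, its spectrum has only a few significant components, which is the kind of smooth frequency-domain distribution the proposed method targets.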

Signal with Increasing Frequency
Next, a signal with fast frequency changes was analyzed. This signal is often used in system identification [20], and its shape and frequency spectrum are shown in Figure 14. The explicit expression of the signal used here is a frequency sweep from f_0 to f_T, in which T = 1, f_0 = 0.01, and f_T = 50. Figure 14a shows that as x increases, the oscillation frequency of the signal increases sharply. Figure 14b shows that the amplitude and the oscillation frequency of the signal's spectrum gradually decrease.

From the curves in Figure 15, it is shown that when ∆ω = 11, the efficiency and accuracy of the PFF-DNN method were better than those of the PhaseDNN method. When ∆ω increased, the convergence speed of both methods slowed down rapidly. Note that the convergence speed of the PFF-DNN network remained relatively faster. Figures 16 and 17 visually compare the convergence process of the two methods.
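Since the explicit expression (Equation (12)) was lost in extraction, the sketch below uses a standard linear chirp sweeping from f_0 to f_T over [0, T] as a plausible stand-in; the exact sweep law used in the paper may differ:

```python
import numpy as np

def linear_chirp(x, T=1.0, f0=0.01, fT=50.0):
    """Linear chirp: instantaneous frequency rises from f0 at x = 0 to fT at x = T.

    The phase is the integral of the instantaneous frequency
    f(x) = f0 + (fT - f0) * x / T, giving the quadratic term below.
    """
    phase = 2 * np.pi * (f0 * x + (fT - f0) * x**2 / (2 * T))
    return np.sin(phase)

# sample the sweep over one period of the parameter range
x = np.linspace(0, 1, 4096, endpoint=False)
s = linear_chirp(x)
```

The increasingly rapid oscillation toward x = 1 mirrors the behavior described for Figure 14a.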

Piecewise Signal
The piecewise signal was used in [14] to test the neural networks' performance. The shape and its corresponding frequency spectrum are shown in Figure 18. The signal is described by:

The spectrum of this signal has several obvious spikes and many slight oscillations. Figure 19 shows that for such a signal, the accuracy of the proposed method was much less sensitive to changes in ∆ω. Figures 20 and 21 visually compare the convergence process of the two networks.

Square Wave Signal
The following example was used to test the proposed method's approximation accuracy on discontinuous signals such as square waves [14]. The shape and its corresponding frequency spectrum are shown in Figure 22. The signal is described by:

The spectrum of this signal is composed of many irregular spikes. From Figure 23, it is shown that the increase in ∆ω had a limited impact on the PFF-DNN method but a significant impact on the PhaseDNN method. Figures 24 and 25 visually compare the convergence process of the two networks.
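The exact square wave expression from [14] is not reproduced here, so the following sketch uses a generic square wave to illustrate the spiky, odd-harmonic spectrum the text describes:

```python
import numpy as np

# a generic square wave stands in for the (omitted) expression from [14]
t = np.arange(1024) / 1024
square = np.sign(np.sin(2 * np.pi * 5 * t))          # 5 cycles per window
spectrum = np.abs(np.fft.rfft(square)) / len(square)  # normalized magnitude

# for an ideal square wave, energy sits in spikes at the odd harmonics
# (bins 5, 15, 25, ...), while even-harmonic bins are essentially empty
fundamental = spectrum[5]
```

Such a spiky spectrum is harder for a per-slice network to "memorize" than a smooth one, which is consistent with the discontinuous signal being a stress test for both methods.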

Dynamic System
The Mackey-Glass equation is a time-delay differential equation first proposed to model white blood cell production [21]. This equation has been used to test signal approximations in many related works [22][23][24]. The test signal is the solution of the following delay differential equation:
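For readers wishing to reproduce such a test signal, a minimal explicit-Euler sketch of the Mackey-Glass equation in its commonly cited form, dx/dt = βx(t−τ)/(1 + x(t−τ)^n) − γx(t), is given below. The parameter values (β = 0.2, γ = 0.1, n = 10, τ = 17) are the standard chaotic ones and are assumptions here, not necessarily those used in the paper.

```python
import numpy as np

def mackey_glass(t_max, dt=0.1, beta=0.2, gamma=0.1, n=10, tau=17.0, x0=1.2):
    """Integrate dx/dt = beta*x(t-tau)/(1 + x(t-tau)**n) - gamma*x(t)
    with an explicit Euler scheme and constant history x(t) = x0 for t <= 0."""
    steps = round(t_max / dt)
    delay = round(tau / dt)
    x = np.empty(steps + 1)
    x[0] = x0
    for i in range(steps):
        x_lag = x[i - delay] if i >= delay else x0
        x[i + 1] = x[i] + dt * (beta * x_lag / (1.0 + x_lag**n) - gamma * x[i])
    return x

x = mackey_glass(t_max=500.0)
print(x.min(), x.max())   # trajectory stays bounded and oscillates irregularly
```

The irregular, broadband character of this trajectory's spectrum (Figure 26) is what makes the example hard for both methods.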
The shape of the signal and its corresponding frequency spectrum are shown in Figure 26. As shown, this signal's frequency spectrum is more complicated: it is distributed irregularly within the interval of [0, 200]. Thus, this problem is relatively hard for both the PhaseDNN and PFF-DNN methods. Figure 27 shows that when Δω = 21 and Δω = 41, the training advantage of the PFF-DNN over the PhaseDNN method was relatively obvious. However, when Δω = 11 and Δω = 51, the advantage was less obvious. Figures 28 and 29 show that in the middle of training, the convergence speed of the PFF-DNN method was also slightly better. From this example, it is observed that when the signal's frequency spectrum is relatively complicated, the performance gain of the PFF-DNN method was not significant.

Further Discussions
From the above experiments fitting a large number of typical signals of different categories, it was observed that the proposed method had two advantages when the frequency-domain distribution of the signal is smoother: (1) it achieved obvious training-efficiency gains without losing accuracy; and (2) it was less sensitive to the choice of bandwidth. On the surface, this is due to the smoothness of the signal's frequency-domain distribution, but the deeper reason lies in the difference in complexity of the same segment of the signal in different domains.
In the following, we explain, from the perspective of the spectral bias of neural networks, why the PFF-DNN method performed better in the above examples. To this end, Figure 30 shows how the PhaseDNN and PFF-DNN methods approximated the third frequency band when Δω = 11 during the training process of the example in Section 3.2.3. Figure 30a,b shows that the PhaseDNN method essentially handled a continuous signal in the time domain, while the PFF-DNN method essentially interpolated a discrete signal in the frequency domain. Figure 30d,f shows the approximated results from the PFF-DNN method as a continuous function, which means that the PFF-DNN method interpolated the discrete spectrum with a smooth curve. Compared with Figure 30c,e, it is observed that the curve approximated by the PhaseDNN method oscillated more frequently in the time domain. Thus, considering the spectral bias, it was more difficult for the neural network to learn the signal in the time domain. This discussion may explain why the PFF-DNN method performed better than the PhaseDNN method for these examples. Therefore, we believe that the PFF-DNN and PhaseDNN methods may each have advantages for different applications.
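The complexity argument above can be made concrete with a small numerical check (illustrative, not taken from the paper): compare the total variation of a band-limited signal in the time domain against that of its discrete spectrum magnitude. A rapidly oscillating band is a "hard" target under spectral bias, while its spectrum is a smooth, slowly varying bump. The modulated-Gaussian test signal below is an assumed stand-in for one frequency band.

```python
import numpy as np

def total_variation(y):
    """Sum of absolute differences between consecutive samples: a rough
    proxy for how much a curve oscillates, and hence how hard it is for a
    neural network to fit under spectral bias."""
    return np.abs(np.diff(y)).sum()

# A narrow high-frequency band (illustrative stand-in for one extracted band):
# it oscillates rapidly in time, but its spectrum magnitude is a smooth bump.
dt = 1e-3
t = np.arange(0, 1, dt)
band = np.sin(2 * np.pi * 80 * t) * np.exp(-((t - 0.5) ** 2) / 0.02)

time_tv = total_variation(band / np.abs(band).max())
spec = np.abs(np.fft.rfft(band))
spec_tv = total_variation(spec / spec.max())

print(time_tv > 10 * spec_tv)   # prints True: the time-domain curve oscillates far more
```

Under this proxy, the same band is dramatically "simpler" when viewed as a discrete spectrum, which matches the interpolation behavior seen in Figure 30.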


Conclusions
Due to the existence of spectral bias, problems occur when a neural network is used for the approximation of broadband signals. That is, the high-frequency parts are hard to fit. For this reason, people often use frequency-selective filters to extract signals in different frequency bands and then use different neural networks to fit them. However, we found in a large number of numerical experiments that this method is inefficient for fitting signals with broad and smooth spectrums. For this reason, this paper proposed a novel alternative method based on parallel fitting in the frequency domain by utilizing frequency domain analysis of broadband signals and the spectral bias nature of neural networks. A substantial improvement in efficiency was observed in extensive numerical experiments. Thus, the PFF-DNN method is expected to become an alternative solution for broadband signal fitting.
In the future, we will further investigate how to determine whether the same signal is easier to fit by a neural network in the time domain or frequency domain and extend our findings to more complex representation spaces.
Funding: This research was funded by the National Natural Science Foundation of China, grant numbers 61805185, 11802225, and 51805397.