An Adaptive Cutoff Frequency Selection Approach for Fast Fourier Transform Method and Its Application into Short-Term Traffic Flow Forecasting

Abstract: Historical measurements are usually used to build assimilation models in sequential data assimilation (S-DA) systems. However, they are usually disturbed by local noises, which degrade the accuracy of both assimilation model construction and assimilation forecasting results. The fast Fourier transform (FFT) method can be used to acquire de-noised historical traffic flow measurements to reduce the influence of local noises on constructed assimilation models and improve the accuracy of assimilation results. In practical signal de-noising applications, the FFT method is commonly used to de-noise a noisy signal with known noise frequency. However, knowing the noise frequency is difficult. Thus, when the noise frequency is unknown, a proper cutoff frequency should be chosen to separate the high-frequency information caused by noises from the low-frequency part of useful signals. If the cutoff frequency is too high, too much noisy information will be treated as useful information. Conversely, if the cutoff frequency is too low, part of the useful information will be lost. To solve this problem, this paper proposes an adaptive cutoff frequency selection (A-CFS) method based on cross-validation. The proposed method can determine a proper cutoff frequency and ensure the quality of de-noised outputs for a given dataset using the FFT method without noise frequency information. Experimental results on real-world traffic flow measurements from a sub-area of a highway near Birmingham, England, demonstrate the superior performance of the proposed A-CFS method in noisy information separation using the FFT method. The differences between true and predicted traffic flow values are evaluated using the mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE) values.
Compared to the results of two commonly used de-noising methods, i.e., the discrete wavelet transform (DWT) and ensemble empirical mode decomposition (EEMD) methods, the short-term traffic flow forecasting results of the proposed A-CFS method are more reliable. In terms of the MAE value, the average relative improvements of the assimilation model built using the proposed method are 19.26%, 3.47%, and 4.25%, compared to the models built using raw data, the DWT method, and the EEMD method, respectively; the corresponding average relative improvements in RMSE are 19.05%, 5.36%, and 3.02%, respectively; lastly, the corresponding average relative improvements in MAPE are 18.88%, 2.83%, and 2.28%, respectively. The test results show that the proposed method is effective in separating noises from historical measurements and can improve the accuracy of assimilation model construction and assimilation forecasting results.


Introduction
Data assimilation (DA) represents an important method in spatial science. Physical dynamic models and measurements are two fundamental approaches to acquire natural phenomena and laws in spatial science [1][2][3][4]. However, dynamic models and measurements have their own advantages and disadvantages. For instance, simulation of a dynamic model can continuously represent the characteristics of the system state vectors in space and time, but it is difficult to describe all of the characteristics of real state vectors accurately [5][6][7][8]. Measurements can represent real values of observed objects at the time and space of observation. However, it is difficult to obtain continuous measurements in time and space. Furthermore, different measurement methods have different measurement errors, which affect the evolution representation of various processes in the spatial system [6,7]. DA can estimate the state vectors by integrating the strengths of physical-model information and measurements while considering the data distribution in time and space as well as measurements and background field errors [9]. The fundamental idea of DA is to combine dynamic models and measurements so that they can mutually interact to achieve a more accurate estimation and prediction of the state vectors in spatial science. DA plays a significant role in meteorology, oceanography, hydrology, and land surface systems [1][2][3]. Recently, a DA-related system based on Bayesian theory has been used for short-term traffic state predictions, and good results were achieved [10,11]. Short-term traffic flow forecasting is a common research topic and is extremely important in many intelligent transportation systems, especially in dynamic traffic management systems [12][13][14][15]. 
Unlike the long-term traffic flow forecasting methods, where prediction intervals of traffic flow are usually hours, days, months, quarters, or even years, short-term traffic flow forecasting refers to predicting the traffic flow using the information collected in a short time interval, for instance, 1 min, 5 min, 10 min, 15 min, or 30 min [16][17][18][19][20]. This forecasting is widely used in traffic control and guidance.
DA systems can estimate the short-term traffic flow by integrating physical-model information and measurements while considering data distribution in both time and space, as well as measurements and background field errors [21]. A DA system mainly consists of three parts: assimilation models (including state models and measurement models), measurements, and assimilation methods. The mathematical expression of DA is as follows [21]:

$$X_k = M_{k,k-1} X_{k-1} + G_{k,k-1} w_{k-1} \qquad (1)$$

$$y_k = H_k X_k + v_k \qquad (2)$$

The state vector $X_k$ at discrete-time index $k$ is defined by the dynamic state equation, i.e., Equation (1), where $M_{k,k-1}$ denotes the dynamic state transition model and $G_{k,k-1}$ denotes a coefficient matrix. In Equation (2), which is the observation equation, $H_k$ denotes the time-dependent observational operator that connects the state $X_k$ and the measurements $y_k$. The terms $w_k$ and $v_k$ denote Gaussian random noise series with $w_k \sim N(0, Q_k)$ and $v_k \sim N(0, R_k)$, which are completely independent of each other. As the connection between assimilation models and measurements, assimilation methods can be generally classified into sequential assimilation methods and continuous assimilation methods. A DA system based on sequential assimilation methods is named the sequential data assimilation (S-DA) system.
However, it is difficult and challenging to predict short-term traffic flow accurately using an S-DA system because the short-term traffic flow has stochastic nature and is always corrupted by local noises [22]. Short-term traffic flow values can better preserve the underlying patterns of traffic-flow variation tendency and present a more real transition of traffic conditions than long-term traffic flow values. Assuming that traffic flow patterns change over a period of a week, changing patterns in historical measurements are commonly similar at the same characteristic days and at the same time intervals in continuous weeks. The regularity in historical measurements can be used to construct assimilation models in an S-DA system, such as the vector autoregressive (VAR) model [23]. But due to human or instrument errors, as well as stochastic features of short-term traffic flow values, such as undesirable traffic accident values, random changes can occur. These local noises in historical measurements usually make it difficult to abstract underlying patterns of traffic flow data for model construction precisely. Some methods have been proposed for dealing with these noises. One of them, including the filtering series methods, such as the mean filtering algorithm [24], non-local means filter algorithm [25], and median filter algorithm [26], directly smooth the noises in the time domain. However, it is often difficult to determine the window size of filtering methods, which has a great impact on the de-noising precision. The filtering series methods in the time domain are mainly suitable for processing Gaussian random noises, which are rare in practice, and these methods are mostly used for image de-noising.
The other type of de-noising is processing the noise in the frequency domain. The discrete wavelet transform (DWT) method [27][28][29][30][31][32][33][34][35][36], ensemble empirical mode decomposition (EEMD) method [37][38][39][40], and fast Fourier transform (FFT) method [41][42][43] have been hot topics in processing traffic flow measurement noises in recent years. The DWT method combined with Daubechies 4 wavelet has been used to deal with traffic flow data, and an improvement in forecasting accuracy has been achieved [22]. However, different mother wavelets, threshold selection methods, and different decomposition levels achieve different de-noising effects [31]. The EEMD method is an extension algorithm for the empirical mode decomposition (EMD) method [44][45][46][47][48][49][50][51], which has no requirement for prior knowledge of transform basis functions and overcomes the mode mixing, false mode, and endpoint problems of the EMD method by taking advantage of the uniform frequency distribution of Gaussian white noise. It also has significant advantages in dealing with nonstationary and nonlinear data. However, the modal number should be determined first as it can affect the extraction performance of noise separation [48].
Compared to the DWT and EEMD methods, which have been widely used in de-noising researches [37][38][39][40]52,53], the FFT method has been less used. One of the main reasons is that it is difficult to select a proper cutoff frequency to distinguish a low-frequency part of useful signal and high-frequency information of noise. When the noise frequency is unknown, the quality of the de-noised outputs for a given data set mainly depends on the cutoff frequency. Thus, the FFT method de-noising application ranges can be extended using an appropriate cutoff frequency. In practical signal de-noising applications, the FFT method is commonly used to de-noise the noisy signal with known noise frequency or to separate the noisy information using fixed cutoff frequency [54][55][56]. However, knowing the noise frequency is difficult. There have been some studies on the determination of a proper cutoff frequency when noise frequency is unknown. One of the commonly used methods for selecting an appropriate cutoff frequency is harmonic analysis [57,58], but this method is based on the determination of how much data should be accepted as useful signals, and there are no strict criteria for deciding it; namely, the decision process can be tedious and time-consuming. Residual analysis can also be used to determine the cutoff frequency [57,59,60], but there is a premise assumption that the optimum cutoff frequency is significantly correlated to the sampling frequency. In addition, some adaptive methods have been used for determining optimal cutoff frequency in image processing [61,62] and other fields [63]. In the short-term traffic flow data prediction, an adaptive cutoff frequency selection approach for the FFT method is required to separate the noisy data from historical measurements to improve the accuracy of constructed assimilation models and assimilation forecasting results.
This paper proposes an adaptive cutoff frequency selection (A-CFS) method to de-noise historical measurements, which are further used to build assimilation models in an S-DA system. Considering the distribution characteristics of noise in the frequency domain, the A-CFS method can determine an appropriate cutoff frequency based on the cross-validation in the FFT method. Using an appropriate cutoff frequency ensures effective distinction and separation of the high-frequency noisy information from the low-frequency useful information. The wanted information can be obtained by subjecting the data without noises using FFT and its inverse method. The proposed A-CFS method can improve the accuracy of assimilation models built using noisy historical traffic flow measurements and further improve the accuracy of assimilation forecasting results with fast and simple characteristics. The method is verified by experiments of short-term traffic flow forecasting. The short-term traffic forecasting results of the proposed A-CFS method are compared with those of the DWT and EEMD methods to verify the effectiveness of the proposed method. The result shows that the proposed method performs better than the other two methods in terms of all evaluation metrics, demonstrating the effectiveness and good performance of the proposed method.
The remainder of the paper is organized as follows. Following the introduction, the theoretical background of this study is briefly described. Then the proposed A-CFS approach in the FFT method is presented. Next, application experiments using the proposed method are described, followed by the results analysis. Finally, conclusions are drawn.

Sequential Data Assimilation System for Short-Term Traffic Flow Forecasting
As previously mentioned, an S-DA system has three main parts: assimilation models, including state models and measurement models, measurements, and sequential assimilation methods, as shown in Figure 1.

In this study, the traffic flow values in the current time interval $[(k-1)T, kT]$ are treated as the measurements in the sequential data assimilation system. Recently, the VAR model has been widely used for short-term traffic flow prediction because it takes into account the traffic flow values of adjacent paths in the previous time interval. The assimilation models can be expressed as follows [23]:

The parameters in Equation (3) are defined as follows: $X_k$ is the state vector to be estimated during the assimilation process; $y_k$ represents the measurements; $q_c(k)$ and $q_{a_i}(k)$ denote the traffic flow values of the current path and its adjacent paths in the time interval $[(k-1)T, kT]$, respectively; $\bar{q}_c(k)$ and $\bar{q}_{a_i}(k)$ are their corresponding average values, which can be calculated from historical flow measurements corresponding to the same time interval in previous weeks; and $m$ represents the number of adjacent paths. The dynamic state model $M_{k,k-1}$ is set to the identity matrix $I$.

As a basic assimilation method in the S-DA system, the Kalman filter (KF) method is effective in both stationary and non-stationary conditions and is a well-known technique for tracking state values over time in a linear system [22]. Related studies show that the KF method performs well in short-term traffic flow forecasting [11,22]. It estimates state vectors from real-time measurements through its forecast and update procedure. Its efficient calculations and low storage requirements also make it well suited to short-term traffic flow forecasting [22]. Hence, the KF method is used as the assimilation method in the S-DA system in this study. As the connection between assimilation models and measurements, the KF method [22], including the forecast and update parts, is expressed as

Forecast:

$$X^f_k = M_{k,k-1} X^a_{k-1}, \qquad P^f_k = M_{k,k-1} P^a_{k-1} M_{k,k-1}^{T} + G_{k,k-1} Q_{k-1} G_{k,k-1}^{T}$$

Update:

$$K_k = P^f_k H_k^{T} \left( H_k P^f_k H_k^{T} + R_k \right)^{-1}$$

$$X^a_k = X^f_k + K_k \left( y_k - H_k X^f_k \right)$$

$$P^a_k = \left( I - K_k H_k \right) P^f_k$$

where $X^f_k$ and $X^a_k$ denote the predicted and estimated state vectors, respectively; $P^f_k$ denotes the error covariance matrix of the state vector prediction values, and $P^a_k$ is the error covariance matrix of the estimated state vector values. As stated above, $R_k$ denotes the error covariance matrix of the Gaussian random noise series of the observation equation, i.e., Equation (2). It plays an important role in the calculation of the Kalman gain matrix $K_k$ in the KF method, which is crucial for balancing the weight between the state estimates and new measurements. Noises in historical traffic flow measurements certainly affect the specification of the observational operator $H_k$ and, through the Kalman gain matrix, further reduce the accuracy of assimilation results. According to Equation (4), the observational operator $H_k$ is built using historical measurements. Therefore, de-noising the measurements used to build the observational operator $H_k$ before short-term traffic flow forecasting is imperative.
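To make the forecast and update procedure concrete, the following is a minimal NumPy sketch of one KF cycle in its standard textbook form. The function names, the toy matrices, and the single measurement value are illustrative assumptions and are not taken from the study; in particular, the toy observational operator `H` is hand-picked, not built from historical traffic data.

```python
import numpy as np

def kf_forecast(x_a, P_a, M, G, Q):
    """Forecast step: propagate the previous estimate and its error covariance."""
    x_f = M @ x_a
    P_f = M @ P_a @ M.T + G @ Q @ G.T
    return x_f, P_f

def kf_update(x_f, P_f, y, H, R):
    """Update step: weight the forecast against the new measurement y."""
    S = H @ P_f @ H.T + R                  # innovation covariance
    K = P_f @ H.T @ np.linalg.inv(S)       # Kalman gain
    x_a = x_f + K @ (y - H @ x_f)
    P_a = (np.eye(x_f.size) - K @ H) @ P_f
    return x_a, P_a, K

# Toy setting (all values hypothetical): two state components, one measurement.
M = np.eye(2)                  # identity state model, as assumed in the text
G = np.eye(2)
Q = 0.01 * np.eye(2)
H = np.array([[1.0, 0.5]])     # illustrative observational operator
R = np.array([[0.25]])

x_prev = np.zeros(2)
P_prev = np.eye(2)
y = np.array([1.2])

x_f, P_f = kf_forecast(x_prev, P_prev, M, G, Q)
x_a, P_a, K = kf_update(x_f, P_f, y, H, R)
```

A larger `R` shrinks the gain `K`, so the estimate leans more on the forecast; a noisier `H` distorts the innovation `y - H @ x_f`, which is the mechanism by which noise in historical measurements propagates into the forecasts.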

Fast Fourier Transform Method
Fourier analysis is a common tool in signal processing [65]. It can be used to obtain all the harmonic components of a signal conveniently and effectively using the spectrum functions. The Fourier transformation is a basic part of the Fourier analysis that can transform the signal between the time and frequency domains. After the Fourier transformation, the time-domain signal becomes a superposition of multiple sinusoidal signals. By analyzing the frequency of the sine wave, the signal can be changed from the time domain to the frequency domain. In the frequency domain, signal characteristics that are not evident in the time domain can be seen clearly. Hence, performing Fourier transforms on signals is crucial to analyze their nature.
In practical applications, computer processing generally requires the discretization of signal information in the time and frequency domains. The Fourier transformation of a discrete periodic signal meets this requirement. To ensure its finiteness, the discrete Fourier transform (DFT) method is performed only on a discrete periodic signal in the time and frequency domains, and it is expressed as follows [65]:

$$F(j) = \sum_{k=0}^{V-1} f(k)\, W_V^{jk}, \qquad j = 0, 1, \ldots, V-1$$

The corresponding inverse transformation is as follows:

$$f(k) = \frac{1}{V} \sum_{j=0}^{V-1} F(j)\, W_V^{-jk}, \qquad k = 0, 1, \ldots, V-1$$

where $W_V = e^{-i 2\pi / V}$, and $V$ denotes the length of the transform interval; $f(k)$ denotes the original signal, and $F(j)$ denotes its discrete Fourier transform; $k$ and $j$ index the time-domain and frequency-domain samples, respectively. Further details can be found in [66][67][68][69].
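As an illustration of the DFT pair, the sketch below evaluates the transform directly as a matrix product and compares it with NumPy's FFT. It assumes the common sign convention $W_V = e^{-i 2\pi / V}$, which is the one `numpy.fft.fft` uses; the function names and the random test signal are hypothetical.

```python
import numpy as np

def naive_dft(f):
    """Direct O(V^2) evaluation of F(j) = sum_k f(k) * W_V^(j*k)."""
    f = np.asarray(f, dtype=complex)
    V = f.size
    idx = np.arange(V)
    W = np.exp(-2j * np.pi * np.outer(idx, idx) / V)   # W[j, k] = W_V^(j*k)
    return W @ f

def naive_idft(F):
    """Direct evaluation of f(k) = (1/V) * sum_j F(j) * W_V^(-j*k)."""
    F = np.asarray(F, dtype=complex)
    V = F.size
    idx = np.arange(V)
    W_inv = np.exp(2j * np.pi * np.outer(idx, idx) / V)
    return (W_inv @ F) / V

rng = np.random.default_rng(0)
signal = rng.normal(size=96)            # e.g., one day of 15-min samples

F_naive = naive_dft(signal)
F_fft = np.fft.fft(signal)
max_diff = np.max(np.abs(F_naive - F_fft))
roundtrip = naive_idft(F_naive).real
```

The two transforms agree to floating-point precision; the matrix form makes the roughly $V^2$ multiplications explicit.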
However, the DFT method has certain disadvantages, such as complicated computations, low efficiency, and a large number of required calculations. The number of computations is approximately $V^2$ for multiplication and $V(V-1)$ for addition. If $V$ is large, the number of calculations will be very large. Hence, a commonly used version of the DFT in numerical calculations is the fast Fourier transform (FFT) method [41], which uses the periodicity and symmetry of $W_V^{jk}$ in the DFT method to improve operational efficiency. The FFT method is a simple, efficient method for computing the DFT. The relationships between the amount of computation and the number of calculation points for the DFT and FFT methods are presented in Figure 2. The FFT method is superior to the DFT method in terms of calculation efficiency, so the FFT method is selected for further computations.

In the frequency domain, the useful information in a given data set occupies the lower end of the frequency spectrum, and noisy information occupies the higher end. A cleaner series can be obtained by cutting off the high-frequency noisy information and then transforming the remaining spectrum from the frequency domain back into the time domain using the inverse Fourier transform. Hence, a proper cutoff frequency has to be determined beforehand: if the cutoff frequency is set too high, too much noisy information will be treated as useful information; conversely, if the cutoff frequency is set too low, useful information will be lost [70]. In this study, an A-CFS approach is proposed to determine a proper cutoff frequency so that noises can be separated effectively in the FFT method.
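The cut-and-invert idea described above can be sketched as a simple low-pass filter. This is a minimal illustration under stated assumptions, not the study's implementation: the function name `fft_lowpass`, the synthetic daily profile, the noise level, and the cutoff of $1.0 \times 10^{-4}$ Hz are all hypothetical.

```python
import numpy as np

def fft_lowpass(signal, sample_interval_s, cutoff_hz):
    """Zero every frequency component above cutoff_hz, then invert to the time domain.

    Returns the de-noised series and the separated high-frequency part.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=sample_interval_s)
    spectrum[freqs > cutoff_hz] = 0.0
    denoised = np.fft.irfft(spectrum, n=signal.size)
    return denoised, signal - denoised

# Synthetic day: 96 samples at 15-min (900 s) spacing, slow daily cycle plus noise.
t = np.arange(96) * 900.0
clean = 300.0 + 150.0 * np.sin(2.0 * np.pi * t / 86400.0)
rng = np.random.default_rng(1)
noisy = clean + rng.normal(scale=20.0, size=t.size)

denoised, separated_noise = fft_lowpass(noisy, 900.0, cutoff_hz=1.0e-4)
mse_before = np.mean((noisy - clean) ** 2)
mse_after = np.mean((denoised - clean) ** 2)
```

Because the synthetic useful signal sits entirely below the cutoff, `mse_after` is smaller than `mse_before`; with real data the useful band is unknown, which is why the cutoff must be selected from the data.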

Adaptive Cutoff Frequency Selection in Fast Fourier Transform Method
As mentioned in the previous section, effectively separating the high-frequency information to remove noise from the measurements using the FFT method is crucial and should be studied further. As stated before, different cutoff frequencies yield different de-noising accuracy. This can be illustrated by the following example. Consider the original traffic flow sequence presented in Figure 3. It can be converted into the frequency domain using the FFT method. Before the high-frequency part is separated in the frequency domain, the cutoff frequency must be defined. In this example, the following cutoff frequencies are used: Frequency1 = $2.8435 \times 10^{-5}$ Hz, Frequency2 = $8.5305 \times 10^{-5}$ Hz, Frequency3 = $1.4218 \times 10^{-4}$ Hz, and Frequency4 = $1.9905 \times 10^{-4}$ Hz. In the frequency domain, data with a frequency greater than the cutoff frequency are regarded as high-frequency information, i.e., as the noise that needs to be separated. After the high-frequency part is separated, the separated noises and the remaining processed data are obtained by inverting the signals back to the time domain. The noises separated using the four cutoff frequencies are presented in Figure 4a-d, and the corresponding processed data in Figure 4e-h.

As shown in Figure 4, the traffic flow becomes smoother as the cutoff frequency decreases. The original measurement contains two clear peaks, one at about 07:00 and another at around 15:00. The two peaks are not clear in Figure 4e, but they are evident in Figure 4f-h. This indicates that under the first noise-separation frequency, part of the effective information is treated as noise, and the remaining data are distorted. For this reason, in practical signal de-noising applications, the FFT method is usually used to de-noise a noisy signal with known noise frequency. However, it is difficult to know the noise frequency in advance. When the noise frequency is unknown, if the cutoff frequency is too high, too much noisy information will be treated as useful information; conversely, if the cutoff frequency is too low, part of the useful information will be lost, as presented in Figure 4e. Therefore, an adaptive method for choosing an appropriate cutoff frequency in the FFT method is necessary.
Considering the noise distribution characteristics in the frequency domain, the A-CFS method, which uses cross-validation to select an appropriate cutoff frequency in the FFT method, is proposed in this work. The proposed method can effectively determine a proper cutoff frequency and filter out the high-frequency noisy information, following the basic principle of sufficient decomposition and low differences in variation tendency between the original data and processed de-noised data. The useful data without noises can be obtained using the FFT and its inverse method with fast, accurate, and simple characteristics.
The framework of the proposed method is shown in Figure 5, and it includes the following steps:

(1) Collect traffic flow data T_F(n, m) from the same days (for instance, consecutive Mondays) during m consecutive weeks. The data length of each day is n. The maximum signal recognition frequency is mf. It can be calculated by mf ≤ 0.5 × sf based on the Nyquist sampling theorem, where sf is the known signal sampling frequency. As signals beyond the maximum signal recognition frequency mf are distorted, they will not be considered further.

[...] Figure 3, in the frequency domain after applying the FFT method. The value of low_f is set to 0.25 × mf in further calculations.

(5) Use the threshold value T to process the frequency-domain signal. The high-frequency noise whose frequency is higher than the threshold value T will be filtered out to obtain the de-noised frequency-domain signal.
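The sketch below illustrates one way a cross-validated cutoff search over the range from low_f to mf could look. Because the intermediate steps of the procedure are not recoverable from the text here, the leave-one-week-out scheme, the mean-squared-error score, and the names `lowpass` and `select_cutoff` are assumptions made for illustration only; this is not the authors' exact algorithm.

```python
import numpy as np

def lowpass(x, dt, cutoff):
    """Remove components above `cutoff` (Hz) via FFT and its inverse."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=dt)
    spec[freqs > cutoff] = 0.0
    return np.fft.irfft(spec, n=x.size)

def select_cutoff(T_F, dt, low_ratio=0.25):
    """Assumed scheme: leave-one-week-out search for a cutoff in [low_f, mf].

    T_F has shape (n, m): n samples per day, m weeks of the same weekday.
    """
    n, m = T_F.shape
    sf = 1.0 / dt                 # sampling frequency
    mf = 0.5 * sf                 # Nyquist limit
    low_f = low_ratio * mf
    freqs = np.fft.rfftfreq(n, d=dt)
    candidates = freqs[(freqs >= low_f) & (freqs <= mf)]

    best_cutoff, best_err = None, np.inf
    for cutoff in candidates:
        err = 0.0
        for w in range(m):
            train = np.delete(T_F, w, axis=1)          # hold out week w
            # Low-pass filtering is linear, so filtering the mean is equivalent
            # to averaging the filtered training weeks.
            profile = lowpass(train.mean(axis=1), dt, cutoff)
            err += np.mean((profile - T_F[:, w]) ** 2)
        err /= m
        if err < best_err:
            best_cutoff, best_err = cutoff, err
    return best_cutoff, best_err

# Synthetic stand-in for T_F(n, m): 96 samples per day, 7 weeks (values hypothetical).
rng = np.random.default_rng(42)
t = np.arange(96) * 900.0
base = 300.0 + 150.0 * np.sin(2.0 * np.pi * t / 86400.0)
T_F = np.stack([base + rng.normal(scale=20.0, size=96) for _ in range(7)], axis=1)

cutoff, cv_err = select_cutoff(T_F, dt=900.0)
```

The selected `cutoff` would then play the role of the threshold value T in step (5). With a different validation score, for example forecasting error from the assimilation model itself, the same search loop applies.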

Empirical Study Design
The traffic flow datasets used in the experiments were downloaded from the Highways England website (highwaysengland.co.uk). The traffic flow value refers to the number of traffic entities passing through a certain point, section, or lane of the road during a particular period of time. It is usually used to determine what types of traffic management measures should be taken. Thus, accurate forecasting of traffic flow plays a very important role in traffic engineering. The traffic flow data used in this study were from a sub-area of the highway near Birmingham, England (including a total of 514 paths), as shown in Figure 7a. Traffic flow data of each path were collected separately for each day from Monday to Sunday. As the mean traffic flow values were necessary for the assimilation model construction, as given in Equation (4), the data of each path contained eight days from consecutive weeks. The first seven days were used for the assimilation model construction in the S-DA system for short-term traffic flow prediction, and the data of the eighth day were regarded as true values and used to test the effectiveness of the proposed method. The time interval for data collection of each path was 15 min, which was also the assimilation frequency. Thus, the total number of observations used in the experiments was 2,763,264. It was assumed that the observations were not correlated. Traffic flow prediction results of each path from Monday to Sunday were acquired and analyzed separately. Furthermore, as the traffic flow in early mornings and late nights was small and of little concern to traffic management, only the prediction results from 6:00 to 21:00 were used.

First, a verification test was conducted to test the applicability of the proposed A-CFS method in the FFT method. To show more detail, the short-term traffic flow forecasting results of path 568 (LM932), shown in Figure 7b, were taken as the research object in the test. Assimilation models H were built using the de-noised historical measurements obtained by the FFT method. The cutoff frequencies ranged from low_fs to fs. The prediction results were obtained from the assimilation models. Three measures, namely the mean absolute error (MAE) [71,72], root mean square error (RMSE) [22,71], and mean absolute percentage error (MAPE) [71], were used to evaluate and compare the forecasting accuracy. The effectiveness of the proposed A-CFS method was evaluated based on the MAE, RMSE, and MAPE values, and a proper cutoff frequency was considered to be the one that corresponded to the smallest values of the three measures. For a real observation $X_i$ and the corresponding forecasted value $\hat{X}_i$, the MAE, RMSE, and MAPE values were calculated as follows:

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| X_i - \hat{X}_i \right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( X_i - \hat{X}_i \right)^2}$$

$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{X_i - \hat{X}_i}{X_i} \right| \times 100\%$$

where $N$ is the number of forecasted values. The smaller the values of MAE, RMSE, and MAPE, the better the forecasting results.
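For reference, the three accuracy measures can be computed as in the following sketch, which uses their standard definitions. The function names and the three-point example values are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def mae(true, pred):
    """Mean absolute error."""
    true, pred = np.asarray(true, float), np.asarray(pred, float)
    return np.mean(np.abs(true - pred))

def rmse(true, pred):
    """Root mean square error."""
    true, pred = np.asarray(true, float), np.asarray(pred, float)
    return np.sqrt(np.mean((true - pred) ** 2))

def mape(true, pred):
    """Mean absolute percentage error, in percent (true values must be nonzero)."""
    true, pred = np.asarray(true, float), np.asarray(pred, float)
    return np.mean(np.abs((true - pred) / true)) * 100.0

# Hypothetical observed and forecasted flows (vehicles per 15 min).
observed = np.array([100.0, 200.0, 400.0])
forecast = np.array([110.0, 190.0, 380.0])

mae_value = mae(observed, forecast)     # (10 + 10 + 20) / 3
rmse_value = rmse(observed, forecast)   # sqrt((100 + 100 + 400) / 3)
mape_value = mape(observed, forecast)   # (10% + 5% + 5%) / 3
```

MAPE is undefined when an observed value is zero, which is one practical reason to restrict evaluation to daytime intervals with non-negligible flow.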
Second, to verify the effectiveness of the proposed A-CFS approach in the FFT method, four different datasets were used to build the assimilation models H according to Equation (4). These datasets were the raw traffic flow data and the processed data obtained by the proposed A-CFS method in the FFT method, the DWT method, and the EEMD method, respectively. The processed data obtained by the proposed A-CFS method were defined as F data.
In the DWT method, the signal can be decomposed into several levels. More details on wavelet decomposition can be found in [28]. To demonstrate the different noise separation effects at different decomposition levels, the traffic flow measurements of path 568 (LM932) collected on 10 February 2014, which are shown in Figure 3, were examined. Daubechies 4 was used as the mother wavelet since it has been commonly used [22]. The soft threshold function was used to obtain the de-noised signal by reconstructing the wavelet coefficients after threshold processing. The approximated data and noise result of the i-level decomposition are denoted as $A_i$ and $D_i$, respectively. The noises separated from the traffic flow data, and a comparison of the de-noised approximated data and raw data, are presented in Figure 8.

As shown in Figure 8, the noise and processed approximated data became increasingly smooth as the decomposition level increased. Compared to the processed approximated measurements shown in Figure 8g,h, the processed signals in Figure 8e,f retained more detail of the original data. Among the separated noises shown in Figure 8a-c, the noise in Figure 8a was the strongest, representing the highest degree of noise among all the separated noises. The noises in Figure 8d were too smooth and contained some useful measurement information. Hence, it is necessary to consider which separation scale should be chosen to decompose the original signal to achieve the optimal de-noising performance and improve the accuracy of the assimilation models and results. Based on many conducted experiments, the data obtained from the two-level decomposition using the DWT method, denoted as $A_2$, were used in the subsequent study, as noises were not excessively separated, and the forecasting accuracy was better than at other decomposition levels.
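The soft threshold function mentioned above shrinks each wavelet coefficient toward zero by the threshold and sets small coefficients to zero. A minimal NumPy sketch of the standard definition follows; the function name, the coefficient values, and the threshold of 1.0 are hypothetical, and the wavelet decomposition itself is not shown.

```python
import numpy as np

def soft_threshold(coeffs, thr):
    """Standard soft thresholding: sign(c) * max(|c| - thr, 0)."""
    coeffs = np.asarray(coeffs, dtype=float)
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - thr, 0.0)

# Hypothetical detail coefficients from one decomposition level.
detail = np.array([-3.0, -0.5, 0.2, 1.5])
shrunk = soft_threshold(detail, thr=1.0)
```

Coefficients with magnitude below the threshold vanish, while larger ones are reduced by exactly the threshold, which avoids the discontinuity of hard thresholding.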
The basic principle of the EEMD method is to decompose complex signals into a finite number of intrinsic mode functions (IMFs) and residual components. The core idea of the EEMD is to use the advantage of the white noise statistical characteristics. Namely, by adding the white noise to a useful signal, the characteristics of the signal endpoint will change, which helps to make the original signal remain continual at different scales. Besides, it can promote anti-aliasing decomposition [40]. The decomposed IMF components contain local characteristics of the original signals at different time scales. Each IMF can be processed using the Hilbert transformation method. The instantaneous frequency and amplitude of an IMF can be obtained. Thus, complete time-frequency distribution information of a complex signal can be obtained [44]. The advantage of the EEMD method is that it is suitable for nonlinear and nonstationary signals and can be performed based on the characteristics of the raw signals, so it represents an efficient adaptive time-frequency processing method. More details on the EEMD method can be found in [37][38][39][40].
The IMF components and residual of the traffic flow measurements of path 568 (LM932) collected on 10 February 2014, obtained by the EEMD method, are presented in Figure 9. The data reconstructed using different numbers of IMFs are presented in Figure 10. As shown in Figures 9 and 10, the data reconstructed using the IMFs from IMF2 to IMF5 and the residual could reflect the trend of the original traffic flow data. However, as the amount of separated data increased, the rebuilt data became increasingly distorted. As shown in Figure 10c, the reconstructed traffic flow data could even be negative. Therefore, it is important to select an appropriate number of IMFs for de-noising the data. The commonly used methods for this purpose are the correlational analysis method, adjacent signal standard deviation method, and continuous mean square deviation method [73]. In this study, a correlation coefficient method based on energy density and average period [73] was used. Data processed by the EEMD method are defined as E data.

Based on the above, the information on the four datasets is listed in Table 1. Each of the four datasets consisted of 56 days of traffic flow data, covering eight consecutive weeks from Monday to Sunday. The raw 24-h traffic flow data were aggregated into 15-min intervals. The four H models built using the four datasets are also listed in Table 1. It should be noted that Model 1 was built using raw data only.
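As a quick consistency check, the dataset dimensions stated in the text (514 paths, 56 days, 15-min aggregation of 24-h data) reproduce the reported total of 2,763,264 observations. The variable names below are illustrative.

```python
# Dataset dimensions as stated in the text.
paths = 514
weeks = 8
days_per_week = 7
intervals_per_day = 24 * 60 // 15      # 96 fifteen-minute intervals per day

days = weeks * days_per_week           # 56 days per path
total_observations = paths * days * intervals_per_day
```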

Results Analysis
To facilitate a detailed comparison, the short-term traffic flow prediction of path 568 (LM932), shown in Figure 7b, was used first to test the effectiveness of the proposed A-CFS method in the FFT method. The cutoff frequency obtained by the proposed A-CFS method in the FFT method is presented in Table 2. The MAE, RMSE, and MAPE values of the short-term traffic flow forecasting results for the cutoff frequencies from low_fs to fs on one workday (Monday) and one non-workday (Saturday) are presented in Figure 11a,b, respectively. The patterns were different on workdays and non-workdays, which is why the results for both days are presented. As shown in Table 2 and Figure 11, the cutoff frequencies obtained by the proposed A-CFS method corresponded to the smallest MAE, RMSE, and MAPE values. This result indicates, to a certain degree, the applicability of the proposed A-CFS method in the FFT method.

Then, the short-term traffic flow predictions of paths in areas I-IV (path 568 (LM932), path 2091 (AL2670), path 8655 (LM168), and path 8314 (LM188)) were used to illustrate the different impacts of assimilation models on the assimilation prediction results. The four datasets were used to build four assimilation measurement models, and these models were then applied to the short-term traffic flow prediction. Without loss of generality, the predicted results obtained on one workday (Thursday) and one non-workday (Sunday) were analyzed.
The prediction performances of Model 1 (built using the raw historical traffic flow data) and Model 2 (built using the de-noised historical traffic flow data obtained by the proposed A-CFS method) for the mentioned paths on Thursday and Sunday are presented in Figures 12 and 13, respectively. For comparative analysis of the experimental results, the true traffic flow values are also presented in Figures 12 and 13. The traffic flow data of each path contained the same day of the week from eight consecutive weeks. Data of the first seven days were used for assimilation model construction in the S-DA system for the short-term traffic flow prediction. The data of the eighth day were regarded as true values and used to test the effectiveness of the proposed method. As shown in Figures 12 and 13, the values obtained by the two models followed a consistent trend; also, the prediction results obtained by Model 2 were much closer to the true values on both Thursday and Sunday than those obtained by Model 1. Moreover, during the peak hours on a workday, the accuracy of the traffic flow forecasting results obtained by Model 2 was higher than that obtained by Model 1. This result further indicates that the proposed A-CFS method can separate noise from the data and improve the precision of assimilation models and forecasting results.
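The data split described above (the same weekday from eight consecutive weeks, with seven days for model construction and the eighth as ground truth) can be sketched as follows. The function name `split_by_weekday`, the flat Monday-first layout of the input series, and the placeholder data are assumptions for illustration; the paper does not specify how the data are stored.

```python
def split_by_weekday(series, intervals_per_day=96, n_weeks=8):
    """Group a Monday-first series by weekday and split each group 7 + 1.

    Returns {weekday: (train_days, test_day)} with weekday 0 = Monday.
    96 intervals per day corresponds to 15-min aggregation of 24 h.
    """
    expected = intervals_per_day * 7 * n_weeks
    if len(series) != expected:
        raise ValueError(f"expected {expected} samples, got {len(series)}")

    # Cut the flat series into one list per calendar day.
    days = [
        series[i * intervals_per_day:(i + 1) * intervals_per_day]
        for i in range(7 * n_weeks)
    ]

    splits = {}
    for weekday in range(7):
        same_weekday = days[weekday::7]  # one day from each of the weeks
        splits[weekday] = (same_weekday[:-1], same_weekday[-1])
    return splits


# Placeholder data: sample indices stand in for traffic flow values.
series = list(range(96 * 56))
train_days, test_day = split_by_weekday(series)[3]  # 3 = Thursday
print(len(train_days), len(test_day))
```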
For a clearer comparison and analysis, the performance measures of Model 1 and Model 2 are listed in Table 3. The three measure values of Models 3 and 4 are also presented in Table 3 to further demonstrate the effectiveness of the proposed method. Consider first the results on the workday (Thursday), as presented in Table 3.
To test the de-noising effect of the proposed A-CFS method further, the average values of the three performance measures of Models 1 and 2 for the paths in areas I-IV, shown in Figure 7b-e, from Monday to Sunday are presented in Figure 14. As shown in Figure 14, the mean values of MAE, RMSE, and MAPE from Monday to Sunday obtained by Model 2 were smaller than those of Model 1. Moreover, for a better comparison, the mean MAE, RMSE, and MAPE values of Models 3 and 4 are also presented in Table 4. Table 4 and the distributions in Figure 14 indicate that the assimilation model built using the de-noised data obtained by the proposed method outperforms all the other models, which verifies the effectiveness of the proposed A-CFS method.
The average MAE, RMSE, and MAPE values of the four models over all the paths shown in Figure 7a are presented in Table 5. The relative improvements, in percent, of the mean MAE, RMSE, and MAPE values of Model 2 over the three other models are given in Table 6. Results in Table 5 show that, compared to the model built using raw data, the models built using the de-noised data obtained by the proposed method were more precise. Also, among all the models built using de-noised data, the smallest mean MAE and RMSE values from Monday to Sunday were obtained by Model 2. The average MAE, RMSE, and MAPE values of Model 2 were 29.64, 39.97, and 8.75, respectively. The values in Table 6 indicate that the proposed A-CFS method performed well in data de-noising. The average relative improvements of Model 2 over Models 3 and 4, which were built using the de-noised data obtained by the DWT and EEMD methods, were 3.47% and 4.25% in MAE, 5.36% and 3.02% in RMSE, and 2.83% and 2.28% in MAPE, respectively. The results also show that the assimilation model built using the de-noised data obtained by the proposed method performed the best.
Based on the results in Figures 11-14 and Tables 2-6, the proposed A-CFS method is effective in data de-noising and can, to a certain extent, avoid the excessive de-noising problem of the DWT and EEMD methods. Thus, the proposed method can be used to improve the accuracy of assimilation models and the short-term traffic flow prediction results.
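For readers who wish to reproduce comparisons of this kind, the three error measures and a relative improvement can be computed as in the sketch below. The formulas are the standard definitions; the exact form used for Table 6 is not given in this section, so `relative_improvement` (baseline error minus proposed error, divided by baseline error) is an assumption, and the numbers in the example are made up rather than taken from the tables.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (true values must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_error, proposed_error):
    """Percentage by which the proposed error is lower than the baseline error."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error


# Illustrative values only (not from the paper).
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 190.0, 380.0]

print(mae(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
print(relative_improvement(40.0, 30.0))
```

Because MAPE divides by the true value, intervals with zero observed flow would need to be excluded or handled separately before this measure is applied.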

Conclusions
This paper proposes an adaptive cutoff frequency selection (A-CFS) method for the FFT method to de-noise the historical measurements used to build the assimilation models in an S-DA system when the noise frequency is unknown. The proposed method can effectively determine an appropriate cutoff frequency, distinguish the low-frequency part of useful signals from the high-frequency information caused by noise, and ensure the quality of the de-noised outputs for a given dataset using the FFT method. This in turn reduces the influence of local noise on the constructed assimilation models and improves the accuracy of assimilation results. Compared to the results of the DWT and EEMD methods, the short-term traffic flow forecasting results of the FFT method with the proposed A-CFS method are much more reliable. The proposed A-CFS method for the FFT method also has an advantage over the DWT and EEMD methods in that the excessive de-noising problem is avoided. In terms of the MAE values, the average relative improvements of the assimilation model built using the proposed method are 19.26%, 3.47%, and 4.25%, compared to the models built using raw data, the DWT method, and the EEMD method, respectively; in terms of RMSE, the corresponding average relative improvements are 19.05%, 5.36%, and 3.02%, respectively; lastly, in terms of MAPE, the corresponding average relative improvements are 18.88%, 2.83%, and 2.28%, respectively. In future work, the proposed method will be applied to and tested in other fields to expand its range of applications.