An Adaptive Cutoff Frequency Selection Approach for Fast Fourier Transform Method and Its Application into Short-Term Traffic Flow Forecasting

Wang, Runjie; Shi, Wenzhong; Liu, Xianglei; Li, Zhiyuan

doi:10.3390/ijgi9120731

¹ College of Surveying and Geo-Informatics, Tongji University, Shanghai 200092, China

² Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong 999077, China

³ School of Geomatics and Urban Spatial Informatics, Beijing University of Civil Engineering and Architecture, Beijing 102612, China

⁴ Key Laboratory of Mineral Resources, Institute of Geology and Geophysics, Chinese Academy of Sciences, Beijing 100029, China

* Author to whom correspondence should be addressed.

ISPRS Int. J. Geo-Inf. 2020, 9(12), 731; https://doi.org/10.3390/ijgi9120731

Submission received: 5 November 2020 / Revised: 20 November 2020 / Accepted: 4 December 2020 / Published: 7 December 2020

Abstract:

Historical measurements are usually used to build assimilation models in sequential data assimilation (S-DA) systems. However, they are almost always disturbed by local noises, which degrade both the constructed assimilation models and the assimilation forecasting results. The fast Fourier transform (FFT) method can be used to de-noise historical traffic flow measurements, reducing the influence of local noises on the constructed assimilation models and improving the accuracy of assimilation results. In practical signal de-noising applications, the FFT method is commonly used to de-noise a noisy signal whose noise frequency is known. However, the noise frequency is difficult to know in advance. Thus, when the noise frequency is unknown, a proper cutoff frequency must be chosen to separate the high-frequency information caused by noises from the low-frequency part of the useful signal. If the cutoff frequency is too high, too much noisy information will be treated as useful information. Conversely, if the cutoff frequency is too low, part of the useful information will be lost. To solve this problem, this paper proposes an adaptive cutoff frequency selection (A-CFS) method based on cross-validation. The proposed method can determine a proper cutoff frequency and ensure the quality of the de-noised outputs for a given dataset using the FFT method without noise frequency information. Experimental results on real-world traffic flow measurements from a sub-area of a highway near Birmingham, England, demonstrate the superior performance of the proposed A-CFS method in separating noisy information using the FFT method. The differences between true and predicted traffic flow values are evaluated using the mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE) values. 
Compared to the results of the two commonly used de-noising methods, i.e., discrete wavelet transform (DWT) and ensemble empirical mode decomposition (EEMD) methods, the short-term traffic flow forecasting results of the proposed A-CFS method are much more reliable. In terms of the MAE value, the average relative improvements of the assimilation model built using the proposed method are 19.26%, 3.47%, and 4.25%, compared to the model built using raw data, DWT method, and EEMD method, respectively; the corresponding average relative improvements in RMSE are 19.05%, 5.36%, and 3.02%, respectively; lastly, the corresponding average relative improvements in MAPE are 18.88%, 2.83%, and 2.28%, respectively. The test results show that the proposed method is effective in separating noises from historical measurements and can improve the accuracy of assimilation model construction and assimilation forecasting results.

Keywords:

sequential data assimilation system; noises separation; fast Fourier transform method; cutoff frequency

1. Introduction

Data assimilation (DA) represents an important method in spatial science. Physical dynamic models and measurements are two fundamental approaches to acquire natural phenomena and laws in spatial science [1,2,3,4]. However, dynamic models and measurements have their own advantages and disadvantages. For instance, simulation of a dynamic model can continuously represent the characteristics of the system state vectors in space and time, but it is difficult to describe all of the characteristics of real state vectors accurately [5,6,7,8]. Measurements can represent real values of observed objects at the time and space of observation. However, it is difficult to obtain continuous measurements in time and space. Furthermore, different measurement methods have different measurement errors, which affect the evolution representation of various processes in the spatial system [6,7]. DA can estimate the state vectors by integrating the strengths of physical-model information and measurements while considering the data distribution in time and space as well as measurements and background field errors [9]. The fundamental idea of DA is to combine dynamic models and measurements so that they can mutually interact to achieve a more accurate estimation and prediction of the state vectors in spatial science. DA plays a significant role in meteorology, oceanography, hydrology, and land surface systems [1,2,3]. Recently, a DA-related system based on Bayesian theory has been used for short-term traffic state predictions, and good results were achieved [10,11]. Short-term traffic flow forecasting is a common research topic and is extremely important in many intelligent transportation systems, especially in dynamic traffic management systems [12,13,14,15]. 
Unlike the long-term traffic flow forecasting methods, where prediction intervals of traffic flow are usually hours, days, months, quarters, or even years, short-term traffic flow forecasting refers to predicting the traffic flow using the information collected in a short time interval, for instance, 1 min, 5 min, 10 min, 15 min, or 30 min [16,17,18,19,20]. This forecasting is widely used in traffic control and guidance.

DA systems can estimate the short-term traffic flow by integrating physical-model information and measurements while considering data distribution in both time and space, as well as measurements and background field errors [21]. A DA system mainly consists of three parts: assimilation models, including state models and measurement models, measurements, and assimilation methods. The mathematical expression of DA is as follows [21]:

$$X_{k} = M_{k, k - 1} X_{k - 1} + G_{k, k - 1} w_{k - 1}, \tag{1}$$

$$y_{k} = H_{k} X_{k} + v_{k}. \tag{2}$$

The state vector $X_{k}$ at discrete-time index $k$ is defined by the dynamic state equation, i.e., Equation (1), where $M_{k, k - 1}$ denotes the dynamic state transition model. In Equation (2), which is the observation equation, $H_{k}$ denotes the time-dependent observational operator that connects the state $X_{k}$ and the measurements $y_{k}$; $w_{k}$ and $v_{k}$ denote Gaussian random noise series with $w_{k} \sim N (0, Q_{k})$ and $v_{k} \sim N (0, R_{k})$, which are completely independent of each other. In Equation (1), $G_{k, k - 1}$ denotes a coefficient matrix. As the connection between assimilation models and measurements, assimilation methods can be generally classified into sequential assimilation methods and continuous assimilation methods. A DA system based on sequential assimilation methods is named the sequential data assimilation (S-DA) system.

However, it is difficult and challenging to predict short-term traffic flow accurately using an S-DA system because short-term traffic flow has a stochastic nature and is always corrupted by local noises [22]. Short-term traffic flow values can better preserve the underlying patterns of traffic-flow variation tendency and present a more realistic transition of traffic conditions than long-term traffic flow values. Assuming that traffic flow patterns repeat with a period of one week, the patterns in historical measurements are commonly similar on the same characteristic days and at the same time intervals in consecutive weeks. This regularity in historical measurements can be used to construct assimilation models in an S-DA system, such as the vector autoregressive (VAR) model [23]. However, random changes can occur due to human or instrument errors, as well as stochastic features of short-term traffic flow, such as abnormal values caused by traffic accidents. These local noises in historical measurements usually make it difficult to extract the underlying patterns of traffic flow data precisely for model construction. Some methods have been proposed for dealing with these noises. One type, the filtering series methods, such as the mean filtering algorithm [24], non-local means filter algorithm [25], and median filter algorithm [26], directly smooths the noises in the time domain. However, it is often difficult to determine the window size of filtering methods, which has a great impact on the de-noising precision. The filtering series methods in the time domain are mainly suitable for processing Gaussian random noises, which are rare in practice, and these methods are mostly used for image de-noising.

The other type of de-noising is processing the noise in the frequency domain. The discrete wavelet transform (DWT) method [27,28,29,30,31,32,33,34,35,36], ensemble empirical mode decomposition (EEMD) method [37,38,39,40], and fast Fourier transform (FFT) method [41,42,43] have been hot topics in processing traffic flow measurement noises in recent years. The DWT method combined with Daubechies 4 wavelet has been used to deal with traffic flow data, and an improvement in forecasting accuracy has been achieved [22]. However, different mother wavelets, threshold selection methods, and different decomposition levels achieve different de-noising effects [31]. The EEMD method is an extension algorithm for the empirical mode decomposition (EMD) method [44,45,46,47,48,49,50,51], which has no requirement for prior knowledge of transform basis functions and overcomes the mode mixing, false mode, and endpoint problems of the EMD method by taking advantage of the uniform frequency distribution of Gaussian white noise. It also has significant advantages in dealing with nonstationary and nonlinear data. However, the modal number should be determined first as it can affect the extraction performance of noise separation [48].

Compared to the DWT and EEMD methods, which have been widely used in de-noising research [37,38,39,40,52,53], the FFT method has been used less. One of the main reasons is that it is difficult to select a proper cutoff frequency to distinguish the low-frequency part of the useful signal from the high-frequency information of noise. When the noise frequency is unknown, the quality of the de-noised outputs for a given data set mainly depends on the cutoff frequency. Thus, the application range of FFT-based de-noising can be extended by using an appropriate cutoff frequency. In practical signal de-noising applications, the FFT method is commonly used to de-noise a noisy signal with known noise frequency or to separate the noisy information using a fixed cutoff frequency [54,55,56]. However, knowing the noise frequency is difficult. There have been some studies on the determination of a proper cutoff frequency when the noise frequency is unknown. One of the commonly used methods for selecting an appropriate cutoff frequency is harmonic analysis [57,58], but this method requires deciding how much data should be accepted as useful signal, and there are no strict criteria for this decision; as a result, the decision process can be tedious and time-consuming. Residual analysis can also be used to determine the cutoff frequency [57,59,60], but it relies on the assumption that the optimum cutoff frequency is significantly correlated with the sampling frequency. In addition, some adaptive methods have been used for determining the optimal cutoff frequency in image processing [61,62] and other fields [63]. In short-term traffic flow prediction, an adaptive cutoff frequency selection approach for the FFT method is required to separate the noisy data from historical measurements and thereby improve the accuracy of the constructed assimilation models and the assimilation forecasting results.

This paper proposes an adaptive cutoff frequency selection (A-CFS) method to de-noise historical measurements, which are further used to build assimilation models in an S-DA system. Considering the distribution characteristics of noise in the frequency domain, the A-CFS method determines an appropriate cutoff frequency for the FFT method based on cross-validation. Using an appropriate cutoff frequency ensures effective distinction and separation of the high-frequency noisy information from the low-frequency useful information. The wanted information can then be obtained by applying the FFT, removing the components above the cutoff frequency, and applying the inverse FFT. The proposed A-CFS method is fast and simple; it can improve the accuracy of assimilation models built using noisy historical traffic flow measurements and further improve the accuracy of assimilation forecasting results. The method is verified by short-term traffic flow forecasting experiments, in which the forecasting results of the proposed A-CFS method are compared with those of the DWT and EEMD methods. The results show that the proposed method performs better than the other two methods in terms of all evaluation metrics, demonstrating its effectiveness.

The remainder of the paper is organized as follows. Section 2 briefly presents the theoretical background of this study. Section 3 presents the proposed A-CFS approach in the FFT method. Section 4 describes the application experiments that use the proposed method. The analysis of the results follows. Finally, conclusions are drawn.

2. Theoretical Background

2.1. Sequential Data Assimilation System for Short-Term Traffic Flow Forecasting

As previously mentioned, an S-DA system has three main parts: assimilation models, including state models and measurement models, measurements, and sequential assimilation methods, as shown in Figure 1.

In this study, the traffic flow values in the current time interval $[(k - 1) T, k T]$ are treated as the measurements in the sequential data assimilation system. Recently, the VAR model has been widely used for short-term traffic flow prediction because it takes into account the traffic flow values of adjacent paths in the previous time interval. The assimilation models can be expressed as follows [23]:

$$\begin{cases} X_{k} = M_{k, k - 1} X_{k - 1} \\ y_{k} = H_{k} X_{k} \end{cases} \tag{3}$$

The parameters given in Equation (3) are defined as follows:

$$\begin{cases} X_{k} = [\mathrm{para}_{1} (k), \mathrm{para}_{2} (k), \dots, \mathrm{para}_{n} (k)]^{T} \\ y_{k} = \dfrac{q_{c} (k + 1)}{\bar{q}_{c} (k + 1)} \\ M_{k, k - 1} = I \\ H_{k} = [D^{T} (k), D^{T} (k - 1), D^{T} (k - 2), \dots, D^{T} (k - n)] \\ D (k) = \begin{bmatrix} \dfrac{q_{c} (k)}{\bar{q}_{c} (k)} & \dfrac{q_{a_{i}} (k)}{\bar{q}_{a_{i}} (k)} \end{bmatrix}^{T} \quad (i = 1, 2, 3, \dots, m) \end{cases} \tag{4}$$

where $X_{k}$ is the state vector to be estimated during the assimilation process; $y_{k}$ represents the measurements; $q_{c} (k)$ and $q_{a_{i}} (k)$ denote the traffic flow values of the current path and its adjacent paths in the time interval $[(k - 1) T, k T]$, respectively; $\bar{q}_{c} (k)$ and $\bar{q}_{a_{i}} (k)$ are their corresponding average values, which can be calculated from historical flow measurements corresponding to the same time interval in previous weeks; and $m$ represents the number of adjacent paths. The dynamic state model $M_{k, k - 1}$ is set to the identity matrix $I$.

As a basic assimilation method in the S-DA system, the Kalman filter (KF) method is effective in both stationary and non-stationary conditions and is a well-known technique for tracking state values over time in a linear system [22]. Related studies show that the KF method performs well in many short-term traffic flow forecasting applications [11,22]. It can estimate state vectors using real-time measurements through its forecast and update procedure. In addition, its efficient calculations and small storage requirements make it well suited to short-term traffic flow forecasting [22]. Hence, the assimilation method used in the S-DA system in this study is the KF method. The KF method [22], which connects the assimilation models and the measurements and consists of a forecast part and an update part, is expressed as

$$\text{Forecast:} \quad \begin{cases} X_{k}^{f} = M_{k, k - 1} X_{k - 1}^{a} \\ P_{k}^{f} = M_{k, k - 1} P_{k - 1}^{a} M_{k, k - 1}^{T} \end{cases} \tag{5}$$

$$\text{Update:} \quad \begin{cases} K_{k} = P_{k}^{f} H_{k}^{T} (H_{k} P_{k}^{f} H_{k}^{T} + R_{k})^{- 1} \\ X_{k}^{a} = X_{k}^{f} + K_{k} (y_{k} - H_{k} X_{k}^{f}) \\ P_{k}^{a} = (I - K_{k} H_{k}) P_{k}^{f} \end{cases} \tag{6}$$

where $P_{k}^{f}$ denotes the error covariance matrix of the state vector prediction values, and $P_{k}^{a}$ is the error covariance matrix of the estimated state vector values. As stated above, $R_{k}$ denotes the error covariance matrix of the Gaussian random noise series of the observation equation, i.e., Equation (2). It plays an important role in the calculation of the Kalman gain matrix $K_{k}$ in the KF method, which is crucial for balancing the weight between the state estimates and new measurements. Noises in historical traffic flow measurements certainly affect the specification of the observational operator $H_{k}$ and further reduce the accuracy of assimilation results by affecting the Kalman gain matrix. According to Equation (4), the observational operator $H_{k}$ is built using historical measurements. Therefore, de-noising the measurements used to build the observational operator $H_{k}$ before short-term traffic flow forecasting is imperative.
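The forecast and update steps of Equations (5) and (6) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `kf_step` and the toy values of `H`, `R`, and `y` are assumptions made for the example, and the state model defaults to the identity matrix as stated for Equation (4).

```python
import numpy as np

def kf_step(x_a, P_a, y, H, R, M=None):
    """One forecast/update cycle following Equations (5) and (6)."""
    n = x_a.shape[0]
    M = np.eye(n) if M is None else M   # identity state model, as in Equation (4)
    # Forecast step, Equation (5)
    x_f = M @ x_a
    P_f = M @ P_a @ M.T
    # Update step, Equation (6)
    S = H @ P_f @ H.T + R               # innovation covariance
    K = P_f @ H.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x_f + K @ (y - H @ x_f)
    P_new = (np.eye(n) - K @ H) @ P_f
    return x_new, P_new, K

# Toy values (hypothetical, chosen only to exercise the equations)
x0 = np.zeros(2)
P0 = np.eye(2)
H = np.array([[1.0, 0.5]])
R = np.array([[0.1]])
y = np.array([1.2])

x1, P1, K = kf_step(x0, P0, y, H, R)
# A larger measurement error covariance R gives a smaller gain,
# i.e., less weight on the new measurement.
_, _, K_noisy = kf_step(x0, P0, y, H, np.array([[10.0]]))
```

Because $H_{k}$ enters both the gain and the innovation, any noise in the historical measurements used to build it propagates directly into the updated state.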

2.2. Fast Fourier Transform Method

Fourier analysis is a common tool in signal processing [65]. It can be used to obtain all the harmonic components of a signal conveniently and effectively using the spectrum functions. The Fourier transformation is a basic part of the Fourier analysis that can transform the signal between the time and frequency domains. After the Fourier transformation, the time-domain signal becomes a superposition of multiple sinusoidal signals. By analyzing the frequency of the sine wave, the signal can be changed from the time domain to the frequency domain. In the frequency domain, signal characteristics that are not evident in the time domain can be seen clearly. Hence, performing Fourier transforms on signals is crucial to analyze their nature.

In practical applications, computer processing generally requires the discretization of signal information in the time and frequency domains. The Fourier transformation of a discrete periodic signal meets this requirement. To ensure its finiteness, the discrete Fourier transform (DFT) method is performed only on a discrete periodic signal in the time and frequency domains, and it is expressed as follows [65]:

$$F (j) = \sum_{k = 0}^{V - 1} f (k) W_{V}^{j k}, \quad j = 0, 1, \dots, V - 1 \tag{7}$$

The corresponding inverse transformation is as follows:

$$f (k) = \frac{1}{V} \sum_{j = 0}^{V - 1} F (j) W_{V}^{- j k}, \quad k = 0, 1, \dots, V - 1 \tag{8}$$

where $f (k)$ is the discrete signal in the time domain, i.e., the original signal; $F (j)$ is the discrete signal in the frequency domain, i.e., its discrete Fourier transform; $V$ is the number of sample points, i.e., the length of the transform interval; $k$ and $j$ are the time and frequency indices, respectively; and $W_{V} = \exp (- \mathrm{i} \frac{2 \pi}{V})$, where $\mathrm{i}$ is the imaginary unit. Further details can be found in [66,67,68,69].

However, the DFT method has certain disadvantages, such as complicated computations and low efficiency. The number of computations is approximately $V^{2}$ for multiplication and $V (V - 1)$ for addition. If $V$ is large, the number of calculations will be very large. Hence, a commonly used algorithm for computing the DFT in numerical calculations is the fast Fourier transform (FFT) method [41], which uses the periodicity and symmetry of $W_{V}^{j k}$ in the DFT method to improve operational efficiency. The FFT method is a simple, efficient method for computing the DFT. The relationships between the amount of computation and the number of calculation points for the DFT and FFT methods are presented in Figure 2. The FFT method is superior to the direct DFT computation in terms of calculation efficiency, so the FFT method is used for further computations.
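As a small illustration of the relationship between the two methods, the sketch below evaluates Equation (7) directly as a matrix product, which needs on the order of $V^{2}$ multiplications, and checks that NumPy's FFT returns the same coefficients. It assumes the standard convention $W_{V} = \exp (- 2 \pi \mathrm{i} / V)$; the helper name `naive_dft` and the random test signal are assumptions made for the example.

```python
import numpy as np

def naive_dft(f):
    """Direct evaluation of Equation (7); cost grows as V**2."""
    V = len(f)
    idx = np.arange(V)
    # Matrix of W_V^{jk} with W_V = exp(-2*pi*i/V)
    W = np.exp(-2j * np.pi * np.outer(idx, idx) / V)
    return W @ f

rng = np.random.default_rng(0)
signal = rng.normal(size=96)        # e.g., one day sampled every 15 min

F_naive = naive_dft(signal)
F_fast = np.fft.fft(signal)         # same coefficients, computed by the FFT
max_diff = np.max(np.abs(F_naive - F_fast))

# Equation (8): the inverse transform recovers the original signal
recovered = np.fft.ifft(F_fast).real
```

Both routes give the same spectrum up to rounding error; the FFT only changes how many operations are needed to obtain it.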

In the frequency domain, the useful information in a given data set occupies the lower end of the frequency spectrum, and the noisy information occupies the higher end. A cleaner series can be obtained by cutting off the high-frequency noisy information and then transforming the result from the frequency domain back into the time domain using the inverse Fourier transform. Hence, a proper cutoff frequency has to be determined beforehand: if the cutoff frequency is set too high, too much noisy information will be treated as useful information; conversely, if the cutoff frequency is set too low, useful information will be lost [70]. In this study, an A-CFS approach is proposed to determine a proper cutoff frequency so that noises can be separated effectively in the FFT method.

3. Adaptive Cutoff Frequency Selection in Fast Fourier Transform Method

As mentioned in the previous section, effectively separating the high-frequency information to remove noisy information from the measurements is crucial when using the FFT method, and it therefore requires further study. As stated before, different cutoff frequencies yield different de-noising accuracy. This can be explained by the following example. Consider the original traffic flow sequence data presented in Figure 3. It can be converted into the frequency domain using the FFT method. After the high-frequency part is separated in the frequency domain using different cutoff frequencies, the separated noises and the remaining processed data are obtained by transforming the signals back to the time domain. The cutoff frequency must be defined before the high-frequency part is separated. In this example, the following cutoff frequencies are used: Frequency1 = 2.8435 × 10⁻⁵ Hz, Frequency2 = 8.5305 × 10⁻⁵ Hz, Frequency3 = 1.4218 × 10⁻⁴ Hz, and Frequency4 = 1.9905 × 10⁻⁴ Hz. In the frequency domain, data with a frequency greater than the cutoff frequency are regarded as high-frequency information, i.e., as the noise that needs to be separated. The noises separated using the four cutoff frequencies are presented in Figure 4a–d, and the corresponding processed data with the noises separated are presented in Figure 4e–h.

As shown in Figure 4, the traffic flow becomes smoother as the cutoff frequency decreases. The original measurement contains two clear peaks, one at about 07:00 and another at around 15:00. The two peaks are not clear in Figure 4e, but they are evident in Figure 4f–h. This indicates that under the first cutoff frequency, a part of the effective information is treated as noise, and the remaining data are distorted. For this reason, in practical signal de-noising applications, the FFT method is usually used to de-noise a noisy signal with known noise frequency. However, it is difficult to know the noise frequency in advance. When the noise frequency is unknown, if the cutoff frequency is too high, too much noisy information will be treated as useful information; conversely, if the cutoff frequency is too low, a part of the useful information will be lost, as presented in Figure 4e. Therefore, an adaptive method for choosing an appropriate cutoff frequency in the FFT method is necessary.
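The effect of the cutoff frequency can be reproduced qualitatively with a minimal low-pass sketch. The signal below is synthetic (two Gaussian-shaped peaks plus random noise, sampled every 15 min) and is an assumption made for illustration; it is not the data of Figure 3, and `fft_lowpass` is a hypothetical helper name. Only the four cutoff values are taken from the text.

```python
import numpy as np

def fft_lowpass(x, cutoff_hz, sample_interval_s):
    """Zero every frequency component above cutoff_hz and invert."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=sample_interval_s)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

# Synthetic day: 96 samples at 900 s, peaks near 07:00 and 15:00 (illustrative)
hours = np.arange(96) * 0.25
clean = (200.0
         + 300.0 * np.exp(-((hours - 7.0) / 1.5) ** 2)
         + 250.0 * np.exp(-((hours - 15.0) / 2.0) ** 2))
rng = np.random.default_rng(1)
noisy = clean + rng.normal(scale=20.0, size=hours.size)

cutoffs = [2.8435e-5, 8.5305e-5, 1.4218e-4, 1.9905e-4]   # Hz, from the text
processed = {c: fft_lowpass(noisy, c, 900.0) for c in cutoffs}
# Energy of the part treated as noise for each cutoff
removed_energy = [float(np.sum((noisy - processed[c]) ** 2)) for c in cutoffs]
```

The lower the cutoff, the more energy is moved into the separated noise, which is why a cutoff that is too low also removes part of the useful signal.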

Considering the noise distribution characteristics in the frequency domain, the A-CFS method, which uses cross-validation to select an appropriate cutoff frequency in the FFT method, is proposed in this work. The proposed method can effectively determine a proper cutoff frequency and filter out the high-frequency noisy information, following the basic principle of sufficient decomposition and low differences in variation tendency between the original data and processed de-noised data. The useful data without noises can be obtained using the FFT and its inverse method with fast, accurate, and simple characteristics.

The framework of the proposed method is shown in Figure 5, and it includes the following steps:

(1): Collect traffic flow data T_F (n, m) from the same days (for instance, consecutive Mondays) during m consecutive weeks. The data length of each day is n. The maximum signal recognition frequency is mf. It can be calculated by $mf \leq 0.5 \times sf$ based on the Nyquist sampling theorem, where $sf$ is the known signal sampling frequency. As the signals beyond the maximum signal recognition frequency mf are distorted, they are not considered further.
(2): Get the median values of the traffic flow data Med_T_F (n, 1) from the m days.
(3): Obtain the frequency-domain signal F_T_F (n, m) of the original traffic flow data T_F (n, m). The length of the signal in the time and frequency domains is the same.
(4): Set the lower frequency low_f and let the threshold value T range from low_f to mf. The search step is defined as $\Delta f = \frac{mf}{num}$, where $num$ is the number of discrete frequency points in the frequency domain. The reason for setting the lower frequency is that useful information is mainly concentrated within a certain lower frequency range, as shown in Figure 6, which presents the traffic flow signals of path 568 (LM932), shown in Figure 3, in the frequency domain after applying the FFT method. The value of low_f is set to 0.25 $\times$ mf in further calculations.
(5): Use the threshold value T to process the frequency-domain signal. The high-frequency noise whose frequency is higher than the threshold value T is filtered out to obtain the de-noised frequency-domain signal.
(6): Acquire the de-noised time-domain signal P_T_F (n, m) using the inverse FFT method.
(7): Calculate the quadratic sum values E² (m, T) = Σ (P_T_F (n, m) - Med_T_F (n, 1))², where the sum runs over the n samples of each day.
(8): For each of the m traffic flow series, find the smallest value of E² (m, T) and take the corresponding T value as the proper cutoff frequency for removing noises from that series.
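Steps (1) to (8) can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors' code: the function name `a_cfs` is hypothetical, a real-input FFT (`rfft`) is used so that only the non-negative frequencies up to mf are stored, the candidate thresholds are taken on the FFT frequency grid (which fixes the search step), and the demonstration data are synthetic.

```python
import numpy as np

def a_cfs(T_F, sample_interval_s, low_ratio=0.25):
    """Select one cutoff frequency per day. T_F has shape (n, m)."""
    n, m = T_F.shape
    sf = 1.0 / sample_interval_s
    mf = 0.5 * sf                                    # step (1): Nyquist limit
    med = np.median(T_F, axis=1)                     # step (2): median day
    spectra = np.fft.rfft(T_F, axis=0)               # step (3): to frequency domain
    freqs = np.fft.rfftfreq(n, d=sample_interval_s)
    low_f = low_ratio * mf                           # step (4): search range
    candidates = freqs[(freqs >= low_f) & (freqs <= mf)]
    errors = np.empty((m, candidates.size))
    for j, T in enumerate(candidates):
        kept = np.where((freqs <= T)[:, None], spectra, 0.0)    # step (5)
        denoised = np.fft.irfft(kept, n=n, axis=0)              # step (6)
        errors[:, j] = np.sum((denoised - med[:, None]) ** 2, axis=0)  # step (7)
    best = candidates[np.argmin(errors, axis=1)]     # step (8)
    return best, errors, candidates

# Synthetic example (illustrative only): 7 days, 96 samples per day at 15 min
hours = np.arange(96) * 0.25
base = (200.0
        + 300.0 * np.exp(-((hours - 7.0) / 1.5) ** 2)
        + 250.0 * np.exp(-((hours - 15.0) / 2.0) ** 2))
rng = np.random.default_rng(0)
T_F = base[:, None] + rng.normal(scale=20.0, size=(96, 7))

best, errors, candidates = a_cfs(T_F, sample_interval_s=900.0)
```

Each entry of `best` is the threshold that brings the de-noised series of one day closest to the median day, so different days may receive different cutoff frequencies.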

4. Empirical Study Design

The traffic flow datasets used in the following experiments were downloaded from the Highways England website (highwaysengland.co.uk). The traffic flow value refers to the number of traffic entities passing through a certain point, a certain section, or a certain lane of the road during a particular period of time. It is usually used to determine what types of traffic management measures should be taken. Thus, accurate forecasting of traffic flow plays a very important role in traffic engineering. The traffic flow data used in this study were from a sub-area of the highway near Birmingham, England (including a total of 514 paths), as shown in Figure 7a. Traffic flow data of each path in the period from Monday to Sunday were separately collected. As the mean traffic flow values were necessary for the assimilation model construction, as given in Equation (4), the data of each path contained eight days from consecutive weeks. The first seven days were used for the assimilation model construction in the S-DA system for short-term traffic flow prediction, and the data of the eighth day were regarded as true values and used to test the effectiveness of the proposed method. The time interval for data collection of each path was 15 min, which was also the assimilation frequency. Thus, the total number of observations used in the experiments was 2,763,264. It was assumed that the observations were not correlated. Traffic flow prediction results of each path from Monday to Sunday were separately acquired and analyzed. Furthermore, as the traffic flow in early mornings and late nights was small and of little concern to traffic management, only the prediction results from 6:00 to 21:00 were used.

First, a verification test was conducted to test the applicability of the proposed A-CFS method in the FFT method. To show more detail, the short-term traffic flow forecasting results of path 568 (LM932), shown in Figure 7b, were taken as the research object in the test. Assimilation models $H$ were built using the de-noised historical measurements obtained by the FFT method. The cutoff frequencies were from low_fs to fs. The prediction results were obtained from the assimilation models. Three measures, namely the mean absolute error (MAE) [71,72], root mean square error (RMSE) [22,71], and mean absolute percentage error (MAPE) [71], were used to evaluate and compare the forecasting accuracy. The effectiveness of the proposed A-CFS method was evaluated based on the MAE, RMSE, and MAPE values, and a proper cutoff frequency was considered to be the one that corresponded to the smallest values of the three measures. For a real observation $X_{i}$ and the corresponding forecasted value $\hat{X}_{i}$, the MAE, RMSE, and MAPE values were calculated as follows:

$$\begin{cases} \mathrm{MAE} = \dfrac{1}{n} \sum_{i = 1}^{n} | X_{i} - \hat{X}_{i} | \\ \mathrm{RMSE} = \dfrac{1}{n} \sqrt{n \sum_{i = 1}^{n} (X_{i} - \hat{X}_{i})^{2}} \\ \mathrm{MAPE} = \dfrac{1}{n} \sum_{i = 1}^{n} \left| \dfrac{X_{i} - \hat{X}_{i}}{X_{i}} \right| \times 100 \% \end{cases} \tag{9}$$

The smaller the values of MAE, RMSE, and MAPE, the better the forecasting results.
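For reference, the three measures of Equation (9) can be computed as in the sketch below. The function name `forecast_errors` and the three sample values are assumptions made for the example. Note that the RMSE expression in Equation (9), $\frac{1}{n} \sqrt{n \sum (X_{i} - \hat{X}_{i})^{2}}$, is algebraically equal to the more familiar $\sqrt{\frac{1}{n} \sum (X_{i} - \hat{X}_{i})^{2}}$, which is the form used in the code.

```python
import numpy as np

def forecast_errors(x_true, x_pred):
    """Return (MAE, RMSE, MAPE in percent) as defined in Equation (9)."""
    x_true = np.asarray(x_true, dtype=float)
    x_pred = np.asarray(x_pred, dtype=float)
    diff = x_true - x_pred
    mae = np.mean(np.abs(diff))
    rmse = np.sqrt(np.mean(diff ** 2))
    # MAPE is undefined when a true value is zero
    mape = np.mean(np.abs(diff / x_true)) * 100.0
    return mae, rmse, mape

# Hypothetical true and forecasted flows for three intervals
mae, rmse, mape = forecast_errors([100, 200, 400], [110, 190, 380])
```

For these sample values the absolute errors are 10, 10, and 20, so the MAE is 40/3, the RMSE is the square root of 200, and the MAPE is 20/3 percent.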

Second, to verify the effectiveness of the proposed A-CFS approach in the FFT method, four different datasets were used to build the assimilation models $H$ according to Equation (4). These datasets were the raw traffic flow data and the processed data obtained by the proposed A-CFS approach in the FFT method, by the DWT method, and by the EEMD method, respectively. The processed data obtained by the proposed A-CFS method were defined as $F$ data.

In the DWT method, the signal can be decomposed into several levels. More details on wavelet decomposition can be found in [28]. To demonstrate the different noise separation effects at different decomposition levels, the traffic flow measurements of path 568 (LM932) collected on 10 February 2014, which are shown in Figure 3, were examined. Daubechies 4 was used as the mother wavelet since it has been commonly used [22]. The soft threshold function was used to obtain the de-noised signal by reconstructing the wavelet coefficients after threshold processing. The approximated data and noise result of the i-level decomposition are denoted as $A_{i}$ and $D_{i}$, respectively. The noises separated from the traffic flow data and a comparison of the de-noised approximated data and raw data are presented in Figure 8.

As shown in Figure 8, the noise and the processed approximated data became increasingly smooth as the decomposition level increased. Compared to the processed approximated measurements shown in Figure 8g,h, the processed signals in Figure 8e,f retained more detail of the original data. Among the separated noises shown in Figure 8a–d, the noise in Figure 8a was the strongest, representing the highest degree of noise. The noises in Figure 8d were too gentle and contained some useful measurement information. Hence, it is necessary to consider which decomposition level should be chosen to achieve the optimal de-noising performance and improve the accuracy of the assimilation models and results. Based on many conducted experiments, the data obtained from the two-level decomposition using the DWT method, denoted as $A_{2}$, were used in the subsequent study, as noises were not excessively separated and the forecasting accuracy was better than at other decomposition levels.

The basic principle of the EEMD method is to decompose complex signals into a finite number of intrinsic mode functions (IMFs) and a residual component. The core idea of the EEMD is to exploit the statistical characteristics of white noise. Namely, adding white noise to a useful signal changes the characteristics of the signal endpoints, which helps the original signal remain continuous at different scales. In addition, it promotes anti-aliasing decomposition [40]. The decomposed IMF components contain local characteristics of the original signal at different time scales. Each IMF can be processed using the Hilbert transformation method to obtain its instantaneous frequency and amplitude. Thus, complete time-frequency distribution information of a complex signal can be obtained [44]. The advantage of the EEMD method is that it is suitable for nonlinear and nonstationary signals and can be performed based on the characteristics of the raw signals, so it represents an efficient adaptive time-frequency processing method. More details on the EEMD method can be found in [37,38,39,40].

The IMF components and the residual of the traffic flow measurements of path 568 (LM932) collected on 10 February 2014, obtained by the EEMD method, are presented in Figure 9. The data reconstructed using different numbers of IMFs are presented in Figure 10. As shown in Figure 9 and Figure 10, the data reconstructed using the IMFs from IMF2 to IMF5 and the residual could reflect the trend of the original traffic flow data. However, with the increase in the amount of separated data, the rebuilt data became increasingly distorted. As shown in Figure 10c, the traffic flow data could even be negative. Therefore, it is important to select an appropriate number of IMFs for data de-noising. The commonly used methods for this purpose are the correlational analysis method, adjacent signal standard deviation method, and continuous mean square deviation method [73]. In this study, a correlation coefficient method based on energy density and average period [73] was used. Data processed by the EEMD method are defined as E data.
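The reconstructions compared in Figure 10 (from a chosen IMF down to the residual) amount to a partial sum of the decomposition. The following is a minimal sketch of that sum only; it does not perform the EEMD itself, and the toy IMFs, the residual, and the function name `reconstruct` are hypothetical.

```python
def reconstruct(imfs, residual, first_imf):
    """Sum IMF_first_imf .. IMF_n plus the residual (IMFs are numbered from 1)."""
    kept = imfs[first_imf - 1:]
    return [sum(imf[t] for imf in kept) + residual[t] for t in range(len(residual))]


# Toy decomposition with two IMFs (values are illustrative only).
imfs = [
    [1.0, -1.0, 1.0, -1.0],   # IMF1: fastest oscillation, treated here as noise
    [0.5, 0.5, -0.5, -0.5],   # IMF2: slower oscillation
]
residual = [10.0, 11.0, 12.0, 13.0]

full = reconstruct(imfs, residual, 1)       # all components: the original signal
denoised = reconstruct(imfs, residual, 2)   # IMF2 to residual: IMF1 discarded
print(full, denoised)
```

Raising `first_imf` discards more components, which mirrors the trade-off described above: more noise is removed, but the rebuilt data can drift away from the original.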

The four datasets described above are summarized in Table 1. Each of the four datasets consisted of 56 days of traffic flow data, covering eight consecutive weeks from Monday to Sunday. The raw 24-h traffic flow data were aggregated into 15-min intervals. The four H models built using the four datasets are also listed in Table 1. It should be noted that Model 1 was built using only the raw data.

5. Results Analysis

To facilitate detailed comparison, the short-term traffic flow prediction of path 568 (LM932), shown in Figure 7b, was used first to test the effectiveness of the proposed A-CFS method in the FFT method. The cutoff frequency obtained by the proposed A-CFS method in the FFT method is presented in Table 2. The MAE, RMSE, and MAPE values of the short-term traffic flow forecasting results for the cutoff frequencies from low_fs to fs on one workday (Monday) and one non-workday (Saturday) are presented in Figure 11a,b, respectively. Results for both days are presented because traffic patterns differ between workdays and non-workdays. As shown in Table 2 and Figure 11, the cutoff frequencies obtained by the proposed A-CFS method corresponded to the smallest MAE, RMSE, and MAPE values. This result supports, to a certain degree, the validity of the proposed A-CFS method in the FFT method.
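The idea of scanning candidate cutoff frequencies and keeping the one with the smallest error can be sketched as a simple grid search. This is a simplified illustration and not the A-CFS algorithm itself: the synthetic signals, the candidate list, the use of MAE as the only score, and the function names are all assumptions. The sampling rate of 1/900 Hz assumes that frequencies are expressed in Hz for data aggregated into 15-min intervals.

```python
import cmath
import math


def lowpass_fft(x, cutoff_hz, sample_rate_hz):
    """Zero every DFT bin above cutoff_hz and transform back (naive O(n^2) DFT)."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    for k in range(n):
        # Bins k and n - k share one physical frequency, so they are dropped together.
        if min(k, n - k) * sample_rate_hz / n > cutoff_hz:
            spectrum[k] = 0j
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]


def mae(a, b):
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def select_cutoff(train, held_out, candidates, sample_rate_hz):
    """Return the candidate whose low-passed training series is closest to held_out."""
    return min(candidates,
               key=lambda fc: mae(lowpass_fft(train, fc, sample_rate_hz), held_out))


n, fs = 96, 1 / 900          # one day of 15-min samples


def wave(amplitude, cycles_per_day):
    return [amplitude * math.sin(2 * math.pi * cycles_per_day * t / n) for t in range(n)]


# Synthetic data: a slow daily pattern plus high-frequency "noise" (illustrative only).
clean = [100 + a + b for a, b in zip(wave(50, 1), wave(20, 4))]
train = [c + e for c, e in zip(clean, wave(30, 30))]
held_out = [c + e for c, e in zip(clean, wave(5, 37))]

best = select_cutoff(train, held_out, [0.00002, 0.0001, 0.0005], fs)
print(best)
```

In this toy case the lowest candidate removes part of the useful pattern and the highest one keeps the noise, so the middle candidate gives the smallest held-out error, which is the behaviour the cutoff selection is meant to capture.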

Then, the short-term traffic flow predictions for paths in Areas I–IV (path 568 (LM932), path 2091 (AL2670), path 8655 (LM168), and path 8314 (LM188)) were used to illustrate the different impacts of the assimilation models on the assimilation prediction results. Four datasets were used to build four assimilation measurement models, and these models were then applied to the short-term traffic flow prediction. Without loss of generality, the predicted results obtained on one workday (Thursday) and one non-workday (Sunday) were analyzed.

The prediction performances of Model 1 (built using the raw historical traffic flow data) and Model 2 (built using the de-noised historical traffic flow data obtained by the proposed A-CFS method) for the mentioned paths on Thursday and Sunday are presented in Figure 12 and Figure 13, respectively. For comparative analysis of the experimental results, the true traffic flow values are also presented in Figure 12 and Figure 13. The traffic flow data of each path contained the same day from eight consecutive weeks. Data of the first seven days were used for assimilation model construction in the S-DA system for the short-term traffic flow prediction. The data of the eighth day were regarded as true values and used to test the effectiveness of the proposed method. As shown in Figure 12 and Figure 13, the values obtained by the two models followed a consistent trend; also, the prediction results obtained by Model 2 were much closer to the true values on both Thursday and Sunday than the ones obtained by Model 1. Moreover, during the peak hours on a workday, the accuracy of the traffic flow forecasting results obtained by Model 2 was higher than that obtained by Model 1. This result further indicates that the proposed A-CFS method can separate noises from the data and improve the precision of assimilation models and forecasting results.
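The data split described above (the same weekday from eight consecutive weeks, seven days for model construction and the eighth as true values) can be sketched as follows. The placeholder data and the function name `split_by_weekday` are hypothetical, and the sketch assumes the 56 daily series are stored in calendar order starting on a Monday.

```python
def split_by_weekday(days, weekday):
    """days: 56 daily series in calendar order starting on Monday; weekday: 0 = Monday."""
    same_day = days[weekday::7]          # the same weekday from each of the eight weeks
    return same_day[:-1], same_day[-1]   # first seven for training, eighth as truth


# Placeholder data: day d is a series of 96 values all equal to d (96 = 24 h / 15 min).
days = [[d] * 96 for d in range(56)]

train, test = split_by_weekday(days, 3)  # 3 = Thursday
print([series[0] for series in train], test[0])
```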

For a clearer comparison, the performance measures of Model 1 and Model 2 are listed in Table 3. The three measures of Models 3 and 4 are also presented in Table 3 to further demonstrate the effectiveness of the proposed method. Consider the results on workday Thursday first. As presented in Table 3, Model 2 outperformed the other models for all the paths. The measures of Model 2 for path 568 (LM932) were the smallest; namely, MAE was 56.20, RMSE was 73.64, and MAPE was 6.41. As for the results on non-workday Sunday, the prediction accuracy of Model 2 was still the best, and the MAE, RMSE, and MAPE values were 37.78, 46.57, and 6.48, respectively. For path 2091 (AL2670), the smallest MAE, RMSE, and MAPE values were 27.98, 38.47, and 9.27 on workday Thursday, and 12.06, 15.40, and 8.86 on non-workday Sunday, respectively. For path 8655 (LM168), the smallest MAE, RMSE, and MAPE values were 62.31, 78.94, and 6.72 on workday Thursday, and 44.23, 60.94, and 6.24 on non-workday Sunday, respectively. For path 8314 (LM188), the smallest MAE, RMSE, and MAPE values were 64.56, 89.30, and 6.04 on workday Thursday, and 44.42, 55.23, and 5.88 on non-workday Sunday, respectively. In each of these cases, the smallest values were achieved by Model 2. The obtained results are consistent with the expectation that the proposed A-CFS method can achieve good performance in de-noising the historical traffic flow data.
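For readers who want to reproduce measures of this kind, the three error measures can be computed as in the sketch below. It uses the standard textbook definitions, with MAPE expressed in percent, which is assumed to match the definitions given earlier in the paper. The sample values are made up and are not taken from the tables.

```python
import math


def mae(true, pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)


def rmse(true, pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true))


def mape(true, pred):
    """Mean absolute percentage error, in percent; true values must be non-zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(true, pred)) / len(true)


# Made-up flows for three intervals (illustrative only).
true = [100.0, 200.0, 400.0]
pred = [110.0, 190.0, 380.0]
print(mae(true, pred), rmse(true, pred), mape(true, pred))
```

Because RMSE squares the errors before averaging, it weights large deviations (for example, during peak hours) more heavily than MAE does.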

To test the de-noising effect of the proposed A-CFS method further, the average values of the three performance measures of paths in areas I–IV, shown in Figure 7b–e, of Models 1 and 2 from Monday to Sunday are presented in Figure 14. As shown in Figure 14, the mean values of MAE, RMSE, and MAPE from Monday to Sunday obtained by Model 2 were smaller than those of Model 1. Moreover, for a better comparison, the mean MAE, RMSE, and MAPE values of Models 3 and 4 are also presented in Table 4. For instance, in Area I, MAE values of Model 2 were the smallest among all the models; the average MAE value was 34.59, which was smaller by 2.97 (from 37.56 to 34.59) compared to that of Model 1. The MAE value also decreased by 1.26 (from 35.85 to 34.59) and 0.57 (from 35.16 to 34.59) compared to the results of Models 3 and 4, respectively. The average RMSE value of Model 2 was 45.89, and it was the smallest RMSE value among all the models. The smallest average MAPE value of 5.96 was also achieved by Model 2. Similar results were achieved in other areas. The MAE, RMSE, and MAPE values in Table 4 and the distributions in Figure 14 indicate that the assimilation model built using the de-noised data obtained by the proposed method outperforms all the other models, which verifies the effectiveness of the proposed A-CFS method.

The average MAE, RMSE, and MAPE values of all the paths shown in Figure 7a for the four models are presented in Table 5. The relative improvements, in percentage, of the mean MAE, RMSE, and MAPE values of Model 2 over the three other models are given in Table 6. The results in Table 5 show that, compared to the model built using raw data, the models built using de-noised data were more precise. Also, among all the models built using de-noised data, the smallest mean MAE and RMSE values from Monday to Sunday were obtained by Model 2. The average MAE, RMSE, and MAPE values of Model 2 were 29.64, 39.97, and 8.75, respectively. The values in Table 6 indicate that the proposed A-CFS method performed well in data de-noising. The average relative improvements of Model 2 over Models 3 and 4, which were built using the de-noised data obtained by the DWT and EEMD methods, in MAE were 3.47% and 4.25%, respectively; the relative improvements in RMSE were 5.36% and 3.02%, respectively; and lastly, the relative improvements in MAPE were 2.83% and 2.28%, respectively. The results also show that the assimilation model built using the de-noised data obtained by the proposed method performed the best.
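As a worked check, the sketch below recomputes relative improvements from the Monday MAE values in Table 5. It assumes the relative improvement is defined as (baseline − Model 2) / baseline × 100, an assumption that reproduces the Monday MAE entries of Table 6. The function name is introduced only for illustration.

```python
def relative_improvement(baseline, proposed):
    """Percentage by which `proposed` lowers the error relative to `baseline`."""
    return (baseline - proposed) / baseline * 100.0


# Monday MAE values taken from Table 5.
monday_mae = {"Model 1": 39.40, "Model 2": 32.25, "Model 3": 33.71, "Model 4": 33.56}

gains = {
    name: round(relative_improvement(value, monday_mae["Model 2"]), 2)
    for name, value in monday_mae.items()
    if name != "Model 2"
}
print(gains)  # {'Model 1': 18.15, 'Model 3': 4.33, 'Model 4': 3.9}
```

The weekly averages quoted in the text are averages of such daily percentages, so they can differ slightly from the percentage computed directly from the weekly mean errors.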

Based on the results in Figures 11–14 and Tables 2–6, the proposed A-CFS method is effective in data de-noising and can solve the excessive de-noising problem in the DWT method and also in the EEMD method to a certain extent. Thus, the proposed method can be used to improve the accuracy of assimilation models and the short-term traffic flow prediction results.

6. Conclusions

This paper proposes an adaptive cutoff frequency selection (A-CFS) method for the FFT method to de-noise the historical measurements used to build the assimilation models in an S-DA system when the noise frequency is unknown. The proposed method can effectively determine an appropriate cutoff frequency, distinguish the low-frequency part of useful signals from the high-frequency information caused by noises, and ensure the quality of the de-noised outputs for a given dataset using the FFT method, which can further reduce the influence of local noises on constructed assimilation models and improve the accuracy of assimilation results. Compared to the results of the DWT and EEMD methods, the short-term traffic flow forecasting results of the FFT method with the proposed A-CFS method are more reliable. The proposed A-CFS method for the FFT method also has an advantage over the DWT and EEMD methods since the excessive de-noising problem is avoided. In terms of the MAE values, the average relative improvements of the assimilation model built using the proposed method are 19.26%, 3.47%, and 4.25% compared to the models built using the raw data, the DWT method, and the EEMD method, respectively; from the RMSE perspective, the corresponding average relative improvements are 19.05%, 5.36%, and 3.02%, respectively; lastly, from the MAPE perspective, the corresponding average relative improvements are 18.88%, 2.83%, and 2.28%, respectively. In future work, the proposed method will be applied to and tested in other fields to expand its range of applications.

Author Contributions

Conceptualization, Runjie Wang, Wenzhong Shi, and Xianglei Liu; methodology, Runjie Wang; formal analysis, Runjie Wang and Xianglei Liu; writing (original draft preparation), Runjie Wang; writing (review and editing), Runjie Wang, Wenzhong Shi, Xianglei Liu, and Zhiyuan Li. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

The authors thank the Open Data of Highways England for providing the traffic flow data used in this study and LetPub (www.letpub.com) for its linguistic assistance during the preparation of this manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

Dee, D. Simplification of the Kalman filter for meteorological data assimilation. Q. J. Roy Meteor. Soc. 1991, 117, 365–384.
Houser, P.; Shuttleworth, W.; Famiglietti, J.; Gupta, H.; Syed, K.; Goodrich, D. Integration of soil moisture remote sensing and hydrologic modeling using data assimilation. Water Resour. Res. 1989, 34, 3405–3420.
McLaughlin, D. An integrated approach to hydrologic data assimilation: Interpolation, smoothing, and filtering. Adv. Water Resour. 2002, 25, 1275–1286.
Brasseur, P. Ocean data assimilation using sequential methods based on the Kalman filter. In Ocean Weather Forecasting; Chassignet, E., Verron, J., Eds.; Springer: Berlin/Heidelberg, Germany, 2006; pp. 271–361.
Carrassi, A.; Vannitsem, S.; Nicolis, C. Model error and sequential data assimilation: A deterministic formulation. Q. J. Roy Meteor. Soc. 2008, 134, 1297–1313.
Dee, D.; da Silva, A. Data assimilation in the presence of forecast bias. Q. J. Roy Meteor. Soc. 1998, 124, 269–295.
Hamill, T.; Whitaker, J. Accounting for the error due to unresolved scales in ensemble data assimilation: A comparison of different approaches. Mon. Weather Rev. 2005, 133, 3132–3147.
Houtekamer, P.; Herschel, L.; Mitchell, A. Model error representation in an operational ensemble kalman filter. Mon. Weather Rev. 2009, 137, 2126–2143.
Reichle, R. Data assimilation methods in the earth science. Adv. Water Resour. 2008, 31, 1411–1418.
Jin, S.; Wang, D.H.; Xu, C.; Ma, D. Short-term traffic safety forecasting using Gaussian mixture model and Kalman filter. J. Zhejiang Univ. Sci. A 2013, 4, 3–15.
Guo, J.; Huang, W.; Williams, B.M. Adaptive Kalman filter approach for stochastic short-term traffic flow rate prediction and uncertainty quantification. Transp. Res. C-Emer. 2014, 43, 50–64.
Weng, X.X.; Tan, G.X.; Yao, S.S.; Huang, Z. Traffic flow characteristics and short-term prediction model of urban intersection. JTTE 2006, 6, 103–107.
Qin, Z.H. The Urban Road Short-Term Traffic Flow Prediction Research. Appl. Mech. Mater. 2013, 423, 2954–2956.
Gong, Y.S.; Yi, Z. Research of Short-Term Traffic Volume Prediction Based on KALMAN Filtering. In Proceedings of the 2013 6th International Conference on Intelligent Networks and Intelligent Systems (ICINIS), Shenyang, China, 1–3 November 2013; pp. 99–102.
Xu, Y.; Chen, H.; Kong, Q.; Zhai, X.; Liu, Y. Urban traffic flow prediction: A spatio-temporal variable selection-based approach. J. Adv. Transp. 2016, 50, 489–506.
Voort, M.V.D.; Dougherty, M.; Watson, S. Combining kohonen maps with arima time series models to forecast traffic flow. Transp. Res. C-Emer. 1996, 4, 307–318.
Smith, B.L.; Demetsky, M.J. Traffic Flow Forecasting: Comparison of Modeling Approaches. J. Transp. Eng. 1997, 123, 261–266.
Smith, B.L.; Williams, B.M.; Oswald, R.K. Comparison of parametric and nonparametric models for traffic flow forecasting. Transp. Res. C-Emer. 2002, 10, 303–321.
Yin, H.; Wang, S.; Xu, J.; Wong, C.K. Urban traffic flow prediction using a fuzzy-neural approach. Transp. Res. C-Emer. 2002, 10, 85–98.
Stathopoulos, A.; Karlaftis, M.G. A multivariate state space approach for urban traffic flow modeling and prediction. Transp. Res. C-Emer. 2003, 11, 121–135.
Smith, P.J.; Thornhill, G.D.; Dance, S.L.; Lawless, A.S.; Mason, D.C.; Nichols, N.K. Data assimilation for state and parameter estimation: Application to morphodynamic modelling. Q. J. Roy Meteor. Soc. 2013, 139, 314–327.
Xie, Y.; Zhang, Y.; Ye, Z. Short-Term Traffic Volume Forecasting Using Kalman Filter with Discrete Wavelet Decomposition. Comput. Aided Civ. Inf. 2007, 22, 326–334.
Shen, G.J.; Kong, X.J.; Chen, X. Short-term Traffic Flow Intelligent Hybrid Forecasting Model and Its Application. Control Eng. Appl. Inf. 2011, 13, 65–73.
Rakshit, S.; Ghosh, A.; Shankar, B.U. Fast mean filtering technique (FMFT). Pattern Recogn. 2007, 40, 890–897.
Kindermann, S.; Osher, S.; Jones, P.W. Deblurring and Denoising of Images by Nonlocal Functionals. Multiscale Model Sim. 2005, 4, 1091–1115.
Gupta, V.; Chaurasia, V.; Shandilya, M. Random-valued impulse noise removal using adaptive dual threshold median filter. J. Vis. Commun. Image R. 2015, 26, 296–304.
Xu, Y.; Weaver, J.B.; Healy, D.M.; Lu, J. Wavelet transform domain filters: A spatially selective noise filtration technique. IEEE T Image Process 1994, 3, 747–758.
Zheng, T.; Girgis, A.A.; Makram, E.B. A hybrid wavelet-Kalman filter method for load forecasting. Electr. Power Syst. Res. 2000, 54, 11–17.
Madheswari, K.; Venkateswaran, N. Swarm Intelligence based Optimization in Thermal Image Fusion using Dual Tree Discrete Wavelet Transform. Quant. InfraRed Therm. J. 2015, 14, 1–20.
Aravindan, T.E.; Seshasayanan, R. Medical image DENOISING scheme using discrete wavelet transform and optimization with different noises. Concurr. Comp. Pract. E 2019, 2019, 5540.
Strömbergsson, D.; Marklund, P.; Berglund, K.; Saari, J.; Thomson, A. Mother wavelet selection in the discrete wavelet transform for condition monitoring of wind turbine drivetrain bearings. Wind Energy 2019, 22, 1–12.
Enamamu, T.; Otebolaku, A.; Marchang, J.; Dany, J. Continuous m-Health Data Authentication Using Wavelet Decomposition for Feature Extraction. Sensors 2020, 20, 5690.
He, M.; Nian, Y.; Xu, L.; Qiao, L.; Wang, W. Adaptive Separation of Respiratory and Heartbeat Signals among Multiple People Based on Empirical Wavelet Transform Using UWB Radar. Sensors 2020, 20, 4913.
Hong, Y.Y.; Cabatac, M.T.A.M. Fault Detection, Classification, and Location by Static Switch in Microgrids Using Wavelet Transform and Taguchi-Based Artificial Neural Network. IEEE Syst. J. 2020, 14, 2725–2735.
Kirar, B.S.; Agrawal, D.K.; Kirar, S. Glaucoma Detection Using Image Channels and Discrete Wavelet Transform. IETE J. Res. 2020, 2020, 1–8.
Lee, C.; Cheng, Y. Motor Fault Detection Using Wavelet Transform and Improved PSO-BP Neural Network. Processes 2020, 8, 1322.
Zhang, D.; Cai, C.; Chen, S.; Ling, L. An improved genetic algorithm for optimizing ensemble empirical mode decomposition method. J. Syst. Sci. Syst. Eng. 2019, 7, 53–63.
Fang, Y.; Guan, B.; Wu, S.; Heravi, S. Optimal forecast combination based on ensemble empirical mode decomposition for agricultural commodity futures prices. J. Forecast. 2020, 3, 1–10.
Singh, G.; Kaur, M.; Singh, B. Detection of Epileptic Seizure EEG Signal Using Multiscale Entropies and Complete Ensemble Empirical Mode Decomposition. Wireless Pers. Commun. 2020, 2020, 1–20.
Yang, Y.; Wang, J. Forecasting wavelet neural hybrid network with financial ensemble empirical mode decomposition and mcid evaluation. Expert Syst. Appl. 2020, 166, 1.
Alam, M.M.; Rehman, S.; Al-Hadhrami, L.M.; Meyer, J.P. Extraction of the inherent nature of wind speed using wavelets and FFT. Energy Sustain. Dev. 2014, 22, 34–47.
Kumara, U.; Ridder, K.D. GARCH modelling in association with FFT–ARIMA to forecast ozone episode. Atmos. Environ. 2010, 44, 4252–4265.
Li, L.; Cai, H.; Han, H.; Jiang, Q.; Ji, H. Adaptive short-time Fourier transform and synchrosqueezing transform for non-stationary signal separation. Signal Process. 2020, 166, 107231.1–107231.15.
Huang, N.E.; Wu, M.L.C.; Long, S.R.; Shen, S.S.P.; Qu, W.; Gloersen, P.; Fan, K.L. A confidence limit for the empirical mode decomposition and Hilbert spectral analysis. P Roy Soc. A Math Phys. 2003, 459, 2317–2345.
Gu, Y.S.; Wei, D.; Zhao, M.F. A New Intelligent Model for Short Time Traffic Flow Prediction via EMD and PSO–SVM. LNEE 2012, 113, 59–66.
Cheng, C.; Wei, L.Y. A novel time-series model based on empirical mode decomposition for forecasting TAIEX. Econ. Model. 2014, 36, 136–141.
Ümit, C.B.; Şeyda, E. Improving forecasting accuracy of time series data using a new ARIMA-ANN hybrid method and empirical mode decomposition. Neurocomputing 2019, 361, 151–163.
Cheng, L.; Bao, Y.; Tang, L.; Di, H. Very-short-term load forecasting based on empirical mode decomposition and deep neural network. IEEJ T Electr. Electr. 2020, 15, 1–7.
Qian, Y.; Yang, J.; Zhang, H.; Shen, C.; Wu, Y. An Hourly Prediction Model of Relativistic Electrons Based on Empirical Mode Decomposition. Space Weather 2020, 18, 1–34.
Wang, Z.X.; Zhao, Y.F.; He, L.Y. Forecasting the monthly iron ore import of China using a model combining empirical mode decomposition, non-linear autoregressive neural network, and autoregressive integrated moving average. Appl. Soft Comput. 2020, 94, 1–11.
Ye, X.; Hu, Y.; Shen, J.; Feng, R.; Zhai, G. An improved empirical mode decomposition based on adaptive weighted rational quartic spline for rolling bearing fault diagnosis. IEEE Access 2020, 99, 1–11.
Chen, B.; Qin, Q.; Zhang, X.G. Image De-Noising in Mixed Noises Based on Wavelet Transform. Adv. Mater. 2012, 562–564, 1861–1865.
Aravindan, T.E.; Seshasayanan, R. Denoising Brain Images with the Aid of Discrete Wavelet Transform and Monarch Butterfly Optimization with Different Noises. J. Med. Syst 2018, 42, 207.1–207.13.
Vago, J.L.; Vermeulen, H.C.; Verga, A. Fast Fourier transform based image compression algorithm optimized for speckle interferometer measurements. J. Nanotechnol. Eng. Med. 1997, 5, 1343–1350.
Ganjali, M.R.; Faridbod, F.; Nasli-Esfahani, E.; Larijani, B.; Rashedi, H.; Norouzi, P. FFT Continuous Cyclic Voltammetry Triglyceride Dual Enzyme Biosensor Based on MWCNTs-CeO₂. Int. J. Electrochem. Sci 2010, 5, 1422–1433.
Zhao, Y.; Jia, X.; Zhang, Y.; Peng, X. Dynamic Analysis of an Offshore Platform with Compressor Packages-Application of the Substructure Method. J. Offshore Mech. Arct 2018, 140, 041303.1–041303.10.
Yu, B.; Gabriel, D.; Noble, L.; An, K.N. Estimate of the Optimum Cutoff Frequency for the Butterworth Low-Pass Digital Filter. J. Appl. Biomech. 1999, 15, 319–329.
Benson, R.F. Ordinary mode auroral kilometric radiation, with harmonics, observed by ISIS 1. Radio Sci. 2016, 19, 543–550.
Nagano, A.; Komura, T.; Himeno, R.; Fukashiro, S. Optimal Digital Filter Cutoff Frequency of Jumping Kinematics Evaluated Through Computer Simulation. J. Sport Health Sci. 2003, 1, 196–201.
Burkhart, T.A.; Dunning, C.E.; Andrews, D.M. Determining the optimal system-specific cut-off frequencies for filtering in-vitro upper extremity impact force and acceleration data by residual analysis. J. Biomech. 2011, 44, 2728–2731.
Deng, Y.; He, G.; Kuppusamy, P.; Zweier, J.L. Deconvolution algorithm based on automatic cutoff frequency selection for EPR imaging. J. Magn. Reson. 2010, 50, 444–448.
Mahyari, A.G.; Yazdi, M. Fusion of panchromatic and multispectral images using temporal Fourier transform. IET Image Process 2010, 4, 255–260.
Li, D.; Zhang, J.; Yu, D.; Xu, R.; Lu, H.H.C.; Fernando, T. A Family of Binary Memristor-Based Low-Pass Filters With Controllable Cut-Off Frequency. IEEE Access 2020, 8, 60199–60209.
Evensen, G. Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res.-Ocean. 1994, 99, 10143–10162.
Bracewell, R.N. Fourier transform and its applications. IEEE T Power Electr. 2009, 11, 357.
Tyagi, T.; Sumathi, P. Comprehensive Performance Evaluation of Computationally Efficient Discrete Fourier Transforms for Frequency Estimation. IEEE T Instrum. Meas. 2020, 69, 2155–2163.
Pang, H.S.; Lim, J.S.; Lee, S. Discrete Fourier transform-based method for analysis of a vibrato tone. J. New Music Res. 2020, 4, 1–13.
Nam, S.R.; Kang, S.H.; Kang, S.H. Real-Time Estimation of Power System Frequency Using a Three-Level Discrete Fourier Transform Method. Energies 2014, 8, 79–93.
Shlyakhtenko, P.; Kofnov, O. Double Two-Dimensional Discrete Fast Fourier Transform for Determining of Geometrical Parameters of Fibers and Textiles. Fibers 2013, 1, 36–46.
Wood, G.A. Data smoothing and differentiation procedures in biomechanics. Exerc. Sport Sci. Rev. 1982, 10, 308–362.
Guo, J.; Williams, B.; Smith, B. Data Collection Time Intervals for Stochastic Short-Term Traffic Flow Forecasting. Transp. Res. Rec. 2007, 2024, 18–26.
Hou, X.Y.; Wang, Y.S.; Hu, S.Y. Short-term Traffic Flow Forecasting based on Two-tier K-nearest Neighbor Algorithm. Procedia Soc. Behav. Sci. 2013, 96, 2529–2536.
Yang, H. Empirical Mode Decomposition and Its Application in Water Acoustics Signal Processing. Ph.D. Thesis, Northwestern Polytechnical University, Xi’an, China, 2015.

Figure 1. The structure of a sequential data assimilation (S-DA) system [64].

Figure 2. Relationships between computation amount and calculation points for discrete Fourier transform (DFT) and fast Fourier transform (FFT) methods.

Figure 3. The original traffic flow sequence data.

Figure 4. Noises separated from the traffic flow measurements using the FFT method: (a)–(d) noises separated under four different cutoff frequencies; (e)–(h) original data and rebuilt data with noises separated under four different cutoff frequencies.

Figure 5. Flowchart of the adaptive cutoff frequency selection (A-CFS) algorithm.

Figure 6. The traffic flow signals in the frequency domain after applying the FFT method.

Figure 7. Study area: (a) paths near Birmingham; (b) part paths in Area I; (c) part paths in Area II; (d) part paths in Area III; (e) part paths in Area IV.

Figure 8. Noise separated from the traffic flow measurement data using the discrete wavelet transform (DWT) method: (a)–(d) noises separated from i-level decompositions; (e)–(h) original data and approximated data from i-level decompositions.

Figure 9. Decomposition of the original signal by the ensemble empirical mode decomposition (EEMD) method.

Figure 10. Comparison of raw traffic flow data and reconstructed data from (a) IMF2 to Residuals; (b) IMF3 to Residuals; (c) IMF4 to Residuals; (d) IMF5 to Residuals.

Figure 11. MAE and RMSE values for the cutoff frequencies from low_fs to fs: (a) values on a workday, Monday; (b) values on a non-workday, Saturday.

Figure 12. Traffic flow forecasting results on Thursday of: (a) path 568 (LM932); (b) path 2091 (AL2670); (c) path 8655 (LM168); (d) path 8314 (LM188).

Figure 13. Traffic flow forecasting results on Sunday of: (a) path 568 (LM932); (b) path 2091 (AL2670); (c) path 8655 (LM168); (d) path 8314 (LM188).

Figure 14. Average MAE, RMSE, and MAPE values of Models 1 and 2 from Monday to Sunday in (a) Area I; (b) Area II; (c) Area III; (d) Area IV.

Table 1. Data and model description.

Dataset	S1	S2	S3	S4
Data	Raw Data	F	A₂	E
Model H	Model 1	Model 2	Model 3	Model 4

Table 2. Cutoff frequencies obtained by the proposed A-CFS approach.

Day	Monday	Saturday
Cutoff frequency	0.000179	0.000145

Table 3. Measures of the four assimilation models.

		Thursday			Sunday
		MAE	RMSE	MAPE	MAE	RMSE	MAPE
path 568 (LM932)	Model 1	68.77	94.60	7.54	50.89	61.30	8.42
	Model 2	56.20 *	73.64 *	6.41 *	37.78 *	46.57 *	6.48 *
	Model 3	59.12	76.37	6.66	41.77	52.34	7.00
	Model 4	57.21	75.49	6.49	38.05	47.09	6.56
path 2091 (AL2670)	Model 1	31.36	42.62	10.53	16.32	20.81	10.87
	Model 2	27.98 *	38.47 *	9.27 *	12.06 *	15.40 *	8.86 *
	Model 3	30.17	39.64	10.02	12.60	16.05	9.14
	Model 4	29.27	39.90	9.63	12.38	15.52	8.97
path 8655 (LM168)	Model 1	86.37	117.07	9.22	56.81	87.37	7.79
	Model 2	62.31 *	78.94 *	6.72 *	44.23 *	60.94 *	6.24 *
	Model 3	67.78	91.24	7.36	47.79	70.10	6.63
	Model 4	65.88	87.31	7.10	44.84	63.00	6.50
path 8314 (LM188)	Model 1	80.85	104.48	7.45	55.87	72.53	6.77
	Model 2	64.56 *	89.30 *	6.04 *	44.42 *	55.23 *	5.88 *
	Model 3	71.46	99.14	6.56	48.92	64.08	6.21
	Model 4	66.91	92.70	6.21	46.56	56.49	6.12

* Shows the best performance.

Table 4. Average mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE) values of paths in Areas I–IV from Monday to Sunday of the four assimilation models.

		MAE				RMSE				MAPE
		Model 1	Model 2	Model 3	Model 4	Model 1	Model 2	Model 3	Model 4	Model 1	Model 2	Model 3	Model 4
Area I	Mon.	33.91	33.01 *	34.48	33.16	48.71	47.71 *	49.82	48.00	5.74	5.66 *	6.03	5.71
	Tues.	44.80	42.65 *	43.80	42.87	62.87	57.67 *	61.09	57.90	7.08	6.80 *	7.05	6.82
	Wed.	43.28	40.25 *	40.65	40.33	58.19	54.39 *	54.59	54.45	6.66	6.27 *	6.42	6.35
	Thur.	38.58	36.10 *	36.64	37.68	52.78	48.58 *	50.92	50.34	5.82	5.61 *	5.81	5.78
	Fri.	42.06	36.11 *	38.32	36.39	54.14	46.64 *	49.75	47.27	5.90	5.34 *	5.71	5.38
	Sat.	33.19	30.01 *	32.08	31.56	42.69	36.70 *	40.27	37.31	6.52	6.21 *	6.46	6.25
	Sun.	27.13	23.96 *	25.00	24.16	33.89	29.59 *	30.96	29.66	6.48	5.86 *	6.08	5.98
	Mean	37.56	34.59 *	35.85	35.16	50.47	45.89 *	48.20	46.42	6.31	5.96 *	6.22	6.04
Area II	Mon.	58.91	46.10 *	47.51	47.04	78.89	62.64 *	64.66	63.36	10.18	7.82 *	7.93	7.93
	Tues.	62.47	54.89 *	54.93	55.65	87.66	76.74 *	78.02	77.69	10.75	9.21 *	9.34	9.36
	Wed.	59.55	51.46 *	52.62	51.87	79.87	70.24 *	72.63	70.74	9.62	8.38 *	8.50	8.44
	Thur.	71.93	57.72 *	59.82	59.95	95.76	75.38 *	81.03	78.39	11.18	9.25 *	9.42	9.31
	Fri.	53.44	44.68 *	46.47	44.82	69.35	58.21 *	60.40	58.31	8.69	7.28 *	7.65	7.32
	Sat.	30.18	24.07 *	24.77	24.75	37.33	29.45 *	30.70	30.14	8.88	6.70 *	6.81	6.76
	Sun.	36.02	27.56 *	28.40	27.62	45.68	33.54 *	34.60	33.71	9.68	7.54 *	7.88	7.57
	Mean	53.21	43.78 *	44.93	44.53	70.65	58.03 *	60.29	58.91	9.85	8.03 *	8.22	8.10
Area III	Mon.	81.00	55.78 *	69.34	58.31	131.82	90.20 *	117.88	91.11	13.47	9.77 *	11.47	9.95
	Tues.	64.97	46.94 *	53.44	48.58	100.76	67.74 *	84.26	70.10	11.31	8.28 *	9.14	8.33
	Wed.	56.67	42.77 *	45.62	43.13	73.06	55.24 *	60.14	57.29	9.55	7.22 *	7.46	7.27
	Thur.	71.58	51.01 *	55.96	53.19	101.73	70.61 *	81.62	73.61	11.47	8.22 *	8.99	8.49
	Fri.	69.34	48.44 *	51.46	49.60	97.22	63.93 *	76.17	68.16	10.68	7.44 *	8.27	7.57
	Sat.	33.81	24.18 *	27.13	24.81	41.71	29.36 *	32.83	29.70	9.73	6.73 *	7.21	6.90
	Sun.	47.99	34.32 *	38.55	36.23	74.82	49.20 *	60.75	52.12	10.74	7.52 *	8.31	8.01
	Mean	60.77	43.35 *	48.79	44.84	88.73	60.90 *	73.38	63.16	10.99	7.88 *	8.69	8.07
Area IV	Mon.	54.16	43.41 *	44.06	44.25	72.35	60.22 *	62.38	61.60	9.82	7.85 *	7.99	7.93
	Tues.	50.79	40.63 *	41.47	40.74	68.39	57.58 *	61.15	58.66	9.78	7.49 *	7.78	7.83
	Wed.	64.36	52.46 *	54.69	52.98	98.81	74.96 *	83.71	79.51	12.29	9.76 *	10.23	9.81
	Thur.	53.37	41.53 *	43.96	42.15	70.03	57.37 *	61.87	59.27	9.24	7.19 *	7.43	7.34
	Fri.	49.39	39.29 *	40.07	39.87	65.48	52.86 *	56.10	53.75	7.95	6.23 *	6.26	6.46
	Sat.	33.44	25.86 *	26.66	26.16	41.66	31.76 *	33.26	32.08	8.51	6.78 *	7.07	6.90
	Sun.	32.80	27.16 *	28.82	27.43	43.78	34.40 *	37.50	35.04	8.29	7.04 *	7.40	7.16
	Mean	48.33	38.62 *	39.96	39.08	65.79	52.74 *	56.57	54.27	9.41	7.48 *	7.74	7.63

* Indicates the best performance (lowest error) in each row.

Table 5. Average MAE, RMSE, and MAPE values over all paths for the four assimilation models, from Monday to Sunday.

	MAE				RMSE				MAPE
	Model 1	Model 2	Model 3	Model 4	Model 1	Model 2	Model 3	Model 4	Model 1	Model 2	Model 3	Model 4
Mon.	39.40	32.25 *	33.71	33.56	53.62	44.57 *	47.39	45.48	10.91	8.89 *	9.08	8.95
Tues.	39.71	33.44 *	34.30	35.53	53.70	45.93 *	47.68	46.29	10.93	9.08 *	9.18	9.15
Wed.	40.98	34.19 *	35.08	34.96	55.50	46.86 *	48.76	47.29	10.80	8.94 *	9.23	9.07
Thur.	41.85	34.22 *	35.63	34.46	56.28	46.19 *	48.91	47.22	10.67	8.92 *	9.19	9.12
Fri.	40.53	31.50 *	33.18	32.81	53.58	42.27 *	45.29	43.97	9.97	7.97 *	8.39	8.16
Sat.	26.97	21.11 *	21.71	22.15	34.83	26.68 *	28.16	28.88	10.45	8.12 *	8.23	8.28
Sun.	26.68	20.80 *	21.47	22.71	35.84	27.24 *	29.14	28.31	11.74	9.31 *	9.72	9.97
Mean	36.59	29.64 *	30.73	30.88	49.05	39.97 *	42.19	41.06	10.78	8.75 *	9.00	8.96

* Indicates the best performance (lowest error) in each row.
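
As a reading aid for the error measures reported in Tables 4 and 5, the following is a minimal sketch of MAE, RMSE, and MAPE using their standard textbook definitions. The function names and the three sample flow values are hypothetical and are not taken from the Birmingham dataset. MAPE is assumed to be expressed in percent, and the observed values are assumed to be non-zero.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between observed and forecast values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error between observed and forecast values."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (observed values must be non-zero)."""
    ratios = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * ratios / len(actual)


# Hypothetical traffic flow values, for illustration only.
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 420.0]

print(round(mae(observed, forecast), 2))
print(round(rmse(observed, forecast), 2))
print(round(mape(observed, forecast), 2))
```

Because RMSE squares each error before averaging, it penalizes large deviations more heavily than MAE, which is consistent with the RMSE columns being larger than the MAE columns throughout the tables.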

Table 6. Relative improvements of Model 2 over the other three models from Monday to Sunday (%).

	MAE			RMSE			MAPE
	Model 1	Model 3	Model 4	Model 1	Model 3	Model 4	Model 1	Model 3	Model 4
Mon.	18.15	4.33	3.90	16.88	5.95	2.00	18.52	2.09	0.67
Tues.	15.79	2.51	5.88	14.47	3.67	0.78	16.93	1.09	0.77
Wed.	16.57	2.54	2.20	15.57	3.90	0.91	17.22	3.14	1.43
Thur.	18.23	3.96	0.70	17.93	5.56	2.18	16.40	2.94	2.19
Fri.	22.28	5.06	3.99	21.11	6.67	3.87	20.06	5.01	2.33
Sat.	21.73	2.76	4.70	23.40	5.26	7.62	22.30	1.34	1.93
Sun.	22.04	3.12	8.41	24.00	6.52	3.78	20.70	4.22	6.62
Mean	19.26	3.47	4.25	19.05	5.36	3.02	18.88	2.83	2.28
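
The daily entries in Table 6 can be reproduced from Table 5 by taking the error reduction of Model 2 relative to each comparison model. The sketch below assumes the formula 100 × (E_other − E_Model2) / E_other, which matches the tabulated values. The function and variable names are illustrative, and only the Monday row of Table 5 is used.

```python
def relative_improvement(other_error, model2_error):
    """Percentage by which Model 2 reduces the error of another model."""
    return 100.0 * (other_error - model2_error) / other_error


# Monday row of Table 5 (average over all paths).
table5_monday = {
    "MAE": {"Model 1": 39.40, "Model 2": 32.25, "Model 3": 33.71, "Model 4": 33.56},
    "RMSE": {"Model 1": 53.62, "Model 2": 44.57, "Model 3": 47.39, "Model 4": 45.48},
    "MAPE": {"Model 1": 10.91, "Model 2": 8.89, "Model 3": 9.08, "Model 4": 8.95},
}

monday_improvements = {}
for metric, errors in table5_monday.items():
    monday_improvements[metric] = {
        model: round(relative_improvement(error, errors["Model 2"]), 2)
        for model, error in errors.items()
        if model != "Model 2"
    }

for metric, row in monday_improvements.items():
    print(metric, row)
```

Running this gives 18.15, 4.33, and 3.90 for MAE, which agrees with the Monday row of Table 6. The Mean row of Table 6 appears to be the average of the seven daily percentages (for example, 19.26 for MAE against Model 1). Applying the same formula directly to the Mean row of Table 5 would give a slightly different figure, about 18.99.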

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
