Noise Reduction for Nonlinear Nonstationary Time Series Data using Averaging Intrinsic Mode Function

A novel noise filtering algorithm based on averaging Intrinsic Mode Function (aIMF), which is a derivation of Empirical Mode Decomposition (EMD), is proposed to remove white-Gaussian noise of foreign currency exchange rates that are nonlinear nonstationary times series signals. Noise patterns with different amplitudes and frequencies were randomly mixed into the five exchange rates. A number of filters, namely; Extended Kalman Filter (EKF), Wavelet Transform (WT), Particle Filter (PF) and the averaging Intrinsic Mode Function (aIMF) algorithm were used to compare filtering and smoothing performance. The aIMF algorithm demonstrated high noise reduction among the performance of these filters.


Introduction
Kalmam Filter (KF) was conceptualised for use in a linear system.In a nonlinear filter, Extended KF (EKF) requires Jacobian mappings, which can be computationally intensive if the vector measurement is high.However, limitations in computer processing may be a problem [1,2].In brief, an EKF can estimate a highly non-stationary data series, with a known state space model incorporated along with EKF [3].
Sanjeev et al. explained that Particle Filter (PF) has received much attention in various fields over the past decades [4].The basic principle of PF is to use a set of weighted samples, also known as particles, to approximate the posterior probability of a time-varying signal of interest, given related observations.PFs generalize traditional KFs and can be applied to nonlinear and non-Gaussian state-space models.Similar to EKF, the PF algorithm is a two-state approach, i.e., prediction and correction; and is a technique for implementing recursive Bayesian filters by Monte Carlo sampling.In summary, PF represents the posterior density using a set of random particles with associated weights and requires a large number of particles.The most common applications of PFs are in the areas of image processing and segmentation, model selection, tracking and navigation, channel estimation, blind equalization, positioning in wireless networks, biochemistry, economics, finance, geosciences, immunology, materials science, pharmacology, and toxicology.Theoretically, the advantages of PF over EKF are in the representation of nonlinear functions because optimal estimation also uses nonlinear non-Gaussian state-space models [5][6][7].
In 1996, Huang and Okine [8] developed a posteriori algorithm for analysing nonlinear and nonstationary datasets using EMD, which is known as the Hilbert-Huang transform (HHT).It was applied in many applications such as bioinformatics [6,7], signal processing [5,8,9], geophysics [10,11], and finance [8,9,12].It was claimed that the HHT model could replace the narrow-band filter technique proposed by Hilbert-Gabor, when Fourier analysis was a vital tool for signal detections in the early days.EMD is based on the local time scale of the signal, which is decomposed using a sifting process [13,14].The principle of Wavelet Transform (WT) is that an input signal is split into various small waves, which are compact or are functions with finite length.In recent years, a few studies have used a method that combines EMD and WT to reduce nonlinear, nonstationary signals such as electrocardiography (ECG) signals.However, there is no end solution to determine the stop criteria of the EMD process.A comparison of WT and HHT shows that both have similar time and frequency distributions using amplitude as the common axis, where the analytical results for WT indicate a number of deficiencies such as harmonics and small spikes in the frequency scale.Thus, WT may not be suitable for the analysis and capture of large volumes of data [15].This paper presents a new novel digital filter termed -aIMF‖ which is a derivative of EMD.The aIMF algorithm is used to reduce a noise associated with nonlinear nonstationary time series data sets (exchange rates).To verify the performance of the proposed aIMF digital filter, we compare it with WT, EKF and PF.The result shows the aIMF algorithm outperforms those mentioned filters.Section 2 presents theoretical considerations of EMD, aIMF, WT, EKF and PF.The simulation and results including robustness test are presented in Section 3, whereas conclusion and discussion are in Section 4.

Theoretical Considerations of the Digital Filter
Referring to Section 1, we propose theoretical considerations of existing algorithms using for filtering nonlinear nonstationary times series data; and those are, WT, EKF, PF, and our proposed aIMF, which is a derivative of EMD .

EMD Algorithm
The key part of the HHT algorithm is EMD because any complex dataset can be decomposed into a finite that admits a well-behaved Hilbert transform.Because the decomposition is based on the local characteristic time scale of the data, it is applicable to nonlinear and nonstationary processes [8].During signal processing, high frequency noise from the input data may be considered as different simple intrinsic mode oscillations [8].In terms of signal processing, the EMD algorithm is viewed as an a posteriori method based on adaptive characteristic scale separation.This process is useful when the input signal oscillation is nonlinear and/or nonstationary.EMD is not suitable for sampling using a fixed time slot in the time series because the local mean of a signal is defined by enveloping without resorting to any time scale.Technically, EMD employs a sifting process and a cubic spline technique for smoothing and filtering a signal [8].The cubic spline interpolation is applied as a two-sided filter, which improves the confidence interval of the dataset distribution.
HHT is designed to decompose nonlinear and nonstationary signals, especially those with high volatility and fast frequency changes.The signal is adaptively decomposed into components and, after the decomposition of each step, the transformed signal is known as an Intrinsic Mode Function (IMF), which includes the different time scales intrinsic to the signal.IMF is a mono-component and can be transformed into an instantaneous frequency.IMF signals have also been defined as a function with a zero mean and with as many zero crossings as maxima or minima.The HHT process is divided into two parts: EMD and Hilbert Spectral Analysis (HSA).Figure 1 shows the EMD mechanism using a sifting process.It starts by identifying the maxima and minima points, extracting them, interpolating the maxima and minima envelopes, and computing the means of the local envelope.A nonlinear and nonstationary time series dataset is denoted as x i (t).The IMF, c i (t) is defined under the conditions that: (i) The numbers of extrema (maxima plus minima) and zero crossings in the entire data series must either be equal to or differ by at most one; and (ii) At any point, the mean value of the envelope defined by the local maxima and that defined by the local minima, must be zero [10].These conditions are met and where nU i (t), nL i (t) and nZ i (t), are the values of the maxima (upper peak), minima (lower peak) and zero crossing, of the EMD respectively.The EMD algorithm can be represented as follows.
(1) When constructing the upper and lower envelopes, we calculate the parabola coefficients of ax 2 + bx + c using x(k − 1), x(k), (k + 1); (2) If the second-degree coefficient, a, equals zero, is x(k) is certainly not an extremum so the sliding window moves further on a discrete value of x(k); (3) If the first-degree coefficient, b, equals zero, x(k) is an extremum, either a maximum or minimum, depending on the sign.We then calculate the top of this parabola by introducing ; similarly, applying to the bottom part of the parabolic curves; (4) Repeat ( 2) and (3) and stop after executing x(k + n).
Step 2: Average the maxima and minima in order of the time series, which is represented as Step 3: Subtract the x i (t) from the average of the local extrema (maxima and minima) m i (t) in order of the time series, and the new decomposed signal is Step 4: Repeat steps (i) through (iii) k times until 1 () k ht equals c 1 (t).Using Equation (5) where , we designate c 1 (t) as the first IMF.
Step 5: Find other IMFs by calculating the first residual, which is given by We derive 234 ( ), ( ), ( ),..., ( ) IMFs are narrowed band, zero-mean signals and the signal is decomposed into k IMFs by EMD, so each IMF is located in lower frequency regions in the time-frequency domain than the lagged signal.EMD can act as a dyadic filter bank for noise [16] and can be expressed as follows.
Since HHT has formed forms another function, HSA, it can be used as a tool for the time-frequency analysis of nonlinear and nonstationary datasets to present the results as a time-frequency-energy distribution [12].For a comprehensive explanation of the Hilbert transform, refer to Cohen [16].Given a real signal , the analytic signal is defined as (10) where is the Hilbert transform defined using the Cauchy principle value denoted by p.v. of such that We can define the complex signals, amplitude A(t) and phase as follows [8] ( where denotes real part of the complex quantity. Using Equation (14) and performing the Hilbert transform on each IMF c n (t), the analytical data can be expressed in terms of the Hilbert amplitude and instantaneous frequency, as follows: As the HHT model comprises two main parts: EMD and HSA, the EMD process under a cubic spline and sifting technique decomposes the original signals into IMF.HSA calculates the instantaneous frequency using the Hilbert transform and analyses the entire time-varying instantaneous spectrum.It is crucial to mention that the analysis of the Hilbert spectrum is conducted to view the spectrum after the EMD process is finished.

Creating the aIMF Algorithm
Kaslovsky and Meyer [17] explained that a meaningful instantaneous frequency (IF) decomposed by the EMD algorithm must be nearly monochromatic, of which is a condition that is not guaranteed by the algorithm and fails to be met clearly when the signals is corrupted by noise.Several reports demonstrated that the EMD performance is likely to be sensitive and produces a large quantity of noise [12,18].Moreover, the accuracy of HHT analysis suffers from several mathematical and numerical effects that require further studies.This is mainly because HHT signals emerged from EMD algorithm are not shift-invariant in times (stationary) and is likely be full narrow band [18].To reduce noise, one of the solutions was to normalise the analytic signal prior to Hilbert transform [8,19].There were many studies to improve the HHT algorithms, and those at least are; adoption of wavelet packet transform as the pre-processes to decompose the signals into a set of narrow-band signals prior to the application of the EMD [20,21], and an ensemble empirical mode decomposition (EEMD) which consists of shifting an ensemble of white-added noise signals and treats the mean as a final true result, using a windowed average.One of the publications of the EEMD was used to add noise to provide stoppage criteria for the EMD process [9].Kaiser [22] mentioned that Fourier spectral analysis and its derivatives such as WT encountered with a limited time window width by sliding a predetermined window(s) along the time axis.Moreover, there is a trade-off between the time consumed in the window width and the frequency resolution, and this phenomenon has been considered by the uncertainty principle Heisenberg [21].In the WT, the window width must be preselected and it is known as the primary or mother wavelet which is a prototype that provides less flexibility when handling datasets where the mean and variances are highly volatile.
It is imperative to restate that two conditions of EMD process defined in Section 2.1.However, with more iteration, the more residual r n (t) becomes either over-distorted or a monochromatic function from which no further IMF can be decomposed [11].The other study confirmed that the real complex quantity of all IMFs decomposed i.e., IMF1 to IMF7, which are in the amplitude-frequency domain, shifted to the lower region [19].
To reduce noise produced in the EMD process, we propose to extract noise spreading to all of the spectrums by averaging all IMFs and subtract them from the original signals in which is represented by 1 () () Next, we subtract the averaged IMF in Equation ( 16), c a (t), from the original nonlinear, nonstationary times series datasets, x n (t).Thus, a new digital filter is created, which is given by aIMF where y n (t) is a function of the aIMF algorithm.
In Figure 2, the aIMF process starts by inputting the nonlinear, nonstationary times series data, Euro and US dollars (EUR-USD), exchange rates into the EMD process.The next block uses the results from the EMD process whereas the aIMF algorithm begins in the third block where all of the IMFs are averaged and the averaged IMFs are subtracted from the original datasets.Referring to Figure 2, the proposed algorithm aIMF has an advantage which provides stoppage criteria once the two conditions of the EMD process defined in Section 2.1 are accomplished.This is because with more iteration, the more residual r n (t) becomes either over-distorted or a monochromatic function from which no further IMF can be decomposed.

Theoretical Considerations of the WT Algorithm
Helong et al. [23] applied the WT of a signal , which was early defined in [21] as follows where WT represents the calculated coefficients, and are the translation parameters and scale, respectively, is the transforming function (mother wavelet) and the bar over indicates its complex conjugate.A wavelet is classified as an adaptive algorithm, which is used in many fields such as astronomy, acoustics, nuclear engineering, signal processing and the prediction of earthquakes by solving partial optimized equations and reducing the random noise [15].A factorization of f at different resolution levels is defined by where represents the information in the signal on the coarsest level, is the scaling function and represents the details (wavelet coefficients) at the different scales necessary to reconstruct the function at the fine scale 0, at which the wavelet and scaling functions are compactly supported.The next step is to introduce the thresholding technique that is used to remove noise from each local set of , which are normally affected at different levels of the scale j.This is given by Li [24] where is the indicator and is the thresholding value.To find the thresholding value , we introduce factorizing and we obtain , which approximates .The error (risk) between and its approximation is given by In terms of the wavelet coefficients under Parseval's identity [24], the transform shown in Equation ( 20) can be expressed as Applying Equations ( 19) and ( 21) with respect to any resolution level, j = 1, 2, ..., J, we use Stein's principle [25] to minimize the risk in Equation (21).The thresholding value is finally obtained as )

Theoretical Considerations of the EKF Algorithm
In a nonlinear system, however, KF can be enhanced using a Taylor series to expand the state Equation ( 23) and output Equation ( 24) around a nominal state, known as a linearized KF.A summary of the linearized KF algorithm is (23) (24) where is the state of the system, denoted for the next time series of the original datasets x k ; k is the time index; u k is the driving function that may call a signal control or distribution function; is a noise, independent and identically distributed (i.i.d.) N(0,Q), where Q is the covariance (matrix) of the state; and is another noise, i.i.d.N(0,R), where R is the covariance (matrix) of the measurement of noise; is the measured output; and f(.) and h(.) are the state equation and output equation, respectively.The state and output functions in the case are nonlinear functions.Thus, Equations ( 27) and ( 28) are nominal states that are known (predicted) ahead of time, which are represented by where is the nominal state of the system and is the nominal measured output state.
During each step, we compute the partial derivative matrices of A k and C k with respect to x k , and we obtain in the following equations (27) (28) where A and C are matrices.Next, we define as the difference between the actual measurement and the nominal measurement , which is given by In this state, the following linearized KF equations can be executed as follows: (30 (32) where K k is the Kalman gain, P k is the covariance of the error of the estimation and I is the identity matrix.
In the linearized KF, there is a limitation that the nominal state must be set prior the execution.EKF then assumes that equals in the bootstrapping approach to the state equation.Thus, Equations ( 27) and ( 28) are subsequently changed to In Equation ( 31 In the initial state where k = 0, is predetermined using its means and Q k and R k are relevant to .In the execution mode, the measurement update (output state) adjusts the projected estimate based on an actual measurement at that time.It should be noted that EKF applies Equations (30), (31) and (37) to update the prediction mode after the state is changed periodically and the Kalman gain K k determines how the observer responds to the difference between its estimated output and the noisy measurement.
To simplify this, we can present EKF as the system block diagram shown in Figure 3.

Theoretical Considerations of the PF Algorithm
The theoretical considerations of the PF model start by considering a hidden Markov model (HMM) that observes the outputs x i indirectly using state y i , and we can specify a simple model as follows The goal is to estimate x k , which is equivalent to x i the original datasets, given all observations up to point (y 1:n ).Alternatively, we need to find the posterior distribution of .Using Bayes, we end up with two steps as follows: (i) Update step: (ii) Prediction step: We often find that these distributions were intractable, especially the nonlinear and Gaussian model in closed form.The solution is to approximate the distributions using a large number of samples (particles).To address the problems in Equations ( 41) and (42), it might not be too difficult to appropriate the intractable integrals appearing in those equations directly, where the alternative is to use important sampling or sequential importance sampling (SIS).The advantage of SIS is that it does not guarantee to fail as t increases and it becomes more and more skewed, especially when sampling high-dimensional spaces [26].
Thus, we propose a bootstrap PF, which is an iterative method of Bayesian inference for a dynamic state space.The algorithm of the bootstrap PF model is described as follows: (1) Assume is the posterior probability distribution at k − 1 where the transition state (state equation) is (2) Resample , which is the prior probability distribution at k − 1 using the bootstrap algorithm (3) Find a weight by Monte Carlo integration using and to obtain, 3), use to estimate the particle at k − 1 and obtain   (5) Update the likelihood function with and .
(6) From ( 5) use the new updated likelihood to update , which is the posterior probability distribution at k using a normalised weight from time to time.The weight can either be the averaging of all the weights or the last weight only.(7) Finally, we obtain the posterior probability distribution at k.

Simulation and Results
The aim of this section is to compare the performances of all the proposed digital filters and recommend the best fit filter.We use R Programming to simulate the original datasets that has a suite of noise added to it.After completing the simulations, we applied a variety of loss estimations, i.e.,

Adding White-Gaussian Noise
We created six sets of sine wave which is i.i.d.N(0, 2  ) for the frequencies in Hz of 1, 5 and 10 with the amplitude of 10% and 20% of the original datasets, at 100% random distribution.Those parameters were applied to the exchange rates data points x k (t) in time series.The distribution of these added noise are similar to white-Gaussian noise as shown in Figure 6.For ease of presentation, Figure 6 shows the x-axis representing just 100 data points out of the total 2322 data points in the time series, whereas the y-axis is representing the amplitude.We introduced a variety of testing methods, namely; (a) Anderson-Darling, Lilliefors (Kolmogorov-Smirnov) and Pearson chi-square for nonlinear test; and (b) Augmented Dickey-Fuller and Elliott-Rothenberg-Stock for nonstationary test.Referring to Equation (23) x k represented datasets, EUR-USD, exchange rates, in which their nonlinear and nonstationary characteristics were verified by all the testing methods; and w k represented the six sets of sine wave created in Section 3.1.The equation which used to add white Gaussian noise to the original datasets can be rearranged as follows where x kg (t) is the original signal with the noise, x k (t) is the original datasets and () kg t  is the added noise with i.i.d.N(0,1) at section of various frequencies in Hz of 1, 5 and 10 with 100% of random distribution of x k (t).To verify that the Equation ( 43) is nonlinear equation, we tested the x kg (t) with the different parameters of the noise added.As per the results, all of the p-value generated by all the testing methods mentioned earlier were less than 0.05.This served to imply that the characteristics of x kg (t) was nonlinear and nonstationary.Later, x kg (t) was used as input signal to estimate the performance of the following algorithms: (i) aIMF, (ii) WT, (iii) EKF and (iv) PF.

Simulation and Performance Measurements of the aIMF Algorithm
The objective of this section is to measure the performance of the aIMF algorithm compared with the original datasets, i.e., the EUR-USD exchange rates.To test the performance of the proposed aIMF algorithm when filtering the original signals with white Gaussian noise, we applied a variety of loss estimators, i.e., MSE, MAE, MAPE, R 2 and Accuracy count.
At the beginning, we tested the five exchange rates i.e., EUR-USD with Normality and Unit Root tests and found that they are nonlinear and nonstationary time series.Next, we simulated the aIMF algorithm using an R Programming that comprised C++ scripts, which followed the logic in Section 2.2. Figure 7 shows that we produced a new signal, which was filtered by the aIMF algorithm.Having simulated the aIMF algorithm with all noise parameters mentioned in the early of Section 3.1, the results are in the same magnitude and directions.To simplify the graph presentation, we selected the x-axis representing the data points in the time series of overall data range, and two sets of y-axis representing the original datasets, EUR-USD exchange rates (Y2) and the original datasets + noise (Y1) of which shows only the noise's plot of 1 Hz at 10% amplitude (Y1) with 100% random distribution.We analysed the plots and found that the aIMF curve resided within the curve of -the original datasets + noise‖.The trend of those three plots was in the same direction.In terms of estimation, Tables 1-6 indicate that all loss estimators, namely; MSE, MAE, and MAPE are relatively low ranging from 0.00296-0.00596,0.04838-0.06618and 4.08815-5.59678,respectively.
The R 2 is in the high ranging, from 0.8533-0.9228,whereas, AIC and BIC are high, up to −6305.47 and −6974.09,respectively.The Accuracy count showed 57.69%-79.66%.As a result, the performance of the aIMF is acceptable.
We continued the simulations using 5% of random distribution of the added noise instead of 100%.The results showed that the less numbers of the random noise distributed the better performance of the aIMF algorithm.For example, at 5% random distribution of the added noise at 1 Hz with 10% amplitude, the loss estimators, namely; MSE, MAE, MAPE, R 2 , AIC, BIC and Accuracy count were 0.00014, 0.01019, 0.88981, 0.99820, −15723.93,−15706.68,and 95.86%, respectively; and it outperformed those of simulations with 100% random distribution.

Simulation and Performance Measurements of the WT Algorithm
The objective of this section was to measure the performance of the WT algorithm compared with the original datasets, i.e., the EUR-USD exchange rates using the same simulating model and loss estimators employed in Section 3.2.
Similar to Figure 7, Figure 8 shows three plots, i.e., the original datasets (Y2), the -original datasets + noise‖ (Y1) and -WT‖ (Y1).We analysed the plots and found that the WT curve resided within the curve of -the original datasets + noise‖ except a spike occurred at between the data 1112nd to 1213rd rank of the x-axis.The trend of those three plots was in the same direction.In terms of estimation, Tables 1-6 indicate that all loss estimators, namely; MSE, MAE, MAPE are relatively low, ranging from 0.00506-0.00599,0.05402-0.07660and 4.7715-5.79883,respectively.The R 2 is in the below standard ranging from 0.85340-0.8948,whereas, AIC and BIC are relatively high, up to −6890.19 and −6286.15,respectively; whereas the Accuracy counts showed 53.82%-56.13%.As a result, we rated the performance of the WT algorithm is relatively low and hardly acceptable.

Simulation and Performance Measurements of the EKF Algorithm
The objective of this section is to measure the performance of the EKF algorithm compared with the original datasets, i.e., the EUR-USD exchange rates using the same methods employed by Section 3.2.
Similar to Figure 7, Figure 9 shows three plots, i.e., the original datasets (Y2), the -original datasets + noise‖ (Y1) and EKF (Y1).We analysed the plots and found that the EKF curve resided within the curve of -the original datasets + noise‖ except a spike occurred at the beginning of the x-axis.The trend of those three plots was in the same direction.In terms of estimation, Tables 1-6 indicate that all loss estimators, namely; MSE, MAE, MAPE are relatively low ranging from 0.00406-0.01870,0.05546-0.12765and 4.69540-10.7989,respectively.The R 2 is in the ranging from 0.6637-0.9017,whereas, AIC and BIC are high, up to −6429.89 and −6412.64,respectively, and the Accuracy count showed 47.13%-51.66%.As a result, the performance of the EKF algorithm is unacceptable.

Simulation and Performance Measurements of the PF Algorithm
The objective of this section is to measure the performance of the PF algorithm compared with the original datasets, i.e., the EUR-USD exchange rate using the same methods employed by Section 3.2.
Similar to Figure 7, Figure 10 shows three plots, i.e., the original datasets (Y2), the -original datasets + noise‖ (Y2) and PF (Y1).We analysed the plots and found that the PF curve did not agree with the curve of -the original datasets + noise‖ and the original dataset's, inasmuch as fluctuations of those three plots were not the same.The loss estimations emerged by varieties of noise parameters, in terms of estimation, Tables 1-6 indicate that all loss estimators, namely; MSE, MAE, MAPE are relatively low ranging from 4.43079-4.52658,1.64377-1.67868and 139.488-141.868,respectively.The R 2 is unacceptable with the ranging of 0.0049-0.096,whereas, AIC and BIC are relatively too low i.e., −1067.39 and −1046.02,respectively, whereas the Accuracy count showed 49.76%-50.58%.As a result, the performance of the PF algorithm is also unacceptable.

Discussion
Based on the simulations of the algorithms namely; aIMF, WT, EKF and PF, we have found that the aIMF performed the best, following in the order to WT, EKF and PF.Theoretically, the successful application of EMD resides on the fact that the noise is not biased.Therefore, there is not so much of a restrictive constraint, comparing to the scenario of encountering with non-zero mean noise.As mentioned, the characteristics of IMF after several iterations move towards the normal distribution; see Figure 11.Thus, subtracting the averaged IMF with the original signals, given aIMF, can reduce the noise inevitably.However, the proposed aIMF algorithm using cubic spline interpolation does not intend to preserve edges of the datasets/signals.This is because of our target to reduce noise that may associate with the upper and lower boundaries of the curves, which are in time series domain.In this particular case, preserving the edges/curves can unavoidably keep the noise mixing within the signals.Unlike the images whose distribution is random walk, the noise reduction can be achieved while the edges are preserved [27].Referring to Section 3.1, we manually added a variety of noises into the datasets with separate simulations; and later proved that the noises have been removed significantly, displayed in Figure 7 and Tables 1-6.To continue to prove the aIMF algorithm's performance, we simulated the aIMF algorithm with the original datasets-without adding extra noise.The results measured by MSE, MAE, MAPE, R 2 and Accuracy count were 8.20211E−05, 0.00719, 0.57085, 0.9980 and 99.95%, respectively.It is noted that the original datasets, exchange rates, contained a certain level of noise, not pure signal only.In the real application, data of exchange rates are normally fluctuated before closing hours of trading by retailers and speculators who want to manipulate the price.The manipulations are always executed with low volumes of trade, but enable the price changes at the end of trading hours.In the financial community, we deem these trades as noise.Finally, the figures from the last loss estimators shown were similar to the results in Tables 1-6.Hence, it is safe to assume that the proposed aIMF algorithm can remove the unwanted signals.

(c)
On another front, the simulation results from the WT algorithm seemed to be unacceptable under the rationale that Fourier spectral analysis and its derivatives such as WT encountered with a limited time window width by sliding a predetermined window(s) along the time axis [28].Moreover, there is a trade-off between the time consumed in the window width and the frequency resolution, and this phenomenon has been considered by the uncertainty principle Heisenberg [21].In this particular case, the WT's window width must be preselected and it is known as the primary or mother wavelet which is a prototype that provides less flexibility when handling datasets were the mean and variances are highly volatile.In case of EKF simulation, the advantage of SIS is that it does not guarantee to fail as time increases and it becomes more and more skewed, especially when sampling high-dimensional spaces [26].For the PF algorithm, we have found that the number of particles (data-points) was not adequate for Monte Carlo simulation.However, the drawback of the aIMF algorithm is that it requires long time to spline the local extrema.

Robustness Test
Robustness testing is any quality assurance methodology focused on testing the consistent accuracy of software.In this section, we test the algorithms of aIMF, WT, EKF and PF which function as noise reduction models.The testing strategies used different inputs other than the EUR-USD exchange rates with the added noise, which are created from 1 Hz sine wave at 10% amplitude of the original datasets with the 10% random distribution.Those different inputs are EUR-JPY with the added noise, EUR-CHF with the added noise, finally, EUR-GBP with the added noise.Later, we used loss estimators to measure the prediction performances of the proposed aIMF, WT, EKF and PF, i.e., MSE, MAE, MAPE, R 2 , AIC, BIC and Accuracy count.Having simulated with all the loss estimators indicated in Tables 7-9 under the same conditions used to test for EUR-USD as input, the results shared the same trend with few deviations from each other.This served to confirm that the aIMF algorithm performed significantly better when filtering a nonlinear nonstationary time series, i.e., EUR-JPY, EUR-CHF and EUR-GBP exchange rates, followed by WT and EKF algorithms.Moreover, we rejected using the PF algorithm to reduce the noise for the nonlinear nonstationary time series data.Additionally, we have discovered that the EKF and WT algorithm must be optimised in the areas of resampling and building up mother wavelet, respectively.The following configurations were used to perform all the simulations: (i) Intel(R) Xeon(R) server with 2 × 2.4 GHz E5620 CPUs, 3.99 GB RAM and a 64-bit Microsoft Windows Operating System is configured as the main processor.(ii) Sony Visio, Sony L2412M1EB Desktop with an Intel Core i5, 2.5 GHz, 8 GB RAM, and a 64-bit Microsoft Windows Operating System is used as the front-end connection to the data terminal from Bloomberg via web access using a Citrix client.(iii) Application programs written using R programming scripts and some amendments to suit the requirements.
The simulation results showed that there were no bugs in the software scripts, and an average execution time of 3 s for all the Ordinary Least Square (OLS) models.

Conclusion
Noise reduction for a nonlinear nonstationary time series is challenging since the models require a large amount of computational power and more complicated logic than conventional filters.This paper proposed a new filter, the aIMF algorithm, which demonstrated its accuracy and robustness, compared with the WT, EKF and PF algorithms.In this study, we discovered that PF is not suitable for this kind of work.Additionally, the EKF and WT algorithms should be optimised.It may be because the number of particles for sampling and weights were not enough albeit the number of data-points used were more than 2000.Future work to enhance the efficiency of the aIMF algorithm are in the area of (i) optimising the cubic spline algorithm to be more suitable to the input which sometimes persisted to its mean, and (ii) investigating the possibility of designing a DSP chip for the aIMF algorithm.

Figure 2 .
Figure 2. Block diagram of the proposed averaging Intrinsic Mode Function (aIMF) algorithm.
where in state equation (transition) for k > 1, and() kk h y x in measured state (observation) (40)where all states are homogeneous, and the probabilities of transitions and observations are independent of time, see Figure4.

Figure 4 .
Figure 4. HMM showing the transition and observation states.
Mean Square Error (MSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), R 2 , AIC, BIC and Accuracy count which is a sum of the upward and downward movements of all the underlying local signals after they had transited the reversion points.See flowchart of simulation and results in Figure5.

Figure 5 .
Figure 5. Flowchart shows simulation and results of aIMF, WT, EKF and PF.

Figure 6 .
Figure 6.Six sets of sine wave as frequencies in Hz of 1, 5 and 10 with the amplitude of 10% and 20% of exchange rates.

Figure 7 .
Figure 7. Plots of the original datasets, original datasets + noise, and the aIMF algorithm.

Figure 8 .
Figure 8. Plots of the original datasets, original datasets + noise, and the Wavelet Transform (WT) algorithm.

Figure 9 .
Figure 9. Plots of the original datasets, original datasets + noise, and the EKF algorithm.

Figure 10 .
Figure 10.Plots of the original datasets, original datasets + noise, and the Particle Filter (PF) algorithm.

Figure 11 .Figure 11 .
Figure 11.Graphs (a)-(c) show plots of IMF1, IMF5 and IMF7, of which their local extrema of higher order IMF moved toward the normal distribution.

Table 1 .
Performance measurements of original datasets using noise distribution at 100% with 1 Hz and 10% amplitude of the original signals.

Table 2 .
Performance measurements of original datasets using noise distribution at 100% with 1 Hz and 20% amplitude of the original signals.

Table 3 .
Performance measurements of original datasets using noise distribution at 100% with 5 Hz and 10% amplitude of the original signals.

Table 4 .
Performance measurements of original datasets using noise distribution at 100% with 5 Hz and 20% amplitude of the original signals.

Table 5 .
Performance measurements of original datasets using noise distribution at 100% with 10 Hz and 10% amplitude of the original signals.

Table 6 .
Performance measurements of original datasets using noise distribution at 100% with 10 Hz and 20% amplitude of the original signals.

Table 7 .
Performance measurements of original dataset, EUR-JPY, using noise distribution at 10% with 10 Hz and 20% amplitude of the original signals'.

Table 8 .
Performance measurements of original dataset, EUR-CHF, using noise distribution at 10% with 10 Hz and 20% amplitude of the original signals'.

Table 9 .
Performance measurements of original dataset, EUR-GBP, using noise distribution at 10% with 10 Hz and 20% amplitude the original signals'.