Tracking Poisson Parameter for Non-Stationary Discontinuous Time Series with Taylor ’ s Abnormal Fluctuation Scaling

We propose a parameter estimation method for non-stationary Poisson time series with the abnormal fluctuation scaling, known as Taylor’s law. By introducing the effect of Taylor’s fluctuation scaling into the State Space Model with the Particle Filter, the underlying Poisson parameter’s time evolution is estimated correctly from given non-stationary time series data with abnormally large fluctuations. We also developed a discontinuity detection method which enables tracking the Poisson parameter even for time series including sudden discontinuous jumps. As an example of application of this new general method, we analyzed Point-of-Sales data in convenience stores to estimate change of probability of purchase of commodities under fluctuating number of potential customers. The effectiveness of our method for Poisson time series with non-stationarity, large discontinuities and Taylor’s fluctuation scaling is verified by artificial and actual time series.


Introduction
The Poisson process is a basic stochastic process for events that occur at random in various natural and social phenomena, such as the number of decay of radioactive atoms, the occurrence of a failure of elements in devices, and daily sales amount of commodities [1].As the Poisson process is fully described by just one parameter, λ, the mean value of events in a unit time, precise estimation of λ for a given data set is the key to understand the process.
Estimation of λ is an easy task if the data holds stationarity, as λ is given simply by the average number in a unit time.However, for any real system, we encounter the following two difficulties.
The first difficulty comes from non-stationarity.A wide variety of natural and social phenomena exhibit non-stationarity.The fluxes of cosmic radiation, such as active galactic nuclei and X-ray, are non-stationary on long time scales [2].Insect population shows non-stationarity by the factors such as environmental changes and dynamics of the ecological system [3].As for social phenomena, web page visitations, highway traffic [4] and the underlying demand for a commodity [1] changes non-stationarily by various uncontrollable factors like human activity, social trend and environmental condition.Thus, a time series in real systems generally shows complicated non-stationarity which is often unpredictable.Especially, in the case that the time scale of change of λ is short including discontinuous jumps, it is very difficult to estimate λ by the conventional methods such as the Poisson regression method [5].
The second difficulty relates to the abnormally large fluctuations whose standard deviation is proportional to λ, while the width of fluctuation of the Poisson process is known to be given by √ λ.Taylor found the power law [6] between the fluctuation and the mean value in observing the number of individuals in the ecosystem.This fluctuation scaling, known as Taylor's law, has been confirmed in a variety of natural and social phenomena [7], such as biology [8], financial market [9], networks [10], frequencies of word appearances on internet blogs [11] and sales [1].The mechanism of the fluctuation scaling is known as follows.The basic process follows a Poisson process with a constant λ, but there exists fluctuation in the population of the observing objects, that gives the standard deviation of fluctuation proportional to the mean value, λ [1,[10][11][12].This fluctuation scaling is unavoidable in many cases since the total number of the observing objects is practically uncontrollable in the real system.For precise estimation of λ we have to distinguish the variation of λ by this effect from a non-stationary temporal change of λ.However, Taylor's fluctuation scaling has not been attracted attention in the Poisson parameter estimation.
Non-stationary parameter estimation methods have been widely studied, such as Auto Regressive Integrated Moving Average (ARIMA) [13], Neural Networks (NN) [14], which are suitable for non-stationarity with regularity characterized by autocorrelations.Generalized Additive Model (GAM) [15] can reveal the general trend of a non-stationary time series by fitting a smooth curve which is differentiable.The State Space Model (SSM) [16] is applicable for unpredictable non-stationarity with abrupt and indifferentiable changes, therefore in this paper we generalize this method to handle non-stationary Poisson time series with Taylor's fluctuation law.
In Section 2, we explain our method with short reviews of the non-stationary time series analyses to be used in our method.In Section 3, we verify our method by some artificial Poisson time series.In Section 4, we examine the effectiveness for actual data, namely Point-of-Sales (POS) data which shows non-stationarity and the fluctuation scaling with unpredictable sudden jumps.

Non-Stationary Time Series Analysis
A time series analysis method suitable for non-stationary cases, the State Space Model (SSM) [16] is adopted in our method.In this subsection, we explain the SSM.
The SSM assumes that an observation value y t at time t is generated stochastically by an invisible state x t , where x t is stochastically determined depending on x t−1 .
where Φ(A|B) denotes the Probability Density Function (PDF) of A conditioned by B, and the tilde (∼) in the formula, C ∼ Φ(A|B) denotes that C is a stochastic variable following the PDF.
Equation ( 1) is called the system model in the SSM, which describes stochastic time evolution of the state x t .Equation ( 2) is called the observation model in the SSM, which defines the stochastic generation of the observation value from the state.
Non-stationarity of the Poisson mean value λ can be modeled in the system model Equation (1), regarding λ as the state x t .We define the system model as follows: where m, α, β are hyper parameters.N(0, α • x t−1 ) is the Normal distribution with the mean value equals to zero and the standard deviation equals to α • x t−1 (0 < α).U(−β, β) is the truncated Uniform distribution with the range from -β to β (0 < β).m characterizes the ratio of superposition of two distributions, the Normal and truncated Uniform distributions (0 ≤ m ≤ 1).Equation (5) means that the stochastic time evolution of the change of v t generally follows either the Normal distribution with probability 1 − m or the Uniform distribution with range of 2 • β with probability m.The value of β is determined as the range of the Uniform distribution is much larger than the fluctuation of the Normal distribution (α • x t−1 ), that allows a large change of x t .The superposition of two distributions considers the case where x t generally follows the Normal distribution, but sometimes changes up to β with a small probability m.This kind of system model with long tail distribution allows better trackability of a non-stationarity.A system model defined by the superposition of a Normal distribution and a truncated Uniform distribution is already proposed by Yura et al. [17] for non-stationary time series analysis of financial market data.Here, in our model the value of x t is limited to be non-negative by Equation ( 4) as the Poisson parameter λ (or x t ) needs to be positive.The hyper parameters are to be determined in accordance with non-stationarity of Poisson λ for each system.We adopted m = 0.05, α = 0.005 and β = 2.5 • σ t−1 by examining Root Mean Squared Error of λ estimation of some artificial time series, such as stationary, step-like and continuous rising.σ t−1 is Taylor's fluctuation term to be described in the next subsection, Equation (6).The detailed information on the hyper parameters determination is provided in the Appendix A.

Taylor's Fluctuation Scaling Law
The standard deviation of a Poisson process with Taylor's fluctuation scaling, which is caused by the fluctuation in the population of the observing objects, is generally given as follows [18]: Here, the proportional constant γ characterizes the strength of fluctuation of population and its value is determined by examining the relationship between the standard deviation and the mean of an actual data.
As an example, we explain a case of Point-of-Sales (POS) data.The POS data is taken from 243 chain stores of a major convenience store company in Japan during the 153 days from 1 June to 31 October 2010.We obtained daily sales time series of each product in each store from the POS data.
Figure 1 illustrates Taylor's fluctuation scaling of sales data of a product (a soft drink).The solid line indicates σ = λ + (γ • λ) 2 where γ = 0.12, the dot-dash line is σ = √ λ and the dashed line 1.For small mean values we prepare data sets of daily sales numbers of the product for each store with specification by the day of the week (For example, sales number in every Monday).243 (stores) × 7 (days of the week) = 1701 points are plotted.2. For larger mean values we prepare aggregated data sets by random sampling as follows.For each given number of stores, k, we choose k stores at random and the sales numbers are summed up for each day of the week.The value of k is from 2 to 243 and we repeat this procedure 30 times for each k (48,223 points are plotted.) The γ value can be estimated by non-linear regression analysis [19].Specifically, γ is determined by using the least-square method which minimize the difference between the σ obtained by Equation ( 6) and the actual σ in Figure 1.  2 , where γ = 0.12.

SSM of Our Method
We incorporate Taylor's fluctuation scaling term into the SSM.The observation model Equation ( 2) for the Poisson process can be written as follows: where Po(x t ) is the Poisson distribution with the mean value x t .The observation value y t are generated stochastically by the distribution.Here, x t is stochastically determined by Equations ( 3)- (5).
In order to incorporate Taylor's fluctuation scaling term Equation ( 6) as the standard deviation of Equation ( 7), we approximate the Poisson distribution by the Normal distribution in the case that x t is statistically large (i.e., x t ≥ 20).The Normal distribution for x t = 10 and σ = √ 10 is very similar to the Poisson distribution for x t = 10, and for x t ≥ 20, the distributions are virtually indistinguishable [20].We apply the approximation by the normal distribution for x t ≥ 20 since the skewness of the Poisson distribution of x t = 20 is 30% smaller than that of x t = 10, hence, closer to 0, which is the value of skewness of the normal distribution, whereas the ratio of change of the standard deviation by Taylor's fluctuation scaling at x t < 20 is below 10%, where γ = 0.1.
Thus, the likelihood of x t for Equation (7) and Equation ( 8) are written as PDF of each distribution, where

Particle Filter
Our SSM contains non-Gaussian distribution such as the Poisson distribution and the truncated Uniform distribution.The Particle Filter algorithm [21,22] has the capability to solve such a non-Gaussian SSM.
The Particle Filter is a kind of Monte Carlo simulation which approximates a PDF of the state x t as the distribution of Monte Carlo sample values.Owing to the property of the Monte Carlo simulation, an arbitrary distribution function can be represented.The Monte Carlo samples are referred to as the particles in the Particle Filter.The distribution of the particles is updated by the observation value y t for each time t with the likelihood function.The set of the particles at time t conditioned by an observation value at time k is written in the form of {x , where N is the Monte Carlo sample size.The specific procedure of the Particle Filter which corresponds to the simulation code is as follows, i=1 by Equations ( 3)-( 5).(b) Using the observation value y t , estimate likelihood p(y t |x (i) t|t−1 ) for each particle by Equation ( 9) (c) Update particles by resampling with the replacement of {x t for each particle are proportional to the likelihood of each particle.
(d) If time t is the last step, stop the procedure.Otherwise, increment t and go to step 2 (a).
In this research, the number of Monte Carlo samples is set to N = 10,000 and the initial particles are generated by Equations ( 3)-( 5) assuming x −1 = y 0 for simplicity.We obtain a λ value for each time t by calculating the median, instead of mean, of the Monte Carlo samples of x t since its robustness against variations of particle values.
The Markov Chain Monte Carlo (MCMC) method [23] is another candidate to solve our SSM.We selected the Particle Filter since it can solve the SSM stably even with a small amount of data.

Particle Filter with Discontinuity Detection
Our SSM introduce a wide range of Uniform distribution to follow large parameter changes in time series as shown in Equation (5).Moreover, we consider the cases that include discontinuous jump points at which time series value changes more than the range of the Uniform distribution.
Simply widening the range of the Uniform distribution results in reducing the density of the particles and increasing Monte Carlo error.We propose the following discontinuity detection and estimation correction method.
1. Detect a discontinuous point by checking whether the observation value is out of the bound of the particle distribution.2. If a discontinuous point is detected, initialize the particle distribution with the observation value at the discontinuous point.
Using the observation value y t itself at the discontinuous point as the median of the particle distribution leads to overfit the data.We adopt y t − σ t or y t + σ t at the upward or downward discontinuous point respectively.Accordingly, the threshold of discontinuous point detection is set to be the upper bound +σ t or lower bound −σ t of the particle distribution.
We modify the algorithm of the Particle Filter shown in Section 2.4, step 2 (b) as follows,

At time t:
(a) Generate prediction distribution of particles {x t|t−1 } N i=1 by Equations ( 3)-( 5).(b) If the observation value y t is above the upper bound +σ t or below the lower bound −σ t of the prediction distribution, go to step 1 and set x 0 to y t − σ t or y t + σ t at the upward or downward discontinuous point respectively.Otherwise, using the observation value y t , estimate likelihood p(y t |x t|t−1 ) for each particle by Equation ( 9) and go to next step . (c) Update particles by resampling with the replacement of {x . The resampling probabilities ρ (i) t for each particle are proportional to the likelihood of each particle.
(d) If time t is the last step, stop the procedure.Otherwise, increment t and go to step 2 (a).

Summary of the Parameter Estimation Procedure
We summarize the parameter estimation procedure.Steps 1 and 2 are preparations, and step 3 is the actual parameter estimation step.
1. Estimate the proportional constant γ of Taylor's fluctuation scaling by some previous data, as explained in Section 2.2. 2. Determine the hyper parameters in the system model Equation ( 5) in SSM, by using some artificial time series as shown in Appendix.3. Apply the Particle filter described in Sections 2.4-2.5 to solve the SSM, and estimate Poisson λ parameter of a time series.Poisson λ parameter is estimated basically based on the likelihood for an observation data at each time t.When the discontinuous trend jump is detected as explained in Section 2.5, the value of λ is estimated based on the observation data.No extrapolation is done, namely, we do not estimate the parameter where there is no data.

Simulation Tests
In this section, we verify the validity of our method with artificial time series generated by random number simulation.The distribution of random numbers is the Poisson distribution for small λ, i.e., λ < 20, otherwise the Normal distribution with Taylor's fluctuation scaling term Equation ( 6) where γ = 0.1.The γ value 0.1 is typical for the Point-of-Sales data used in this research [1].Random numbers with a specific distribution can be obtained by the inversion method [24].

Validity for Non-Stationarity
First, we demonstrate our method without discontinuity point detection, which is described in Section 2.5. Figure 2a shows a continuous rising time series from λ = 20 to λ = 200 in gray line.In this figure, the assumed values of λ and the estimated values of λ are shown in dotted line and black line respectively.We can confirm that the estimated values of λ follows the non-stationary continuous rising trend of the assumed values of λ.The Root Mean Squared Error (RMSE) between the assumed values of λ (λ A ) and the estimated values of λ (λ E ), standardized by λ A at each time t (1 ≤ t ≤ N) as shown in Equation ( 12), is 7.74%.This delay is caused by the large discontinuity of the time series which is not covered by the range of the prediction distribution, that is, the range of the truncated uniform distribution in Equation (5).By introducing the discontinuous point detection as mentioned in Section 2.5, the delay is corrected as shown in Figure 3.The RMSE λ is improved to be 6.55%.The result is shown in Figure 4b.The plots are along the relationship of Taylor's fluctuation scaling illustrated in black solid line.This ensures the validity of our model for Taylor's fluctuation scaling.The RMSE between the theoretical standard deviation (σ T ) by Equation ( 6) and the estimated standard deviation (σ E ), standardized by σ T at each time t(1 ≤ t ≤ N) as shown in Equation ( 13), is 8.71%.

Validity for Taylor's Fluctuation Scaling
To clarify the advantage of our model to the conventional Poisson model, we performed an λ estimation of the same time series as in Figure 4a with the conventional Poisson model, that is, with Equation ( 7) instead of Equation ( 8).The results are shown in Figure 4c,d Along with the step-like time series, we examined the case of a continuous rising time series.Figure 5a,b are the estimated results by our method with Taylor's fluctuation scaling term.The RMSE σ is 8.65%.Figure 5c,d

Point-of-Sales (POS) Data Tests
We verified our method by actual data, namely, the POS data described in Section 2.2.The POS data is a typical example of the non-stationary Poisson time series with Taylor's fluctuation scaling [1].
Figure 6a shows a daily sales number series of a product (an ice cream) at a store.One can see the arch-shaped trend in the sales time series in gray line, and the estimated values of λ in black line follows the trend.Figure 6b clarifies that the relationship between the mean and the standard deviation of the estimated values of λ values are along the relationship derived from Taylor's fluctuation scaling Equation (6).The RMSE σ is 16.56%.Figure 7a indicates a sales time series of a product (a meat bun) of a store which shows sudden rises at around time 10 and 52.The estimated values of λ in black line well follows the sudden rises.Figure 7b shows the estimated values of λ are along the relationship of Taylor's fluctuation scaling.The RMSE σ is 18.74%.  , where γ = 0.12 (a fried item) [1].
Figure 7c,d show the estimation results of the same time series in Figure 7a, without using the correction method at discontinuous points.We can find that the estimated values of λ in black line are underfitted to the time series in gray line at the large discontinuous points.The RMSE σ is 337.62%.Thus, the merit of the correction method at discontinuous points is verified with the actual time series.

Conclusions
The effectiveness of our method of tracking the Poisson parameter for abnormally large fluctuation time series with non-stationarity is verified one by one with artificial time series.We revealed the advantage of our method considering the term of Taylor's fluctuation scaling to the conventional Poisson model, namely, our model distinguishes non-stationarity of the parameter from the large fluctuation caused by this fluctuation scaling.We also presented the method of detecting discontinuity of the time series.By correcting estimation values at the discontinuous point in time series, we realized precise parameter estimation with non-stationary time series with large discontinuity.Along with some artificial time series, we verified the effectiveness of our method by some actual Point-of-Sales data which shows non-stationarity, large discontinuities and Taylor's fluctuation scaling.
We assume the case that Taylor's fluctuation scaling is generally caused by the fluctuation in the population of the observing objects, which leads the standard deviation to be proportional to Poisson mean value λ, namely, the scaling exponent becomes 1.The scaling exponent 1 is observed in many natural and social phenomena, such as cell numbers of species [25], fluxes of cosmic radiation [2], daily water-level of a river [26], web page visitations [4], highway traffic [4] and sales [1].
Considering effects other than fluctuation in the population, there are cases that the Taylor's scaling exponent takes a value between 0.5 and 1.0 as we can find examples in POS data on stamps and magazines [1].Stamps are often purchased in bulk, and magazines are specifically bought on the launch day.The purchase processes of these items are not regarded as a simple Poisson process, that leads to the scaling exponent under 1.The extension of the proposed model for more general Taylor's fluctuation scaling is a potential future research task.
Periodicity in the time series, such as seasonality and circadian variation behind the daily data, are not considered in the present model.In practical applications in the future, applying the method to some cases such as trend forecast, an extension of our method taking into account such periodicity may yield better accuracy of the parameter estimation.
Our method deals with parameter estimation of a non-stationary Poisson time series of number of events observing with regular intervals in the case that the population is also changing randomly.Meanwhile, there exist methods based on Poisson point process suitable for describing events in discrete points in time.The Spatial Mixed Poisson Process(SMPP) [27][28][29][30][31] is such an example, which can handle the case that the underlying random event is correlated to the observing event that is stochastically determined.The SMPP can model the delay of the impact from the underlying event to the observing event since both events are described separately in discrete points in time, which is suitable for application to rare but correlated events such as occurrence of cascading crashes in financial markets.There are wide variety of phenomena both in nature and in our society, which are considered to be described basically by a variant of Poisson process.More different approaches will appear in the near future to cope with the increasing big data in the society.reasonable because small m, α and β lead to a narrow prediction distribution and less sensitivity to variations in a stationary time series.
The results for step-like change shown in Figure A2 are almost opposite parameter dependencies to Figure A1 since a wide prediction distribution contributes to sensitiveness to the change.One should note that a prediction distribution wider than necessary results in large RMSE because of the Monte Carlo error.The results for continuous rise in Figure A3 shows generally similar parameters dependence as Figure A2, in the cases of too small or too large m, α and β result in large RMSE.
We adopted m = 0.05, α = 0.005 and β = 2.5 • σ t−1 , marked with yellow rectangles in Figures A1-A3 , since these are one of the balanced condition giving small RMSE for three types of time series examined.

Figure 1
is obtained by the following procedures.

Figure
Figure 2b indicates a step-like time series which increases from λ = 20 to 200 in gray line.The estimated values of λ in black line generally follows the abrupt step-like change of the assumed

Figure 2 .
Figure 2. Estimation results of two types of artificial non-stationary time series using the particle filter without discontinuous detection shown in Section 2.4.(a) Continuous rise time series from λ = 20 to 200.(b) Step-like time series from λ = 20 to 200.Gray line: artificial time series, Black dotted line: assumed values of λ, Black solid line: estimated values of λ.

Figure 3 .
Figure 3. Correction of estimated values of λ at the discontinuous point using the particle filter with discontinuous detection shown in Section 2.5.Gray line: artificial time series, Black dotted line: assumed values of λ, Black solid line: estimated values of λ.

Figure
Figure4ashows a step-like time series from λ = 10 to 640 increased by two times every 30 data points which is shown in gray line, and the estimated values of λ in black line.The RMSE λ is 8.66%.

Figure 4 .
Figure 4. Verification for Taylor's fluctuation scaling with step-like time series from λ = 10 to 640.(a) Black solid line: estimated values of λ with our model, Gray line: artificial time series, Black dotted line: assumed values of λ (b) The mean and the standard deviation plots of (a).(c) Black solid line: estimated values of λ with conventional Poisson model, Gray line: artificial time series, Black dotted line: assumed values of λ.(d) The mean and the standard deviation plots of (c).In (b,d), Dot-dash line: σ = √ λ, Dashed line: σ = γ • λ, Solid line: σ = λ + (γ • λ) 2 , where γ = 0.1.
. One can see that the estimated values of λ time series in Figure 4c follows the fluctuation of the time series excessively.The conventional Poisson model regards large fluctuation caused by Taylor's fluctuation scaling as variation of λ. Figure 4d illustrates that the Poisson model estimates the fluctuation as √ λ in dot-dash line, while the true fluctuation is Equation (6) in solid line.The RMSE σ is 33.73%.
are the results by the conventional Poisson model.The RMSE σ is 31.71%.These results also demonstrate the effectiveness of our method for estimating the Poisson λ with Taylor's fluctuation scaling.

Figure 5 .
Figure 5. Verification for Taylor's fluctuation scaling with continuous rising time series from λ = 10 to 600.(a) Black solid line: estimated values of λ with our model, Gray line: artificial time series, Black dotted line: assumed values of λ.(b) The mean and the standard deviation plots of (a).(c) Black solid line: estimated values of λ with conventional Poisson model, Gray line: artificial time series, Black dotted line: assumed values of λ.(d) The mean and the standard deviation plots of (c).In (b,d), dot-dash line: σ = √ λ, dashed line: σ = γ • λ, solid line: σ = λ + (γ • λ) 2 , where γ = 0.1.

Figure
Figure 6c,d show the estimation results of the same time series in Figure 6a, with the conventional Poisson model.The Poisson model results in overfitting of the λ to the time series as shown in Section 3.2.The RMSE σ is 35.40%.The advantage of incorporating the term of Taylor's fluctuation scaling in precise estimation of the Poisson parameter is confirmed with this actual data.Figure7aindicates a sales time series of a product (a meat bun) of a store which shows sudden rises at around time 10 and 52.The estimated values of λ in black line well follows the sudden rises.Figure7bshows the estimated values of λ are along the relationship of Taylor's fluctuation scaling.The RMSE σ is 18.74%.

Figure 7 .
Figure 7. Estimation results of a sales time series with discontinuity.(a) Gray line: sales time series, Black line: estimated values of λ with our model.(b) The mean and the standard deviation plots of (a).(c) Gray line: sales time series, Black line: estimated values of λ with our model without the correction method at discontinuous points.(d) The mean and the standard deviation plots of (c).In (b,d), dot-dash line: σ = √ λ, dashed line: σ = γ • λ, solid line: σ= λ + (γ • λ)2 , where γ = 0.12 (a fried item)[1].

Figure A1 .
Figure A1.RMSE result of stationary time series.(a) An example of the stationary time series.Gray line: artificial time series, Black dotted line: assumed values of λ.(b) m = 0.05, where m is the ratio of the Uniform distribution to the Normal distribution, (c) m = 0.10, (d) m = 0.15.The yellow square shows the parameter sets we adopted in our simulation and real data analysis.

Figure A2 .
Figure A2.RMSE result of non-stationary, step-like time series.(a) An example of the step-like time series.Gray line: artificial time series, Black dotted line: assumed values of λ.(b) m = 0.05, (c) m = 0.10, (d) m = 0.15.The yellow square shows the parameter sets we adopted in our simulation and real data analysis.

Figure A3 .
Figure A3.RMSE result of non-stationary, continuous rising time series.(a) An example of the continuous rising time series.Gray line: artificial time series, Black dotted line: assumed values of λ.(b) m = 0.05, (c) m = 0.10, (d) m = 0.15.The yellow square shows the parameter sets we adopted in our simulation and real data analysis.