Change Point Detection for Diversely Distributed Stochastic Processes Using a Probabilistic Method

Abstract: Unpredicted deviations in time series data are called change points. These unexpected changes indicate transitions between states. Change point detection is a valuable technique in modeling to estimate unanticipated property changes underlying time series data. It can be applied in different areas such as climate change detection, human activity analysis, medical condition monitoring and speech and image analyses. Supervised and unsupervised techniques are equally used to identify changes in time series. Even though change point detection algorithms have improved considerably in recent years, several unaddressed challenges remain. Previous work on change point detection was limited to specific areas; therefore, more studies are required to investigate appropriate change point detection techniques applicable to any data distribution to assess the numerical productivity of any stochastic process. This research is primarily focused on the formulation of an innovative methodology for change point detection of diversely distributed stochastic processes using a probabilistic method with variable data structures. Bayesian inference and a likelihood ratio test are used to detect a change point at an unknown time (k). The likelihood of k is determined and used in the likelihood ratio test. The parameter change is evaluated by critically analyzing the parameter expectations before and after the change point. Real-time data of particulate matter concentrations at different locations were used for numerical verification, due to diverse features, that is, environment, population densities and transportation vehicle densities. Therefore, this study provides an understanding of how well the recommended model could perform for different data structures.


Introduction
Unexpected deviations in time series data are called change points. These sudden changes indicate transitions between states. Change point detection is worthwhile in modeling to estimate unexpected property changes underlying time series data. It is applicable in different areas such as climate change detection, human activity analysis, medical condition monitoring and speech and image analyses. Supervised and unsupervised techniques are equally used to identify changes in time series. Even though change point detection algorithms have improved considerably in recent years, several unaddressed challenges remain [1].
Several techniques have been recommended for the identification of undocumented change points in climate data sequences [2]. A change-point analysis technique has been described and its potential applications have been highlighted through a number of examples [3]. The kernel-based change point (KCP) detection procedure can only be used to detect a particular type of change; therefore, based on the Gaussian KCP method, a new nonparametric approach called KCP-corr was proposed for predicting correlation changes. KCP-corr performs better than the CUSUM technique, which specifically aims to identify correlation changes [4]. A generalized likelihood ratio test (GLRT) was used for detecting changes in the mean of a one-dimensional Gaussian process [5]. A new method was recommended for detecting a change point in the time-dependent diffusion coefficient of a fractional Brownian motion [6]. A production inventory model with probabilistic deterioration was developed in two-echelon supply chain management [7]. A two-stage change point detection technique for machine monitoring was suggested [8]. A Bayesian approach was used for change point detection of polluted days [9].
A statistical change point algorithm was proposed in which a direct density ratio estimation technique was used for nonparametric deviation measurement among time series samples through relative Pearson divergence with variable data structures [10]. An innovative statistical approach for online change point detection was recommended in which the estimation method could also be updated online [11]. An economic production quantity model with stochastic demand was developed for an imperfect production system [12]. For a change point test in a series, the Karhunen-Loève expansion of the limit Gaussian processes was recommended [13]. A test for sudden changes in random fields was presented as a Cramér-von Mises type test based on Hilbert space theory [14]. An integrated inventory model was developed to determine the optimal lot size and production uptime while considering stochastic machine breakdown and multiple shipments for a single buyer and single vendor [15]. A new methodology was introduced for the identification of structural changes in linear quantile regression models, because the conventional mean regression technique is not appropriate for identifying such structural changes at the tails [16]. A supply chain model with stochastic lead time, trade-credit financing and transportation discounts, in which the supplier offers a trade-credit period to the buyer, was developed to build a coordination mechanism among transportation discounts, trade-credit financing, number of shipments, product quality improvement and reduced setup cost so that the total cost of the whole system can be reduced [17]. The fuzzy classification maximum likelihood change point (FCML-CP) algorithm was suggested for the detection of simultaneous multiple change points in the mean and variance of a process, and it reduces analysis time [18]. For sequential data series, a Bayesian change point algorithm was presented, but it had unreliable restrictions on the number of change points and their locations [19].
The Bayesian change point detection (BCPD) technique suggested in this research paper can overcome the challenges in identifying the location and number of change points because of its probabilistic basis. The methodology is based on posterior distributions and a likelihood ratio test to deduce whether a change point has occurred, and it can update itself linearly as new data points are observed. Monitoring the posterior distribution is an effective way to identify the presence of a new change point in the observed data. Simulation studies illustrate that this algorithm detects existing change points rapidly while maintaining a low rate of false detection [19]. Previous work on change point detection was limited to specific areas; therefore, more studies are required to investigate appropriate change point detection techniques that are applicable to any data distribution to assess the numerical productivity of any stochastic process. This research is primarily focused on the formulation of an innovative methodology for change point detection of diversely distributed stochastic processes by a probabilistic method with variable data structures. The parameter expectations before and after the change point must be critically analyzed so that the parameter change can be evaluated. Bayesian inference and the likelihood ratio test are used to detect a change point at an unknown time (k).
Real-time data of particulate matter concentrations at different sites were used to validate the proposed approach. An investigation of particulate matter (PM) pollution status was conducted to evaluate the long-term trends in Seoul, which show a decreasing trend during the study period 2004-2013 [20]. The long-term behavior of particulate matter at urban roadside and background locations in Seoul, Korea was analyzed, and the mean PM values exhibit a slight fall over the decade [21]. A probabilistic method was used to comprehensively analyze the change point (k), the parameters before the change point (µ_1, µ_2, ..., µ_n) and the parameters after the change point (η_1, η_2, ..., η_n). Hence, simulation models were built on diverse data structures from different areas to consider different features, that is, environment, population densities and transportation vehicle densities. Therefore, this study provides insight into how well the suggested model could perform in different areas. The paper is organized as follows: Section 2 discusses a literature review regarding Bayesian change point detection, while Section 3 presents the problem definition, explains the assumptions and notation and demonstrates the formulation of the mathematical models. Sections 4 and 5 depict a real-world application of the model and the results that validate the practical application of the proposed models. Section 6 discusses the results for each area; finally, Section 7 presents the conclusions of this study.

Related Literature
A basic literature review of Bayesian change point methodology was performed. An approach was proposed to detect changes in a non-homogeneous Poisson process; it was used to detect whether a change in event rate has occurred, the time of the change and the event rate before and after the change [22]. A novel Bayesian approach was suggested to detect abnormal regions in multiple time series; a model was built in which independent samples from the posterior distribution were used to draw Bayesian inference. This approach was evaluated on simulated CNVs (copy number variations) and real data to confirm that the methodology is more accurate than other methods [23]. An economic manufacturing quantity model with probabilistic deterioration was developed for a production system [24]. A comparison of the Expectation Maximization (EM) method and the Bayesian method for change point detection of multivariate data was conducted; the Bayesian technique involves less computational work, while EM performs better for unsuitable priors and minor changes [25]. A min-max distribution-free continuous-review model was presented with a service level constraint and variable lead time [26]. A Bayesian change point detection model was recommended to identify flooding attacks in VoIP systems in which the Session Initiation Protocol (SIP) is used as a signaling mechanism [27].
To acquire accurate and reliable change detection maps for land cover monitoring, a new post-classification methodology with iterative slow feature analysis (ISFA) along with Bayesian soft fusion was proposed. This methodology includes three steps: first, defining the class probability of the images; then, deriving a continuous change probability map; and last, computing the posterior probabilities for the class arrangements of coupled pixels [28]. An economic production quantity model was developed with random defective rate, rework process and backorders for a single-stage production system [29]. A Bayesian change point detection methodology was developed to analyze biomarker time series data in women for earlier diagnosis of ovarian cancer [30]. A method for the approximation of digital planar curves with line segments and circular arcs using genetic algorithms was proposed [31]. The Generalized Extreme Value (GEV) fused lasso penalty function was used to detect change points for annual maximum precipitation (AMP) in South Korea; a comparison between the GEV fused lasso and Bayesian change point analysis showed that the GEV fused lasso method should be used when water resource structures are hydrologically designed [32]. Mathematical models were developed for work-in-process-based inventory by incorporating the effect of random defect rates on lot size and the expected total cost function [33]. An innovative Bayesian approach was suggested to detect change points in extreme precipitation data, with the model based on a generalized Pareto distribution; four different situations were analyzed: first with no change, second with a shape change, third with a scale change and fourth with both shape and scale changes [34]. See Table 1 for a comparison of the studies of different authors and for the difference between previous works and this work.

Problem Definition
This research is primarily focused on the formulation of a unique methodology for change point detection of diverse data structures following any kind of distribution at any unknown time (k) in any area across the globe. The existing procedures for change point detection are either very complicated or not applicable to stochastic processes and random time series. Therefore, a more precise, well-defined and easily applicable approach for change point detection of stochastic processes and random time series has been proposed. Second, an analysis of these changes needs to be conducted to determine whether or not the change points are favorable; for this, a comparison of the distribution parameters before and after a change point has to be performed to evaluate the change in question. Third, the alteration in parameter expectations must be measured to define new policies for further improvements in the current states. For the anticipated goals, the probabilistic method will be used to determine the posterior probabilities of the data, and the change point in the Bayesian model will be identified through the likelihood ratio test. The suggested model is numerically validated using real-time data of particulate matter concentrations and particulate matter hazards in different areas of Seoul, South Korea, observed from January 2004 to December 2013. The change point (k) for particulate matter (PM 2.5 and PM 10) daily concentrations, the parameters before the change point (µ_1, µ_2, ..., µ_n) and the parameters after the change point (η_1, η_2, ..., η_n) are comprehensively analyzed. The central idea of using different regions is their considerably different features, that is, environment, population densities and transportation vehicle densities. Hence, this study can also be the basis for the implementation of the recommended model in different areas. Later, this probabilistic method is verified by the CUSUM approach, and the results of the two methods are compared.
1. The probabilistic method is based on probability distributions and can be applied to any data distribution. In this case, first define the data distribution and then apply the proposed method to attain the results. This methodology is better suited to random data structures and time series.
2. The CUSUM approach is directly applicable to the raw data, which is good for deterministic data structures.

Notations
The list of notations used to represent the random variables and parameters is as follows.

Assumptions
The following assumptions were used for the proposed model:

1. Y represents the random data at a given time t, and this random data series is distributed over the state space y ∈ {1, 2, ..., n}, which can be any random value.

2. Y(0) = 0 means that no event occurred at time t = 0, while the time series random data are observed at intervals of equal length.

3. The random data structure follows a specific probability distribution function in any interval of length (t), resulting in a random variable with parameters (µ_1, µ_2, ..., µ_n).

Formulation of Change Point Detection Model
The probability distribution function of a random variable Y with parameters (µ_1, µ_2, ..., µ_n) at any specific point y is written p(y | µ_1, µ_2, ..., µ_n). After defining the probability distribution function of the random process Y, divide the process into two segments: the first segment describes the process before the change point and the second describes the process after it. Let the change point in the random process Y be denoted by k, let (µ_1, µ_2, ..., µ_n) be the random variable parameters before the change point k, and let (η_1, η_2, ..., η_n) be the random variable parameters after the change point k.
For independent observations, the joint probability function is the product of the marginal probability functions. If the random variable Y = y_i with parameters (µ_1, µ_2, ..., µ_n) is modeled, the joint probability function of the sample data follows accordingly. A class of prior densities is conjugate for the likelihood/sampling model p(y_i|µ_1) if the posterior probability distribution is in the same class. Therefore, the prior distribution p(µ_1) and the posterior distribution p(µ_1|y_i) will belong to the same conjugate family as the likelihood/sampling model p(y_i|µ_1), while the likelihood p(y_i|µ_1) follows a distribution determined by the data. Bayes' theorem can be used to determine the posterior probability:

Posterior probability ∝ Prior probability × Likelihood

Bayesian inference for multiple unknown parameters is not conceptually different from the one-parameter case. For any joint prior distribution p(µ_1, µ_2, ..., µ_n), posterior inference proceeds using Bayes' rule. The inference for this multi-parameter model can be broken down into multiple one-parameter problems. First, make an inference for µ_1 when the remaining parameters (µ_2, ..., µ_n) are known, using a conjugate prior distribution for µ_1. For any (conditional) prior probability p(µ_1|µ_2, ..., µ_n), the posterior parameters combine the prior parameters with terms from the data:

Posterior information = Prior information + Data information

Just as the prior distribution for µ_1 and (µ_2, ..., µ_n) can be decomposed as p(µ_1, µ_2, ..., µ_n) = p(µ_1|µ_2, ..., µ_n) p(µ_2, ..., µ_n), the posterior distribution can be similarly decomposed:

p(µ_1, µ_2, ..., µ_n | y_1, y_2, y_3, ..., y_n) = p(µ_1 | y_1, ..., y_n, µ_2, ..., µ_n) p(µ_2, ..., µ_n | y_1, ..., y_n).

Similarly, after a change point,

p(η_1, η_2, ..., η_n | y_{k+1}, y_{k+2}, y_{k+3}, ..., y_n) = p(η_1 | y_{k+1}, ..., y_n, η_2, ..., η_n) p(η_2, ..., η_n | y_{k+1}, ..., y_n).

The conditional distribution of µ_1 given (µ_2, ..., µ_n) and the data (y_1, y_2, y_3, ..., y_n) was obtained above. The posterior distribution of the remaining parameters (µ_2, ..., µ_n) can be found by integrating over the unknown value of µ_1. The change point for the random process Y is detected by the likelihood ratio test (LRT). The LRT begins with a comparison of the likelihood scores of two models: a null model and an alternative model. The test is based on the likelihood ratio, which states how many times more likely the data are under one model than the other; this ratio is compared to a critical value to decide whether to reject the null model.
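As a concrete illustration of the conjugate updating described above, the sketch below applies the rule "posterior information = prior information + data information" to a Poisson sampling model with a Gamma prior. The Poisson/Gamma pair is an assumption chosen for simplicity; the paper's formulation is written for any distribution with a conjugate family.

```python
# Sketch of conjugate Bayesian updating. Assumption: a Poisson likelihood
# with a Gamma(a, b) prior on the rate, one of the standard conjugate
# pairs; the paper's general treatment covers other distributions too.

def gamma_poisson_posterior(y, a, b):
    """Posterior hyperparameters after observing counts y.

    Prior:      rate ~ Gamma(a, b)   (shape a, rate b)
    Likelihood: y_i  ~ Poisson(rate)
    Posterior:  rate ~ Gamma(a + sum(y), b + len(y))
    """
    return a + sum(y), b + len(y)

# Posterior information = prior information + data information:
a_post, b_post = gamma_poisson_posterior([4, 6, 5, 7], a=1, b=1)
posterior_mean = a_post / b_post  # point estimate of the rate
```

The posterior hyperparameters simply accumulate the sufficient statistics of the data onto the prior, which is what makes the online updating mentioned above linear in the number of observations.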
The likelihood ratio test for change point k given the random variable Y and parameters before and after change points is as follows:

Multiple Change Point Detection
After detecting the first change point k, the data can be broken into two distinct segments, one on each side of the change point: 1 to k and k + 1 to n. The same procedure described above is then applied to each segment separately to detect multiple change points in the random process Y.
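This detect-then-split procedure can be sketched as below. Poisson segment likelihoods and a fixed threshold on the test statistic are illustrative assumptions; in practice the critical value would come from the reference distribution of the test.

```python
import math

def poisson_loglik(y, rate):
    # Log-likelihood of counts y under Poisson(rate); constant terms kept.
    if rate <= 0:
        return float("-inf")
    return sum(yi * math.log(rate) - rate - math.lgamma(yi + 1) for yi in y)

def lrt_change_point(y):
    """Return (k, 2*logLR) for the most likely change point in y.

    The null model fits one rate to all of y; the alternative fits separate
    rates before and after each candidate k (MLE = segment mean).
    """
    n = len(y)
    null = poisson_loglik(y, sum(y) / n)
    best_k, best_stat = None, float("-inf")
    for k in range(1, n):            # split after the k-th observation
        left, right = y[:k], y[k:]
        alt = (poisson_loglik(left, sum(left) / k)
               + poisson_loglik(right, sum(right) / (n - k)))
        stat = 2 * (alt - null)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat

def segment(y, offset=0, threshold=10.0):
    """Binary segmentation: recurse on both sides of each accepted change."""
    if len(y) < 4:
        return []
    k, stat = lrt_change_point(y)
    if stat < threshold:             # illustrative critical value
        return []
    return (segment(y[:k], offset, threshold)
            + [offset + k]
            + segment(y[k:], offset + k, threshold))
```

Calling `segment` on the full series returns all accepted change points; each recursion repeats the single-change LRT on the sub-segments, exactly as described above.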

Convergence of the Parameters
A single simulation run cannot capture the real features of the resulting model. Therefore, the Gelman-Rubin convergence diagnostic is used to estimate the steady-state parameters by running multiple sequences of the chain. A lack of convergence can be detected by comparing multiple sequences but not by looking at a single sequence. Therefore, multiple sequences of the chain are run to estimate the actual characteristics of the target distribution, Gelman and Rubin [35][36][37]. m replications of the simulation (m ≥ 10) are performed, each of length n = 1000. If the target distribution is unimodal, then Cowles and Carlin recommend running at least 10 chains [38]. The mean pollutant concentration is the parameter of interest and is denoted by V.
Scalar summary: V = mean of the chain (average daily pollutant concentration). Let V_hj be the jth observation from the hth replication, that is, a single observation of the mean pollutant concentration per day, where the replication number is h ∈ {1, 2, ..., m} and the observation number within a replication is j ∈ {1, 2, ..., n}.
First, the mean of the hth replication is computed. The between-sequence variance B represents the variance of the means of the m replications. The variance within each replication is then calculated, and the within-sequence variance W is the mean of these variances over the m replications. Finally, the within-sequence and between-sequence variances are combined to obtain an overall estimate of the variance of V in the target distribution:

Var(V) = (1 − 1/n) W + (1/n) B.

The factor √R (the estimated potential scale reduction) is the ratio between the upper and lower bounds on the standard deviation of V, and Var(V) can be reduced through a larger number of iterations. Further iterations of the chain must be performed if the potential scale reduction is high. The replications are run for all scalar summaries until R is lower than 1.1 or 1.2.
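A minimal sketch of the computation just described; in practice a library such as ArviZ would be used, but the arithmetic reduces to the between- and within-sequence variances:

```python
def gelman_rubin(chains):
    """Potential scale reduction factor sqrt(R) for m chains of length n.

    B: between-sequence variance of the chain means,
    W: mean within-sequence variance,
    Var(V) = (1 - 1/n) W + (1/n) B,  sqrt(R) = sqrt(Var(V) / W).
    """
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    B = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    W = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    var_hat = (1 - 1 / n) * W + B / n
    return (var_hat / W) ** 0.5
```

Chains sampling the same distribution give a factor near 1 (below the 1.1-1.2 threshold above), while chains stuck in different regions give a large factor, signaling that more iterations are needed.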

Flowchart Algorithm
The flowchart for change point (k) detection for any random process Y is as follows: start/initialize the model; define the time series or stochastic process Y and its probability distribution function; estimate the prior and posterior hyperparameters for all parameters (µ_1, µ_2, ..., µ_n); and apply Bayes' theorem to determine the posterior probabilities of all parameters and hyperparameters. In addition, a change point analysis was performed using a combination of CUSUM (cumulative sum control chart) and bootstrapping for comparative analysis.

The CUSUM Technique
CUSUM is a sequential analysis technique typically used for change detection in monitoring applications. CUSUM charts are constructed by calculating and plotting a cumulative sum based on the data. The cumulative sums are calculated as follows.
1. First, calculate the average: ȳ = (y_1 + y_2 + y_3 + ... + y_n)/n.
2. Start the cumulative sum at zero by setting S_0 = 0.
3. Calculate the remaining cumulative sums by adding the difference between the current value and the average to the previous sum, that is, S_i = S_{i−1} + (y_i − ȳ).

The cumulative sum is not the sum of the values; it is the cumulative sum of the differences between the values and the average. Because the average is subtracted from each value, the final cumulative sum must be zero. Some practice is required to interpret a CUSUM chart. An upward trend in the CUSUM chart indicates a period of time when the values are above the overall average, because the differences added to the cumulative sum are positive. Similarly, a downward trend indicates a period when the values are below the overall average. A sudden change in the direction of the CUSUM indicates a shift or change in the average, while a straight-line stretch of the chart indicates a period with no change in the average.
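The steps above can be sketched as a short routine:

```python
def cusum(y):
    """Cumulative sums of deviations from the overall average.

    S_0 = 0 and S_i = S_{i-1} + (y_i - ybar); the final sum returns to
    zero because the deviations from the mean sum to zero.
    """
    ybar = sum(y) / len(y)
    S = [0.0]
    for yi in y:
        S.append(S[-1] + (yi - ybar))
    return S
```

Plotting the returned list gives the CUSUM chart: rising stretches correspond to values above the average, falling stretches to values below it, and a sharp turn marks a shift in the average.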

Bootstrap Analysis
Bootstrap analysis can be performed to determine the confidence level of an apparent change. The magnitude of the change, S_diff = S_max − S_min, where S_max = max_i S_i and S_min = min_i S_i, must be estimated before performing the bootstrap analysis.
Once the estimator of the magnitude of the change has been selected, the bootstrap analysis can be performed.A single bootstrap is performed as follows.
1. Generate a bootstrap sample of n units, denoted y⁰_1, y⁰_2, y⁰_3, ..., y⁰_n, by randomly reordering the original n values. This is called sampling without replacement.
2. Based on the bootstrap sample, calculate the bootstrap CUSUM, denoted S⁰_0, S⁰_1, S⁰_2, ..., S⁰_n.
3. Calculate the maximum, minimum and difference of the bootstrap CUSUM, denoted S⁰_max, S⁰_min and S⁰_diff.
4. Determine whether the bootstrap difference S⁰_diff is less than the original difference S_diff.
The idea behind bootstrapping is that the bootstrap samples represent random reorderings of the data that mimic the behavior of the CUSUM if no change has occurred. By performing a large number of bootstrap samples, the variation in S_diff under the no-change hypothesis can be estimated and compared with the S_diff value calculated from the data in its original order; if the bootstrap CUSUM charts tend to stay closer to zero than the CUSUM of the data in its original order, a change likely occurred. A bootstrap analysis consists of performing a large number of bootstraps and counting the number of bootstraps for which S⁰_diff is less than S_diff. Let N be the number of bootstrap samples performed and let X be the number of bootstraps for which S⁰_diff < S_diff. Then, the confidence level that a change occurred, as a percentage, is

Confidence = 100 × X/N %.

A high confidence level is strong evidence that a change occurred. Ideally, one would estimate the distribution of S⁰_diff over all possible reorderings of the data instead of bootstrapping, but this is usually not feasible; therefore, the number of bootstrap samples should be increased for better estimation. Bootstrapping is a distribution-free methodology with only one assumption, namely an independent error structure. Change-point analysis and control charting are both based on the mean-shift model. Let y_1, y_2, y_3, ..., y_n represent the data in time order. The mean-shift model can be written as

y_i = µ_i + ε_i,

where µ_i is the average at time i. Generally µ_i = µ_{i−1}, except for a small number of values of i called the change points, and ε_i is the random error associated with the ith value, assumed to be independent with a mean of zero. Once a change has been detected, an estimate of the time at which the change occurred can be made. One such estimator is the CUSUM estimator: let m be such that |S_m| = max_i |S_i|, that is, S_m is the point furthest from zero in the CUSUM chart. The point m estimates the last point before the change occurred, and the point m + 1 estimates the first point after the change.
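The bootstrap procedure and the CUSUM estimator of the change time can be sketched as follows; the number of bootstrap samples and the seed are illustrative choices:

```python
import random

def cusum(y):
    # Cumulative sums of deviations from the overall average, S_0 = 0.
    ybar = sum(y) / len(y)
    S = [0.0]
    for yi in y:
        S.append(S[-1] + (yi - ybar))
    return S

def bootstrap_confidence(y, n_boot=1000, seed=0):
    """Confidence (%) that a change occurred, via CUSUM bootstrapping.

    Each bootstrap reorders y without replacement, recomputes the CUSUM,
    and counts how often its range stays below the original S_diff.
    """
    rng = random.Random(seed)
    S = cusum(y)
    s_diff = max(S) - min(S)
    count = 0
    sample = list(y)
    for _ in range(n_boot):
        rng.shuffle(sample)          # sampling without replacement
        Sb = cusum(sample)
        if max(Sb) - min(Sb) < s_diff:
            count += 1
    return 100.0 * count / n_boot

def change_point_estimate(y):
    """CUSUM estimator: index m where |S_m| is furthest from zero."""
    S = cusum(y)
    return max(range(len(S)), key=lambda i: abs(S[i]))
    # m is the last point before the change; m + 1 is the first after it
```

A series with a genuine mean shift yields a confidence near 100%, because almost no reordering reproduces a CUSUM excursion as large as the original.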

Mean and Variance Estimation
Once a change has been detected, the data can be broken into two segments, one on each side of the change point: 1 to m and m + 1 to n. The two segments can then be analyzed by determining their parameters.

Computational Experiment
Section 4.1 describes the numerical verification of the formulated mathematical model to validate the model. Real-time data of daily particulate matter concentrations for four different sites in Seoul, South Korea were utilized for this investigation, as given in Section 4.2.

Toy Model for Validation with Known Solution
As shown in Figures 1-3, an artificial random data set is generated that consists of two segments of equal length, 50 data points each. The samples are drawn from the Poisson distributions Poisson(5) and Poisson(2.5), respectively; thus, the change point occurs at the 50th data point. Table 2 describes the results for this artificial data set obtained through the probabilistic method. As shown in Figures 4-11, the following results were acquired by applying the method explained in Section 3.4 to this artificial data set.
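A minimal sketch of this validation experiment, assuming a simple maximum-likelihood split in place of the full Bayesian machinery; Knuth's algorithm is used here only to draw Poisson variates without external libraries:

```python
import math
import random

def poisson_sample(rate, rng):
    # Knuth's algorithm for drawing one Poisson(rate) variate.
    L, k, p = math.exp(-rate), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def best_split(y):
    """Index k maximizing the two-segment Poisson log-likelihood.

    Terms constant across splits (the log-factorials) are dropped.
    """
    n = len(y)
    def ll(seg):
        lam = sum(seg) / len(seg)
        return sum(v * math.log(lam) - lam for v in seg) if lam > 0 else 0.0
    return max(range(2, n - 1), key=lambda k: ll(y[:k]) + ll(y[k:]))

# Two segments of 50 points each, Poisson(5) then Poisson(2.5),
# mirroring the artificial data set described above.
rng = random.Random(42)
y = ([poisson_sample(5.0, rng) for _ in range(50)]
     + [poisson_sample(2.5, rng) for _ in range(50)])
k_hat = best_split(y)   # should land near the true change point, 50
```

With a rate halving between segments, the estimated split reliably falls close to the 50th data point, which is the behavior the toy model is meant to confirm.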
The Normal distribution is symmetric about the mean µ, and σ² represents the variance. The numerical details for PM 2.5 and PM 10 concentrations are given in Tables 3 and 4, respectively. Here, the results were acquired by applying the method explained in Section 3.4 to the particulate matter (PM 2.5 and PM 10) concentrations for four different sites (Guro, Nowon, Songpa and Yongsan) in Seoul, South Korea. The daily data observed from January 2004 to December 2013 were used to compute the change point for both pollutants. The particulate matter (PM 2.5 and PM 10) concentrations are shown in Figures 12-19. A change point for a process with different data structures is identified to determine that a change has occurred, the most likely period in which the change occurred and the parameter behavior before and after the change point. If the particulate matter (PM 2.5 and PM 10) concentrations are Normally distributed and the change point for the random process is denoted by k, it is supposed that the data follow a Normal distribution with mean µ and variance σ² up to point k; after point k, the data are Normally distributed with mean η and variance φ². This can be represented as y_i ∼ N(µ, σ²) for i ≤ k and y_i ∼ N(η, φ²) for i > k, where the notation "∼" means 'is distributed as'.
If y_1, y_2, y_3, ..., y_k ∼ N(µ, σ²) independently, then the joint pdf (probability density function) is given by

p(y_1, y_2, y_3, ..., y_k | µ, σ²) = (2πσ²)^(−k/2) exp(−∑_{i=1}^{k} (y_i − µ)² / (2σ²)).

Expanding the quadratic term in the exponent shows that p(y_1, y_2, y_3, ..., y_k | µ, σ²) depends on the data only through (∑ y_i², ∑ y_i), which therefore make up a two-dimensional sufficient statistic. Knowing the values of these quantities is equivalent to knowing the values of ȳ = ∑ y_i / k and s² = ∑ (y_i − ȳ)² / (k − 1), so (ȳ, s²) is also a sufficient statistic.
Inference for this two-parameter model can be broken down into two one-parameter problems.
Hence, the Bayesian model for the mean before the change point, µ, is based on the sampling model p(y_1, y_2, y_3, ..., y_k | µ, σ²) together with a conjugate prior for µ. The prior parameters θ_0 and k_0 can be interpreted as the mean and the sample size, respectively, of a set of prior observations.
Similarly, the Bayesian model for the mean after the change point, η, is based on the sampling model p(y_{k+1}, y_{k+2}, y_{k+3}, ..., y_n | η, φ²). For σ², a family of prior distributions is required with support on (0, ∞). One such family is the Gamma family; unfortunately, this family is not conjugate for the Normal variance. However, the Gamma family does turn out to be a conjugate class of densities for 1/σ² (the precision). When using such a prior distribution, σ² has an Inverse-Gamma distribution. For interpretability later, instead of using a_1 and b_1, this prior distribution of σ² can be parameterized by (σ²_0, ν_0), which can be interpreted as the sample variance and sample size, respectively, of the prior observations. For posterior inference, the prior distributions and sampling model before a change point, and their analogues after a change point, are specified accordingly.
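Under these priors, the posterior hyperparameters follow the standard normal-inverse-gamma update; the sketch below assumes that textbook form, which matches the interpretation of (θ_0, k_0) and (σ²_0, ν_0) as prior means and prior sample sizes:

```python
def normal_posterior(y, theta0, k0, sigma2_0, nu0):
    """Conjugate update for a Normal mean and variance.

    theta0, k0   : prior mean and prior 'sample size' for the mean
    sigma2_0, nu0: prior variance guess and its prior 'sample size'
    Returns the posterior hyperparameters (theta_n, k_n, sigma2_n, nu_n).
    """
    n = len(y)
    ybar = sum(y) / n
    s2 = sum((v - ybar) ** 2 for v in y) / (n - 1)
    k_n = k0 + n                                  # prior + data sample size
    theta_n = (k0 * theta0 + n * ybar) / k_n      # precision-weighted mean
    nu_n = nu0 + n
    sigma2_n = (nu0 * sigma2_0 + (n - 1) * s2
                + k0 * n * (ybar - theta0) ** 2 / k_n) / nu_n
    return theta_n, k_n, sigma2_n, nu_n
```

The same update applied to the observations after the change point gives the posterior for (η, φ²), and small k_0 and ν_0 recover the improper-prior limit discussed next.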

Improper Priors
Since k_0 and ν_0 are prior sample sizes, the smaller these parameters are, the more objective the estimates will be. Letting k_0 and ν_0 become smaller and smaller yields improper priors, which lead to the corresponding posterior distribution for the variance and, similarly, the posterior distribution for the mean.

Likelihood Ratio Test and Likelihood Function
Since the expected value of a Normal distribution is its mean, the following likelihood ratio test is applied for change point detection, with the likelihood function for the expected value of a Normal distribution determined accordingly. The prior on the change point k is uniform over y_1, ..., y_n. For the probabilistic method, MATLAB was used for change point detection of the particulate matter (PM 2.5 and PM 10) data during the study period 2004-2013 for four different sites (Guro, Nowon, Songpa and Yongsan) in Seoul, South Korea. Ten replications of each simulation were performed with 1100 observations in each replication. The first 100 observations were discarded as a burn-in period, and the mean V_i of the remaining 1000 observations was computed for each replication, as shown in Tables 5 and 6. The mean (V) of the replication means was used to obtain the converged values of the parameters. In addition, the bootstrap analyses of the CUSUM charts are shown in Figures 28-35.

Results
Summarized forms of the particulate matter (PM 2.5 and PM 10) change point (k), the parameters before the change point (mean µ, variance σ²) and the parameters after the change point (mean η, variance φ²) during the study period 2004-2013 for the four sites (Guro, Nowon, Songpa and Yongsan) in Seoul, South Korea are given in Tables 7-10, respectively. The results were computed using the numerical example of the mathematical model given in Section 4. The Air Korea official website exhibits the annual particulate matter trend in Seoul in Figure 36, which shows a decreasing trend during 2004-2013; these particulate matter concentrations are given in µg/m³ [39]. Here, (k) is the last point before the change and (k + 1) is the first point after the change, so the change point lies somewhere between (k) and (k + 1). This method also shows the reduction of PM 2.5 concentrations after the change point, as (µ) represents the mean concentration before the change point and (η) the pollutant concentration after the change point. The variance before the change point (σ²) and the variance after the change point (φ²) have been determined through the formula σ² = ∑(X_i − mean)²/n. Table 8. PM 2.5 last point before change (k) and first point after change (k + 1) through the CUSUM approach (Normal distribution). Table 9 explains the results obtained for PM 10 through the probabilistic method. Hence, the expected change point (k) differs for different areas. These results show the reduction of PM 10 concentrations after the change point (k), where (µ) is the PM 10 concentration before the change point (k) and (η) represents the PM 10 concentration after the change point (k). The variance before the change point (σ²) and the variance after the change point (φ²) have been determined through the Inverse-Gamma distribution with conjugate hyperparameters. The results obtained for PM 10 through the CUSUM approach are described in Table 10, where the last point before the change is (k) and the first point after the change is (k + 1); therefore, the change point lies anywhere between (k) and (k + 1). This method also depicts the reduction of PM 10 concentrations after the change point: (µ) represents the PM 10 concentration before the change point and (η) the PM 10 concentration after the change point. The variance before the change point (σ²) and the variance after the change point (φ²) have been determined through the formula σ² = ∑(X_i − mean)²/n. Table 10. PM 10 last point before change (k) and first point after change (k + 1) through the CUSUM approach (Normal distribution). Guro is located in the southwestern part of Seoul and has an essential location as a transport link for railroads and land routes. The largest digital industrial complex in Korea is located in Guro. Thus, the policies of the Ministry of Environment in South Korea have decreased the particulate matter (PM 2.5 and PM 10) concentrations and the occurrences of polluted days in Guro.

CUSUM Approach
The CUSUM approach also indicates a reduction in PM 2.5 and PM 10 concentrations from (µ) to (η) after the change. For Guro, Tables 8 and 10 show the change of PM 2.5 and PM 10 concentrations through the CUSUM approach, respectively: the change point for PM 2.5 concentrations lies between point 1570 (k) and 1571 (k + 1), and for PM 10 concentrations between point 1836 (k) and 1837 (k + 1).
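As an illustration only (not code from this study), a single mean-shift change point can be located by the CUSUM approach by taking the cumulative sum of deviations from the grand mean and picking the point where its magnitude peaks. The function name and the synthetic series below are hypothetical; the method returns (k), the last point before the change, so the change lies between (k) and (k + 1):

```python
import numpy as np

def cusum_change_point(x):
    """Estimate a single mean-shift change point by the CUSUM method.

    S_i = sum_{j<=i} (x_j - mean(x)); the last point before the change (k)
    is taken where |S_i| is largest, so the change lies between k and k + 1.
    """
    x = np.asarray(x, dtype=float)
    s = np.cumsum(x - x.mean())        # CUSUM of deviations from the grand mean
    k = int(np.argmax(np.abs(s))) + 1  # 1-based index of last point before change
    mu = x[:k].mean()                  # mean before the change point
    eta = x[k:].mean()                 # mean after the change point
    return k, mu, eta

# synthetic series with an assumed mean shift from 60 to 40 at point 100
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(60, 5, 100), rng.normal(40, 5, 100)])
k, mu, eta = cusum_change_point(x)
```

Because the CUSUM statistic is computed directly on the raw observations, this matches the paper's remark that the approach suits deterministic data structures without requiring a distributional model.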

Nowon (Seoul, South Korea)
Nowon is positioned in the northeastern part of Seoul and has the highest population density in the city, with 619,509 persons living in 35.44 km². The area is surrounded by mountains and forests to the northeast. The policies of the Ministry of Environment have improved the particulate matter (PM 2.5 and PM 10) concentrations in Nowon from (µ, σ²) to (η, φ²); the reduction in pollutant concentrations is greater than in Guro.

Probabilistic Method
For Nowon, Tables 7 and 9 present the PM 2.5 and PM 10 change points obtained through the probabilistic method, respectively.

CUSUM Approach
Moreover, the CUSUM approach also validates the reduction of PM concentrations and polluted days. For Nowon, Tables 8 and 10 show the change of PM 2.5 and PM 10 concentrations through the CUSUM approach, respectively: the change point for PM 2.5 concentrations lies between point 1474 (k) and 1475 (k + 1), and for PM 10 concentrations between point 1952 (k) and 1953 (k + 1).

Songpa (Seoul, South Korea)
Songpa is situated in the southeastern part of Seoul and has the largest population, with 647,000 residents. Under the Ministry of Environment policies, there was a significant reduction of pollutant concentrations in Songpa from (µ, σ²) to (η, φ²).

CUSUM Approach
According to the CUSUM approach, there is a decrease in PM concentrations. For Songpa, Tables 8 and 10 show the change of PM 2.5 and PM 10 concentrations through the CUSUM approach, respectively: the change point for PM 2.5 concentrations lies between point 1455 (k) and 1456 (k + 1), and for PM 10 concentrations between point 1515 (k) and 1516 (k + 1).

Yongsan (Seoul, South Korea)
Yongsan is the center of Seoul, where almost 250,000 people reside. Prominent locations in Yongsan include Yongsan station, an electronics market and the Itaewon commercial area, with heavy traffic and transportation. Consequently, the policies of the Ministry of Environment in Yongsan have affected the particulate matter (PM 2.5 and PM 10) concentrations, producing a remarkable decrease from (µ, σ²) to (η, φ²).

CUSUM Approach
The CUSUM approach is applied directly to the raw data, which suits deterministic data structures better. It also shows a reduction in pollutant concentrations. For Yongsan, Tables 8 and 10 show the change of PM 2.5 and PM 10 concentrations through the CUSUM approach, respectively: the change point for PM 2.5 concentrations lies between point 1738 (k) and 1739 (k + 1), and for PM 10 concentrations between point 1795 (k) and 1796 (k + 1).

1.
This model presents a suitable technique for change point detection of diversely distributed data structures for all kinds of stochastic processes.

2.
By detecting change points in different areas, such as climate change detection, human activity analysis and medical condition monitoring, and by analyzing the parameters before and after the change points, the results of legislation efforts can be understood and it can be determined whether these change points are favorable.

3.
A comparison of the parameters before and after a change point evaluates the performance from the previous status to the current status, which can also be helpful for future prediction under the current strategies.

4.
This study of change point detection also defines the current levels of an area under study, which is helpful for designing new policies for further improvements.

5.
This research provides guidance for defining new goals if previously defined goals have been achieved and indicates whether the standards need to be revised to overcome upcoming challenges.

Conclusions
The key motivation of this research work was to develop an appropriate change point detection model for diversely distributed data structures. The proposed probabilistic method was verified against the CUSUM approach and the results of the two methods were compared. The proposed methodology is based on probability distributions and is better suited to random data structures and time series, whereas the CUSUM approach is applied directly to the raw data and is well suited to deterministic data structures. The model is applicable to various stochastic processes because different data structures follow different probability distributions. The parameter expectations before and after a change point were also estimated to measure the effectiveness and performance of the policies applied. To verify the model, four major locations (Guro, Nowon, Songpa and Yongsan) in Seoul, South Korea were chosen as study areas, considering their different characteristics such as climate zone, environment, population and population density. The results were calculated and conclusions were drawn by applying the model to real-time data sets in all cases. The parameters before and after the change point of particulate matter concentrations indicated a reduction in pollutant concentrations over the 10-year period. The annual particulate matter trend in Seoul exhibited on the Air Korea official website (Figure 36) shows a similar decreasing trend during 2004-2013. The overall outcomes of this study indicate the effectiveness of the policies applied to reduce pollutant concentrations over time; however, further reduction in PM concentrations is required to achieve the set standards. This study can be extended further by locating change segments through multiple change points.

The posterior probability of a change point at time k, given the data Y and the parameters before and after the change point, is

f(k | Y, parameters before change point, parameters after change point) = L(Y; k, parameters before change point, parameters after change point) / Σⁿⱼ₌₁ L(Y; j, parameters before change point, parameters after change point), for k = 1, 2, ..., n.

The procedure can be summarized as follows:

1. Define the change point k and divide the process into two segments.
2. Define the parameters before (µ1, µ2, ..., µn) and after (η1, η2, ..., ηn) the change point k.
3. Develop a conjugate model for each parameter of the probability distribution function f(y; µ1, µ2, ..., µn).
4. Determine the likelihood function for change point detection.
5. Apply the likelihood ratio test to detect the numerical value of the change point k.
6. Estimate the converged values of all parameters by running multiple replications of the simulation.

3.6. Comparison Method for Change Point Detection
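As an illustration only (not the paper's implementation), the posterior over k above can be sketched for Poisson-distributed counts like the artificial data set discussed elsewhere in the paper. Here the segment rates are hypothetical plug-in maximum-likelihood estimates (segment means) rather than the conjugate estimates used in the study; terms constant in k (the log y! factors) cancel when normalizing:

```python
import numpy as np

def change_point_posterior(y):
    """Posterior f(k | Y) over the change point k for Poisson counts.

    For each candidate k (number of points before the change), the two
    segment rates are plugged in as their maximum-likelihood estimates;
    the likelihoods are normalized by their sum over all candidate k,
    matching the likelihood-ratio form of the posterior.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    loglik = np.full(n, -np.inf)
    for k in range(1, n):
        lam1, lam2 = y[:k].mean(), y[k:].mean()
        loglik[k] = (y[:k].sum() * np.log(lam1) - k * lam1
                     + y[k:].sum() * np.log(lam2) - (n - k) * lam2)
    w = np.exp(loglik - loglik.max())   # stabilize before normalizing
    return w / w.sum()                  # f(k | Y), summing to 1 over k

# synthetic Poisson counts with an assumed rate change from 10 to 4 at k = 80
rng = np.random.default_rng(1)
y = np.concatenate([rng.poisson(10, 80), rng.poisson(4, 70)])
post = change_point_posterior(y)
k_hat = int(np.argmax(post))            # most probable change point
```

Because the posterior is an explicit distribution over k, it also quantifies the uncertainty of the change point, not just its location.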

Figure 2. Artificial data set for Poisson distribution before change point.

Figure 3. Artificial data set for Poisson distribution after change point.

Figure 4. Artificial data set time series for Poisson distribution.

Figure 9. Rate before change point density histogram.

Figure 11. Rate after change point density histogram.

4.2. Particulate Matter (PM 2.5 and PM 10) Change Points for Four Different Sites

Particulate matter (PM 2.5 and PM 10) concentrations are considered Normally distributed. A random variable Y is Normally distributed with mean µ and variance σ² > 0 if the probability distribution function at any given point y in the sample space is given as follows:

f(y; µ, σ²) = (1/√(2πσ²)) exp(−(y − µ)²/(2σ²)).

It has been shown that the variance before the change point (σ²) and the variance after the change point (φ²) follow an Inverse-Gamma distribution.
Moreover, the CUSUM charts of particulate matter (PM 2.5 and PM 10 ) concentrations are shown in Figures 20-27 for the four different sites Guro, Nowon, Songpa and Yongsan in Seoul, South Korea.

5.1. PM 2.5 Change Point (k) through Probabilistic Method

Table 7 describes the results obtained through the probabilistic method, where (k) is the predicted change point, which varies for different areas. The results indicate a reduction of PM 2.5 concentrations after the change point (k), where (µ) represents the mean concentration before the change point (k) and (η) the mean concentration after it. The variance before the change point (σ²) and the variance after the change point (φ²) were determined through an Inverse-Gamma distribution with conjugate hyper-parameters.
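As a minimal sketch (not the paper's code) of the Inverse-Gamma conjugate update mentioned above: for Normal data with a known mean, an Inverse-Gamma(α₀, β₀) prior on the variance yields an Inverse-Gamma posterior in closed form. The hyper-parameter values and the synthetic data below are illustrative assumptions:

```python
import numpy as np

def posterior_variance(x, mu, alpha0=2.0, beta0=1.0):
    """Conjugate update of an Inverse-Gamma(alpha0, beta0) prior on the
    variance of Normal data with known mean mu.

    Posterior: Inverse-Gamma(alpha0 + n/2, beta0 + 0.5 * sum((x - mu)^2)).
    Returns the posterior hyper-parameters and the posterior mean of the
    variance, beta / (alpha - 1), valid for alpha > 1.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    alpha = alpha0 + n / 2.0
    beta = beta0 + 0.5 * np.sum((x - mu) ** 2)
    post_mean = beta / (alpha - 1.0)
    return alpha, beta, post_mean

# synthetic Normal data with assumed mean 50 and true variance 64
rng = np.random.default_rng(2)
x = rng.normal(50.0, 8.0, 2000)
alpha, beta, var_hat = posterior_variance(x, mu=50.0)
```

Applying this update separately to the observations before and after the change point gives the posterior distributions of (σ²) and (φ²) referred to in the tables.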

Table 1. Previous studies on this topic.

Table 2. Change point (k) for artificial data set.

Table 5. PM 2.5 converged values of parameters (probabilistic method).

Table 6. PM 10 converged values of parameters (probabilistic method).