Estimating Distributions of Parameters in Nonlinear State Space Models with Replica Exchange Particle Marginal Metropolis–Hastings Method

Extracting latent nonlinear dynamics from observed time-series data is important for understanding a dynamic system against the background of the observed data. A state space model is a probabilistic graphical model for time-series data, which describes the probabilistic dependence between latent variables at subsequent time steps and between latent variables and observations. Since the values of the parameters in a state space model are often unknown, estimating the parameters from observations is an important task. The particle marginal Metropolis–Hastings (PMMH) method estimates the marginal posterior distribution of parameters obtained by marginalization over the distribution of latent variables in the state space model. Although, in principle, the marginal posterior distribution of parameters can be estimated by iterating this method infinitely, in practice the estimated result depends on the initial values when only a finite number of iterations is available. In this paper, we propose a replica exchange particle marginal Metropolis–Hastings (REPMMH) method, which combines the PMMH method with the replica exchange method, to mitigate this initial-value dependence. The proposed method simultaneously realizes a global search at high temperatures and a fine local search at low temperatures. We evaluate the proposed method using simulated data obtained from the Izhikevich neuron model and the Lévy-driven stochastic volatility model, and we show that the proposed REPMMH method reduces the initial-value dependence of the PMMH method and realizes efficient sampling of parameters in state space models compared with existing methods.


Introduction
Extracting latent nonlinear dynamics from observed time-series data is important for understanding the dynamic system against the background of the observed data. A state space model is a probabilistic graphical model for time-series data that assumes the existence of latent variables that cannot be observed directly. State space models are used in various fields to forecast observation values [7,15,22] and to estimate latent variables [11,20,21,26]. In many cases, however, the model parameters are unknown. Therefore, estimating the model parameters from observations is an important task for state space models.
To estimate the parameters of state space models from observations, a method combining the sequential Monte Carlo (SMC) method [2,4,8–12,14,16–20,23–25,27] with the expectation-maximization (EM) algorithm [8,28,29] has been proposed [9–11,14,20]. This method is based on a maximum likelihood estimation framework and estimates parameters by sequentially updating them so that the likelihood increases. Although iteratively updating the parameters is guaranteed to reach a local optimum, the global optimum may not be found, depending on the initial values of the parameters. Furthermore, since the EM algorithm is a point estimation method, it is not possible to identify whether the converged values are local or global optima.
To estimate the distribution of parameters in a state space model, two kinds of particle Markov chain Monte Carlo (PMCMC) methods have been proposed: the particle Gibbs (PG) method and the particle marginal Metropolis-Hastings (PMMH) method [12]. Both methods combine Markov chain Monte Carlo (MCMC) methods with the SMC method and estimate the distribution of parameters by collecting samples. The PG method combines the SMC method with Gibbs sampling [8,30] and alternately samples latent variables and parameters in a state space model from their joint posterior distribution; in the PG method, the SMC method is employed for sampling the latent variables. The PMMH method, on the other hand, combines the SMC method with the Metropolis-Hastings (MH) algorithm [8,25,31,32] and samples parameters in a state space model from the marginal posterior distribution of parameters; in the PMMH method, the SMC method is employed for calculating the likelihood marginalized over the distribution of latent variables. Both methods have been widely applied (for the PG method, see [33,34]; for the PMMH method, see [35–37]).
In recent years, extended versions of the PG method have been proposed to improve the sampling efficiency and initial value dependence, such as the particle Gibbs with ancestor sampling (PGAS) method and the replica exchange particle Gibbs with ancestor sampling (REPGAS) method [16,18,19,24]. Thus far, however, such improvements have been proposed only for the PG method. Since the PMMH method may also suffer from initial value dependence, improving the PMMH method is important for the accurate estimation of parameters.
In this paper, we propose the replica exchange particle marginal Metropolis-Hastings (REPMMH) method, which combines the PMMH method with the replica exchange method [24,38–40] to mitigate the initial-value dependence of the PMMH method. Combining the replica exchange method with the PMMH method makes it possible to estimate the parameters governing the dynamics of very complex, nonlinear time-series data. We first describe state space models and explain the PMMH method as a conventional method. Then, after explaining our proposed method, we conduct experiments to compare the proposed method with the conventional methods, the PMMH method and the REPGAS method.

Methods
In this section, we propose our replica exchange particle marginal Metropolis-Hastings (REPMMH) method. First, we describe a state space model for time-series data using a probabilistic graphical model. Next, we describe the conventional particle marginal Metropolis-Hastings (PMMH) method, which estimates the marginal posterior distribution of parameters obtained by marginalization over the distribution of latent variables in a state space model. After this, we propose the REPMMH method, which combines the PMMH method with the replica exchange method to mitigate the initial-value dependence of the PMMH method.

State Space Model
We show a state space model as a probabilistic graphical model in Figure 1. Note that there are two types of variables, latent variables $z_{1:N} = \{z_1, z_2, \ldots, z_N\}$ and observations $y_{1:N} = \{y_1, y_2, \ldots, y_N\}$, at the time steps $\{1, 2, \ldots, N\}$ in the state space model. The latent variables $z_{1:N}$ cannot be observed directly; only the observations $y_{1:N}$ are observable. At a time step $n$, the state space model is represented as follows:

$$z_n \sim f(z_n \mid z_{n-1}, \theta_f), \qquad y_n \sim g(y_n \mid z_n, \theta_g),$$

where $f(z_n \mid z_{n-1}, \theta_f)$ and $g(y_n \mid z_n, \theta_g)$ are called the system model and the observation model, respectively. $\theta_f$ are the parameters of the system model, and $\theta_g$ are the parameters of the observation model. The system model $f(z_n \mid z_{n-1}, \theta_f)$ represents the process of updating the latent variables $z_n$ from the previous latent variables $z_{n-1}$. The observation model $g(y_n \mid z_n, \theta_g)$ represents the process of obtaining the observations $y_n$ from the latent variables $z_n$. In Figure 1, the arrow to the latent variables $z_n$ at time step $n$ from the latent variables $z_{n-1}$ at the previous time step $n-1$ represents the system model $f(z_n \mid z_{n-1}, \theta_f)$, and the arrow to the observations $y_n$ at time step $n$ from the latent variables $z_n$ represents the observation model $g(y_n \mid z_n, \theta_g)$. $\Theta = \{\theta_f, \theta_g\}$ are the parameters to be estimated.
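As a concrete illustration, the generative process defined by a system model and an observation model can be simulated directly. The following sketch uses a hypothetical one-dimensional nonlinear model; the cosine transition, the noise scales, and the function name are illustrative assumptions, not the models used later in this paper.

```python
import numpy as np

def simulate_ssm(theta_f, theta_g, n_steps, rng):
    """Simulate a toy 1-D state space model (illustrative assumption):
    system model       z_n = theta_f * cos(z_{n-1}) + system noise,
    observation model  y_n = z_n + observation noise with std theta_g."""
    z = np.empty(n_steps)
    y = np.empty(n_steps)
    z[0] = rng.normal()                       # z_1 ~ p(z_1)
    y[0] = z[0] + theta_g * rng.normal()      # y_1 ~ g(y_1 | z_1, theta_g)
    for n in range(1, n_steps):
        z[n] = theta_f * np.cos(z[n - 1]) + 0.5 * rng.normal()
        y[n] = z[n] + theta_g * rng.normal()
    return z, y

rng = np.random.default_rng(0)
z, y = simulate_ssm(theta_f=2.0, theta_g=1.0, n_steps=100, rng=rng)
```

Only `y` would be available to an estimation method; `z` and the parameters `theta_f`, `theta_g` play the roles of $z_{1:N}$ and $\Theta$ above.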
The goal of this paper is to estimate the posterior distribution of the parameters $p(\Theta \mid y_{1:N})$ for the given observations $y_{1:N}$, where $\Theta = \{\theta_f, \theta_g\}$. However, since latent variables exist in state space models, it is necessary to perform marginalization with respect to the latent variables in order to obtain the marginal posterior distribution $p(\Theta \mid y_{1:N})$. Because it is often difficult to calculate the marginal posterior distribution $p(\Theta \mid y_{1:N})$ analytically, we propose a new method for estimating it based on the PMMH method in this paper.
In the PMMH method, the marginal likelihood $p(y_{1:N} \mid \Theta)$ is used to evaluate the appropriateness of parameters $\Theta$. Here, the SMC method is used to calculate the marginal likelihood $p(y_{1:N} \mid \Theta)$ of the parameters $\Theta$ obtained by marginalization over the distribution of latent variables $z_{1:N}$. A new sample candidate of parameters $\Theta^* = \{\theta_f^*, \theta_g^*\}$ is proposed from an arbitrary proposal distribution $q(\Theta \mid \Theta[k-1])$ given the previous sample $\Theta[k-1]$, where $k$ is the sample number. Whether to accept or reject the sample candidate $\Theta^*$ is determined based on the following acceptance probability:

$$p_{\mathrm{accept}} = \min\left(1,\; \frac{p(y_{1:N} \mid \Theta^*)\, p(\Theta^*)\, q(\Theta[k-1] \mid \Theta^*)}{p(y_{1:N} \mid \Theta[k-1])\, p(\Theta[k-1])\, q(\Theta^* \mid \Theta[k-1])}\right), \tag{3}$$

where $p(\Theta)$ represents the prior distribution of parameters. $p(y_{1:N} \mid \Theta)$ is the marginal likelihood obtained by marginalization over the distributions of latent variables $z_{1:N}$ as follows:

$$p(y_{1:N} \mid \Theta) = \int p(z_1) \prod_{n=2}^{N} f(z_n \mid z_{n-1}, \theta_f) \prod_{n=1}^{N} g(y_n \mid z_n, \theta_g)\, dz_{1:N},$$

where $p(z_1)$ is the distribution of the latent variables $z_1$ at time step 1. Since it is difficult to obtain the marginal likelihood $p(y_{1:N} \mid \Theta)$ analytically, the SMC method is used in the PMMH method to calculate it numerically. The SMC method estimates the distribution of latent variables by approximating it with the density of the particles $z_{1:N}^{(1)}, z_{1:N}^{(2)}, \ldots, z_{1:N}^{(M)}$ as follows:

$$p(z_{1:N} \mid y_{1:N}, \Theta) \approx \frac{1}{M} \sum_{m=1}^{M} \delta\!\left(z_{1:N} - z_{1:N}^{(m)}\right),$$

where $z_{1:N}^{(m)}$ is the $m$-th particle, $M$ is the number of particles, and $\delta(\cdot)$ is the Dirac delta distribution.
To obtain particles $z_n^{(1)}, \ldots, z_n^{(M)}$ at time step $n$, particles are drawn from the system model $f(z_n \mid z_{n-1}^{(\bar{m})}, \theta_f)$ using the resampled particles at the previous time step. The obtained particles are then weighted by the observation model, $w_n^{(m)} = g(y_n \mid z_n^{(m)}, \theta_g)$, and resampled according to the normalized weights. By iterating this flow for time steps $n \in \{1, 2, \ldots, N\}$, particles that approximate the distribution of latent variables $z_{1:N}$ can be obtained. Here, the marginal likelihood $p(y_{1:N} \mid \Theta)$ can be calculated approximately as follows:

$$p(y_{1:N} \mid \Theta) \approx \prod_{n=1}^{N} \frac{1}{M} \sum_{m=1}^{M} w_n^{(m)}. \tag{9}$$

By calculating the acceptance probability $p_{\mathrm{accept}}$ in Equation (3) with the marginal likelihood $p(y_{1:N} \mid \Theta^*)$ for the sample candidate $\Theta^*$ obtained by Equation (9), it is determined whether to accept or reject the proposed sample candidate $\Theta^*$. We show the flow of the PMMH method described above in Algorithm 1.

Algorithm 1: PMMH method
1: initialize the parameters $\Theta[0]$
2: for $k = 1, \ldots, K$ do
3:   draw the sample candidate of parameters $\Theta^* \sim q(\Theta^* \mid \Theta[k-1])$
4:   draw the initial particles $z_1^{(m)} \sim p(z_1)$ for $m = 1, \ldots, M$
5:   calculate the weights of particles $w_1^{(m)} = g(y_1 \mid z_1^{(m)}, \theta_g^*)$
6:   resample the particles $z_1^{(1)}, \ldots, z_1^{(M)}$ according to the normalized weights ($\bar{m}$ is the particle number of the particle that is the source of resampling)
7:   for $n = 2, \ldots, N$ do
8:     draw the particles $z_n^{(m)} \sim f(z_n \mid z_{n-1}^{(\bar{m})}, \theta_f^*)$ for $m = 1, \ldots, M$
9:     calculate the weights of particles $w_n^{(m)} = g(y_n \mid z_n^{(m)}, \theta_g^*)$
10:    resample the particles $z_n^{(1)}, \ldots, z_n^{(M)}$ according to the normalized weights
11:  end for
12:  calculate the marginal likelihood $p(y_{1:N} \mid \Theta^*)$ with Equation (9)
13:  calculate the acceptance probability $p_{\mathrm{accept}}$ with Equation (3)
14:  draw a uniform random number $\alpha \sim U(0, 1)$ ($U(a, b)$ is a uniform distribution with range $[a, b)$)
15:  if $\alpha \le p_{\mathrm{accept}}$ then accept the candidate, $\Theta[k] \leftarrow \Theta^*$; otherwise $\Theta[k] \leftarrow \Theta[k-1]$
16: end for
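The PMMH loop above can be sketched compactly. The following is a minimal illustration for a toy linear-Gaussian model; the model, the flat prior, the Gaussian random-walk proposal, and all parameter values are assumptions for demonstration, and in the settings studied later the bootstrap filter would instead use the relevant system and observation models.

```python
import numpy as np

def log_marginal_likelihood(theta, y, n_particles, rng):
    """Bootstrap particle filter estimate of log p(y_{1:N} | theta) for the
    toy model z_1 ~ N(0,1), z_n = theta * z_{n-1} + N(0,1), y_n = z_n + N(0,1)."""
    z = rng.normal(size=n_particles)                 # particles for z_1
    log_like = 0.0
    for n in range(len(y)):
        # weights w_n^(m) = g(y_n | z_n^(m)) for a unit-variance Gaussian
        logw = -0.5 * (y[n] - z) ** 2 - 0.5 * np.log(2.0 * np.pi)
        log_like += np.logaddexp.reduce(logw) - np.log(n_particles)
        w = np.exp(logw - logw.max())
        z = z[rng.choice(n_particles, size=n_particles, p=w / w.sum())]  # resample
        z = theta * z + rng.normal(size=n_particles)                     # propagate
    return log_like

def pmmh(y, theta0, n_iters, step, n_particles, rng):
    """PMMH with a Gaussian random-walk proposal and a flat prior, so the
    acceptance ratio reduces to the ratio of estimated marginal likelihoods."""
    theta = theta0
    ll = log_marginal_likelihood(theta, y, n_particles, rng)
    samples = np.empty(n_iters)
    for k in range(n_iters):
        theta_star = theta + step * rng.normal()
        ll_star = log_marginal_likelihood(theta_star, y, n_particles, rng)
        if np.log(rng.uniform()) <= ll_star - ll:    # MH accept/reject
            theta, ll = theta_star, ll_star
        samples[k] = theta
    return samples

# generate toy data with theta = 0.7 and run a short PMMH chain
rng = np.random.default_rng(0)
z_true = np.zeros(50)
for n in range(1, 50):
    z_true[n] = 0.7 * z_true[n - 1] + rng.normal()
y = z_true + rng.normal(size=50)
samples = pmmh(y, theta0=0.0, n_iters=100, step=0.3, n_particles=100, rng=rng)
```

Note that the noisy likelihood estimate is reused until the next acceptance; this is what makes the algorithm an exact ("pseudo-marginal") MH scheme rather than an approximation.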

Proposed Method
In our study, we propose the REPMMH method, which combines the PMMH method with the replica exchange method [24,[38][39][40] to improve the problem of initial value dependence in the PMMH method. By employing the REPMMH method, we estimate the marginal posterior distribution of parameters from the time-series observations.

Brief Summary of Our Proposed Method
We show the schematic diagram of the REPMMH method in Figure 2. In our proposed REPMMH method, we introduce multiple replicas of parameters $\{\Theta\} = \{\Theta^{(1)}, \Theta^{(2)}, \ldots, \Theta^{(r)}, \ldots, \Theta^{(R)}\}$ at temperatures $T = \{T^{(1)}, T^{(2)}, \ldots, T^{(r)}, \ldots, T^{(R)}\}$ into the PMMH method. As shown in the middle part of Figure 2, we employ the PMMH method in parallel at each temperature. In the PMMH method at each temperature $T^{(r)}$, we obtain the respective tempered marginal likelihood $p(y_{1:N} \mid \Theta^{(r)*})^{1/T^{(r)}}$. The target distribution becomes smoother as the temperature becomes higher, so it becomes easier to obtain samples from a wide range. Furthermore, exchanges between the samples at different temperatures are conducted in order to realize transitions that are difficult for the conventional PMMH method, depending on the initial values.

[Figure 2. Schematic diagram of the proposed REPMMH method: observation data enter the MH algorithm and the SMC method run in parallel at high and low temperatures, with proposed parameters, marginal likelihoods, accept/reject decisions, and the estimated posterior distribution as output.] By the SMC method, the marginalization over the time series of latent variables $z_{1:N}$ is conducted iteratively for time steps $n = 1, 2, \ldots, N$. In the MH algorithm, the tempered marginal likelihood $p(y_{1:N} \mid \Theta^{(r)*})^{1/T^{(r)}}$ is used to determine whether to accept or reject the sample candidate. In the REPMMH method, exchanges between samples at different temperatures are considered in order to achieve transitions that are difficult to achieve with the particle marginal Metropolis-Hastings (PMMH) method. These transitions can be realized by passing through a high temperature due to exchange between temperatures, as shown by the red arrows in the MH algorithm. (d) The estimated posterior distributions of parameters $\Theta$ as the output.

Introducing the Replica Exchange Method into the PMMH Method
Here, we propose the REPMMH method to accurately estimate the distribution of parameters from observed data $y_{1:N}$. In our proposed method, we introduce replicas of parameters $\Theta^{(1)}, \ldots, \Theta^{(r)}, \ldots, \Theta^{(R)}$ at temperatures $T^{(1)}, \ldots, T^{(r)}, \ldots, T^{(R)}$ and consider the extended joint marginal posterior distribution as follows:

$$\pi_{\mathrm{EX}}(\{\Theta\} \mid y_{1:N}) = \prod_{r=1}^{R} \pi_{T^{(r)}}\!\left(\Theta^{(r)} \mid y_{1:N}\right), \tag{10}$$

where $\pi_{T^{(r)}}(\Theta^{(r)} \mid y_{1:N})$ expresses the marginal posterior distribution at temperature $T^{(r)}$, which is expressed by using the original marginal posterior distribution $p(\Theta^{(r)} \mid y_{1:N})$ of the parameters $\Theta^{(r)}$ at the temperature $T^{(1)} = 1.0$ as follows:

$$\pi_{T^{(r)}}\!\left(\Theta^{(r)} \mid y_{1:N}\right) = \frac{1}{Z_{T^{(r)}}}\, p\!\left(\Theta^{(r)} \mid y_{1:N}\right)^{1/T^{(r)}}, \tag{11}$$

where $Z_{T^{(r)}}$ is a partition function. Note that, as expressed in Equation (11), at sufficiently high temperatures, the distribution of parameters becomes closer to a uniform distribution, independent of the values of $y_{1:N}$. The distribution with $T^{(1)} = 1.0$ corresponds to the original marginal posterior distribution $p(\Theta \mid y_{1:N})$ to be investigated.
The marginal posterior distribution at each temperature, $p(\Theta^{(r)} \mid y_{1:N})$, is obtained using Bayes' theorem as follows:

$$p\!\left(\Theta^{(r)} \mid y_{1:N}\right) = \frac{p\!\left(y_{1:N} \mid \Theta^{(r)}\right) p\!\left(\Theta^{(r)}\right)}{p(y_{1:N})} \propto p\!\left(y_{1:N} \mid \Theta^{(r)}\right) p\!\left(\Theta^{(r)}\right). \tag{12}$$

Namely, the marginal posterior distribution $p(\Theta^{(r)} \mid y_{1:N})$ is proportional to the product of the marginal likelihood $p(y_{1:N} \mid \Theta^{(r)})$ and the prior distribution $p(\Theta^{(r)})$ of the parameters $\Theta^{(r)}$.
To obtain the marginal likelihood $p(y_{1:N} \mid \Theta^{(r)})$, marginalization of the joint distribution at each temperature should be conducted as follows:

$$p\!\left(y_{1:N} \mid \Theta^{(r)}\right) = \int p\!\left(z_1^{(r)}\right) \prod_{n=2}^{N} f\!\left(z_n^{(r)} \mid z_{n-1}^{(r)}, \theta_f^{(r)}\right) \prod_{n=1}^{N} g\!\left(y_n \mid z_n^{(r)}, \theta_g^{(r)}\right) dz_{1:N}^{(r)}, \tag{13}$$

where $z_{1:N}^{(r)}$ are the latent variables at the temperature $T^{(r)}$. By performing the SMC method for all the time steps at each temperature, the marginalization is conducted numerically.
As shown in Figure 2b,c, in the proposed method, the SMC method and the MH algorithm are conducted for each temperature. In the SMC method, the marginal likelihood $p(y_{1:N} \mid \Theta^{(r)*})$ is computed by numerical marginalization using the candidate of parameters $\Theta^{(r)*}$ proposed in the MH algorithm.
In the MH algorithm, the candidate of parameters $\Theta^{(r)*}$ is accepted or rejected at each temperature $T^{(r)}$ according to the tempered marginal posterior $\pi_{T^{(r)}}(\Theta^{(r)*} \mid y_{1:N})$ (Figure 2b). Here, the acceptance probability $p_{\mathrm{accept}}^{(r)}$ at each temperature is calculated as follows:

$$p_{\mathrm{accept}}^{(r)} = \min\left(1,\; \frac{\left\{p\!\left(y_{1:N} \mid \Theta^{(r)*}\right) p\!\left(\Theta^{(r)*}\right)\right\}^{1/T^{(r)}} q\!\left(\Theta^{(r)}[k-1] \mid \Theta^{(r)*}\right)}{\left\{p\!\left(y_{1:N} \mid \Theta^{(r)}[k-1]\right) p\!\left(\Theta^{(r)}[k-1]\right)\right\}^{1/T^{(r)}} q\!\left(\Theta^{(r)*} \mid \Theta^{(r)}[k-1]\right)}\right). \tag{14}$$

Moreover, we exchange samples between different temperatures $T^{(r)}$ and $T^{(r+1)}$ according to the exchange probability as follows:

$$p_{\mathrm{EX}}\!\left(\{\Theta\}, \{\Theta\}^*\right) = \min\left(1,\; \frac{\pi_{T^{(r)}}\!\left(\Theta^{(r+1)} \mid y_{1:N}\right) \pi_{T^{(r+1)}}\!\left(\Theta^{(r)} \mid y_{1:N}\right)}{\pi_{T^{(r)}}\!\left(\Theta^{(r)} \mid y_{1:N}\right) \pi_{T^{(r+1)}}\!\left(\Theta^{(r+1)} \mid y_{1:N}\right)}\right), \tag{15}$$

where $\{\Theta\}^*$ is expressed as follows:

$$\{\Theta\}^* = \left\{\Theta^{(1)}, \ldots, \Theta^{(r+1)}, \Theta^{(r)}, \ldots, \Theta^{(R)}\right\}. \tag{16}$$

Note that the exchange probability $p_{\mathrm{EX}}(\{\Theta\}, \{\Theta\}^*)$ corresponds to the Metropolis criterion for proposing to exchange the samples between different temperatures $T^{(r)}$ and $T^{(r+1)}$. By deciding whether to accept or reject the proposed samples $\{\Theta\}^*$ with the Metropolis criterion of Equation (15), the transition probability $W(\{\Theta\} \to \{\Theta\}^*) = q(\{\Theta\}^* \mid \{\Theta\})\, p_{\mathrm{EX}}(\{\Theta\}, \{\Theta\}^*)$ for the exchange process satisfies the detailed balance condition as follows:

$$\pi_{\mathrm{EX}}(\{\Theta\} \mid y_{1:N})\, W\!\left(\{\Theta\} \to \{\Theta\}^*\right) = \pi_{\mathrm{EX}}(\{\Theta\}^* \mid y_{1:N})\, W\!\left(\{\Theta\}^* \to \{\Theta\}\right), \tag{17}$$

where $q(\{\Theta\}^* \mid \{\Theta\})$ is the proposal probability for $\{\Theta\}^*$, and the proposal probability of the exchange process is symmetric: $q(\{\Theta\}^* \mid \{\Theta\}) = q(\{\Theta\} \mid \{\Theta\}^*)$. Thus, since the exchange process in the proposed method satisfies the detailed balance condition, the proposed method can sample from the distribution $\pi_{\mathrm{EX}}(\{\Theta\} \mid y_{1:N})$. By this exchange process, the REPMMH method makes it possible to mitigate the initial-value dependence of the PMMH method. The sampled distributions of the replicas $\pi_{T^{(r)}}(\Theta^{(r)} \mid y_{1:N})$ at higher temperatures ideally become closer to a uniform distribution:

$$\lim_{T^{(r)} \to \infty} \pi_{T^{(r)}}\!\left(\Theta^{(r)} \mid y_{1:N}\right) = \mathrm{const}. \tag{18}$$

Therefore, in practice, it becomes possible to escape from local optima at sufficiently high temperatures (Figure 2b). Moreover, the samples do not stay in one local optimum, since each replica is exchanged between high and low temperatures repeatedly, and we can sample the parameters efficiently. We show the flow of our REPMMH method described above in Algorithm 2.
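The exchange step can be sketched as follows. This minimal implementation assumes a flat prior, so only the log marginal likelihoods and the inverse temperatures enter the Metropolis swap criterion; the function name and the even/odd sweep over neighbouring pairs are illustrative assumptions.

```python
import numpy as np

def exchange_sweep(thetas, log_liks, temps, start, rng):
    """One sweep of swap proposals between neighbouring temperatures
    (r, r+1) for r = start, start+2, ...; log_liks[r] holds the SMC
    estimate of log p(y_{1:N} | theta^(r)).  With a flat prior the
    log swap ratio is (1/T_r - 1/T_{r+1}) * (log_lik_{r+1} - log_lik_r)."""
    for r in range(start, len(temps) - 1, 2):
        log_ratio = (1.0 / temps[r] - 1.0 / temps[r + 1]) \
                    * (log_liks[r + 1] - log_liks[r])
        if np.log(rng.uniform()) <= log_ratio:       # Metropolis criterion
            thetas[r], thetas[r + 1] = thetas[r + 1], thetas[r]
            log_liks[r], log_liks[r + 1] = log_liks[r + 1], log_liks[r]
    return thetas, log_liks

rng = np.random.default_rng(0)
thetas = [0.0, 1.0]
log_liks = [0.0, 10.0]        # the hotter replica holds the better sample
thetas, log_liks = exchange_sweep(thetas, log_liks, [1.0, 2.0], start=0, rng=rng)
# here the log swap ratio is positive, so the swap is always accepted
# and thetas becomes [1.0, 0.0]
```

Swapping the (sample, log-likelihood) pairs rather than re-running the SMC method keeps the cost of the exchange process negligible compared with the particle filter, as noted in the experiments below.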

Relations among Particle Markov Chain Monte Carlo Methods
We briefly summarize the differences among the conventional particle Markov chain Monte Carlo (PMCMC) methods and our proposed REPMMH method, all of which can estimate the parameters of a state space model, in Table 1. The particle Gibbs (PG) method is another PMCMC method; it samples latent variables and parameters in a state space model alternately by using Gibbs sampling [8,30]. The PMMH method combines the SMC method with the MH algorithm, whereas the PG method combines the SMC method with Gibbs sampling. While the SMC method is employed to calculate the marginal likelihood $p(y_{1:N} \mid \Theta)$ of parameters $\Theta$ in the PMMH method, it is employed to obtain samples of the latent variables $z_{1:N}$ in the PG method [12]. The PMMH method directly targets the marginal posterior distribution $p(\Theta \mid y_{1:N})$, whereas the PG method targets the joint posterior distribution $p(z_{1:N}, \Theta \mid y_{1:N})$ [12]. Note that the SMC method used in the PG method is called the conditional SMC method and uses the previous sample of latent variables $z_{1:N}[k-1]$ as a particle [12]. Furthermore, advanced versions of the PG method have been proposed, such as the particle Gibbs with ancestor sampling (PGAS) method for improving sampling efficiency and the replica exchange particle Gibbs with ancestor sampling (REPGAS) method for mitigating initial value dependence [16,18,19,24]. Samples obtained by the PMMH method also suffer from initial value dependence, similar to those obtained by the PG method, and it is therefore considered that combining the PMMH method with the replica exchange method is effective.

Algorithm 2: REPMMH method
1: initialize the parameters $\Theta^{(r)}[0]$ for $r = 1, \ldots, R$
2: for $k = 1, \ldots, K$ do
3:   for $r = 1, \ldots, R$ do
4:     draw the sample candidate of parameters $\Theta^{(r)*} \sim q(\Theta^{(r)*} \mid \Theta^{(r)}[k-1])$
5:     calculate the marginal likelihood $p(y_{1:N} \mid \Theta^{(r)*})$ by using the SMC method according to Equation (13)
6:     calculate the acceptance probability $p_{\mathrm{accept}}^{(r)}$ with Equation (14)
7:     draw a uniform random number $\alpha \sim U(0, 1)$ ($U(a, b)$ is a uniform distribution with range $[a, b)$)
8:     if $\alpha \le p_{\mathrm{accept}}^{(r)}$ then accept the candidate, $\Theta^{(r)}[k] \leftarrow \Theta^{(r)*}$; otherwise $\Theta^{(r)}[k] \leftarrow \Theta^{(r)}[k-1]$
9:   end for
10:  choose the replica number $r_{\mathrm{EX}} \leftarrow 1$ or $r_{\mathrm{EX}} \leftarrow 2$ for the exchange
11:  repeat
12:    calculate the exchange probability $p_{\mathrm{EX}}(\{\Theta\}, \{\Theta\}^*)$ with Equation (15) for replica numbers $r_{\mathrm{EX}}$ and $r_{\mathrm{EX}} + 1$
13:    draw a uniform random number $\alpha_{\mathrm{EX}} \sim U(0, 1)$
14:    if $\alpha_{\mathrm{EX}} \le p_{\mathrm{EX}}(\{\Theta\}, \{\Theta\}^*)$ then exchange the samples $\Theta^{(r_{\mathrm{EX}})}[k]$ and $\Theta^{(r_{\mathrm{EX}}+1)}[k]$
15:    set the replica number $r_{\mathrm{EX}} \leftarrow r_{\mathrm{EX}} + 2$ for the exchange
16:  until $r_{\mathrm{EX}} > R - 1$
17: end for

Table 1. Relations among PMCMC methods (target distribution and description).

- PG (target: $p(z_{1:N}, \Theta \mid y_{1:N})$): Samples parameters $\Theta$ and latent variables $z_{1:N}$ alternately with Gibbs sampling, targeting the joint posterior distribution. The SMC method is used for sampling the latent variables $z_{1:N}$; the variant used in the PG method is called the conditional SMC method and uses the previous sample of latent variables $z_{1:N}[k-1]$ as a particle [12].
- PGAS (target: $p(z_{1:N}, \Theta \mid y_{1:N})$): Samples latent variables $z_{1:N}$ not only in the forward direction but also in the backward direction in the PG method [16,18,19].
- REPGAS (target: $p(z_{1:N}, \Theta \mid y_{1:N})$): Mitigates the initial value dependence of the PGAS method by combining the replica exchange method with the PGAS method [24].
- PMMH (target: $p(\Theta \mid y_{1:N})$): Samples parameters $\Theta$ with the MH algorithm, directly targeting the marginal posterior distribution $p(\Theta \mid y_{1:N})$ obtained by marginalization over the distribution of latent variables $z_{1:N}$. The SMC method is used to calculate the marginal likelihood $p(y_{1:N} \mid \Theta)$ [12].
- REPMMH (target: $p(\Theta \mid y_{1:N})$): Mitigates the initial value dependence of the PMMH method by combining the replica exchange method with the PMMH method.

Results
In this section, we show that by employing our proposed replica exchange particle marginal Metropolis-Hastings (REPMMH) method for the Izhikevich neuron model [41,42] and Lévy-driven stochastic volatility model [12,43,44], the marginal posterior distribution of parameters p(Θ | y 1:N ) can be estimated from observations y 1:N , and we verify whether the REPMMH method can overcome the problem of initial value dependence in the particle marginal Metropolis-Hastings (PMMH) method. Moreover, we compare the sampling efficiency of the REPMMH method with that of the conventional methods, the PMMH method and the replica exchange particle Gibbs with ancestor sampling (REPGAS) method.

Izhikevich Neuron Model
To verify the effectiveness of our proposed method, we use the Izhikevich neuron model, a computational model for the membrane potential of a neuron [41,42]:

$$\frac{dv}{dt} = 0.04 v^2 + 5v + 140 - u + I_{\mathrm{ext}} + \xi_v(t), \tag{19}$$

$$\frac{du}{dt} = a (b v - u) + \xi_u(t), \tag{20}$$

where $v$ is the membrane potential and $u$ is the membrane recovery variable. $I_{\mathrm{ext}}$ is the input current from outside the neuron, and $a$ and $b$ are parameters of the Izhikevich neuron model that represent the characteristics of the neuron. In Equations (19) and (20), we consider additive white Gaussian noise terms $\xi_v(t)$ and $\xi_u(t)$ with $\langle \xi_v(t)\, \xi_v(t') \rangle = \sigma_v^2\, \delta(t - t')$ and $\langle \xi_u(t)\, \xi_u(t') \rangle = \sigma_u^2\, \delta(t - t')$, where $\delta(t)$ is the Dirac delta function. Here, the standard deviations of the noise on the membrane potential and the membrane recovery variable are expressed by $\sigma_v$ and $\sigma_u$, respectively. If the membrane potential $v$ exceeds the threshold value $v_{\mathrm{th}} = 30$, the membrane potential $v$ and the membrane recovery variable $u$ are reset to $c$ and $u + d$, respectively:

$$\text{if } v \ge v_{\mathrm{th}}: \quad v \leftarrow c, \quad u \leftarrow u + d,$$

where $c$ and $d$ are parameters representing the characteristics of the neuron.
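For reference, the noisy Izhikevich dynamics with the reset rule can be simulated with a simple Euler-Maruyama scheme. The step size, input current, and initial condition below are illustrative assumptions, not the exact settings used to generate the data in this paper.

```python
import numpy as np

def simulate_izhikevich(a, b, c, d, i_ext, dt, n_steps, sigma_v, sigma_u, rng):
    """Euler-Maruyama integration of the noisy Izhikevich equations with
    the reset rule applied whenever v crosses v_th = 30."""
    v, u = c, b * c                       # a common choice of initial condition
    vs = np.empty(n_steps)
    sqrt_dt = np.sqrt(dt)
    for n in range(n_steps):
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + i_ext
        du = a * (b * v - u)
        v += dv * dt + sigma_v * sqrt_dt * rng.normal()
        u += du * dt + sigma_u * sqrt_dt * rng.normal()
        if v >= 30.0:                     # spike: reset v, increment u
            v = c
            u = u + d
        vs[n] = v
    return vs

rng = np.random.default_rng(1)
v_trace = simulate_izhikevich(a=0.02, b=0.2, c=-65.0, d=6.0, i_ext=10.0,
                              dt=0.1, n_steps=5000,
                              sigma_v=0.5, sigma_u=0.01, rng=rng)
```

Adding Gaussian observation noise to `v_trace` would yield synthetic observations of the kind estimated from below.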
Here, we assume that the observations y_{1:N} are the membrane potentials with Gaussian observation noise, and we estimate the parameters Θ = {a, b, c, d} from only the observations y_{1:N}. We use the true parameters Θ = {a, b, c, d} = {0.02, 0.2, −65, 6} and the number of data N = 5.0 × 10² to generate data. In the system model, the means and the variances of the Gaussian noise are {μ_v, σ_v²} = {0, 0.25} and {μ_u, σ_u²} = {0, 10⁻⁴}. In the observation model, the mean and the variance of the Gaussian noise are {μ_y, σ_y²} = {0, 1}. We show the data generated from the Izhikevich neuron model in Figure 3. In Figure 3, complex spike activities with different inter-spike intervals and different peaks are seen in response to external inputs. We assume that only the one-dimensional time series of observed data y_n and the external inputs can be used, while the latent dynamics are governed by a two-dimensional nonlinear dynamical system with four parameters Θ = {a, b, c, d}. We employ our REPMMH method and the conventional methods, the PMMH method and the REPGAS method, for the generated data in Figure 3.

We show in Figure 5 the estimated posterior distribution of parameters p(Θ | y_{1:N}) obtained by employing the REPMMH method. From Figure 5, we find that the maximum values of the estimated posterior distribution of parameters are located around the true values (a = 0.020, b = 0.20, c = −65, d = 6.0), even though the initial values of the parameters (a = 0.025, b = 0.15, c = −60, d = 5.5) are set to be far from the true values. This improvement in estimation accuracy would be induced by the combination with the replica exchange method: it is easier to obtain samples from a wider range since the replica exchange method allows samples to pass through high temperatures. From these results, we find that the initial value dependence of the PMMH method is improved by employing our proposed method.
In order to investigate the efficiency of parameter sampling in the proposed and existing methods, we show in Figure 7 the autocorrelation functions calculated using the samples of the PMMH method, the REPGAS method, and our proposed REPMMH method. For all the parameters a, b, c, and d, the autocorrelation decays faster for the REPMMH samples than for the PMMH samples and the REPGAS samples. The time constant of the autocorrelation function has a strong influence on the convergence time of the PMCMC method. The time constants of the autocorrelation functions for the REPMMH samples are around 20 for all parameters a, b, c, and d, while those for the PMMH samples are more than 10⁵, as shown in Figure 7. Since the computational cost of the exchange process in the REPMMH method is very small compared to that of the SMC method, the computational cost of the REPMMH method is approximately R = 64 times that of the PMMH method. Nevertheless, the REPMMH method drastically improves the sampling efficiency relative to this increase in computational cost: the REPMMH method is R = 64 times more computationally expensive than the PMMH method, while the effective sample size of the REPMMH method is much larger (around 10³ times larger) than that of the PMMH method.
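The autocorrelation function and an effective-sample-size estimate of the kind compared here can be computed from a chain of samples as follows. This is a standard sketch; the truncation of the sum at the first negative autocorrelation is one common convention and not necessarily the one used in this paper.

```python
import numpy as np

def autocorr(x):
    """Normalized autocorrelation function rho_k of a 1-D chain."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    acf = np.correlate(x, x, mode="full")[n - 1:]
    return acf / acf[0]

def effective_sample_size(x):
    """ESS = K / (1 + 2 * sum_k rho_k), truncating the sum at the
    first negative autocorrelation (a simple, common convention)."""
    rho = autocorr(x)[1:]
    neg = np.nonzero(rho < 0)[0]
    cut = neg[0] if len(neg) else len(rho)
    return len(x) / (1.0 + 2.0 * rho[:cut].sum())

rng = np.random.default_rng(0)
iid = rng.normal(size=4000)            # uncorrelated chain
ar = np.empty(4000)                    # strongly correlated AR(1) chain
ar[0] = 0.0
for n in range(1, 4000):
    ar[n] = 0.95 * ar[n - 1] + rng.normal()
ess_iid = effective_sample_size(iid)
ess_ar = effective_sample_size(ar)     # far smaller than ess_iid
```

A slowly decaying autocorrelation, as reported for the PMMH samples, directly shrinks the effective sample size through the sum in the denominator.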
When the same numbers of temperatures and particles are used, the REPGAS method is more computationally expensive than the REPMMH method, since the REPGAS method requires sampling of the latent variables z_{1:N} together with ancestor sampling, which considers sampling of the latent variables not only in the forward direction but also in the backward direction in the conditional SMC method. Nevertheless, the REPMMH method has higher sampling efficiency than the REPGAS method. Thus, we find that the sampling efficiency of our proposed REPMMH method is higher than that of the conventional methods.

We summarize the dependence on the number of temperatures in Table 2, which shows the mode values of the estimated distributions, the standard deviations (Std) of the estimated distributions, and the values of the autocorrelation functions (ACF) at lag 30 for the numbers of temperatures R = 1, 4, 16, and 64. Note that the number of particles is M = 50 in all cases, and the maximum temperature is fixed at T^(R) = 1.1⁶³ for R > 1.
As mentioned above, for the number of temperatures R = 1, we can estimate the parameter d around its true value d = 6.0, while the other three parameters a, b, and c remain at their initial values (a = 0.025, b = 0.15, c = −60). We also find that the samples of parameters a, b, and c cannot move sufficiently far from their initial values, since the values of the standard deviations are very small. For the number of temperatures R = 4, the samples can escape from local optima thanks to the high temperatures, and all parameters a, b, c, and d are estimated accurately around their true values. However, since the values of the autocorrelation functions are close to 1.0, a large number of samples is needed to estimate the shape of the distribution p(Θ | y_{1:N}). On the other hand, we find that the values of the autocorrelation functions are smaller for R = 16 and 64.
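The fixed maximum temperature T^(R) = 1.1⁶³ for R = 64 is consistent with a geometric temperature ladder, a common choice in replica exchange. The sketch below assumes that ladder form, which is inferred from the stated maximum temperature rather than given explicitly.

```python
import numpy as np

def geometric_ladder(n_temps, ratio=1.1):
    """Geometric temperature ladder T^(r) = ratio**(r - 1), r = 1, ..., R,
    so that T^(1) = 1.0 and T^(R) = ratio**(R - 1)."""
    return ratio ** np.arange(n_temps)

temps = geometric_ladder(64)   # T^(64) = 1.1**63, matching the setting above
```

Geometric spacing keeps the ratio of neighbouring temperatures constant, which tends to keep the exchange acceptance rate roughly uniform across the ladder.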

Lévy-Driven Stochastic Volatility Model
Next, we also verify the effectiveness of the proposed method using the Lévy-driven stochastic volatility model [12,43,44]. In this model, the dynamics of the logarithm of the asset price $y^*(t)$ are represented by the following stochastic differential equation:

$$dy^*(t) = \left\{\mu + \beta \sigma^2(t)\right\} dt + \sigma(t)\, dB(t), \tag{21}$$

where $\mu$ is the drift parameter and $\beta$ is the risk premium. $B(t)$ is Brownian motion and $\sigma^2(t)$ represents the volatility. The dynamics of the volatility $\sigma^2(t)$ are modeled by the following Lévy-driven Ornstein-Uhlenbeck process:

$$d\sigma^2(t) = -\lambda \sigma^2(t)\, dt + dz(\lambda t), \tag{22}$$

where $\lambda$ is a positive constant and $z(t)$ is a non-Gaussian Lévy process with positive increments. The observation at time step $n$, $y_n$, in this model is obtained from the following Gaussian distribution:

$$y_n \sim \mathcal{N}\!\left(\mu \Delta + \beta \sigma_n^2,\; \sigma_n^2\right), \tag{23}$$

where $\Delta$ is the length of the time interval.
Stochastic volatility models are numerically investigated by using discretized dynamical models [12,43,44], and estimation algorithms for the parameters of stochastic volatility models have been investigated using such discretized models [12]. The integrated volatility $\sigma_n^2$ at time step $n$ is calculated as follows:

$$\sigma_n^2 = \frac{1}{\lambda}\left[ z(\lambda n \Delta) - z\{\lambda (n-1) \Delta\} - \sigma^2(n\Delta) + \sigma^2\{(n-1)\Delta\} \right],$$

where $\sigma^2(n\Delta)$ and $z(\lambda n \Delta)$ are, respectively, represented as follows:

$$\sigma^2(n\Delta) = e^{-\lambda \Delta}\, \sigma^2\{(n-1)\Delta\} + \eta_{\sigma,n},$$

$$z(\lambda n \Delta) = z\{\lambda (n-1)\Delta\} + \eta_{z,n},$$

where $\eta_{\sigma,n}$ and $\eta_{z,n}$ are the increments driven by the Lévy process over the interval.
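The discretized volatility recursion can be simulated once the Lévy increments are specified. The sketch below uses a hypothetical compound-Poisson driver with exponential jumps and places each interval's jumps undiscounted at the interval end (a simplification of the exact within-interval weighting); the paper's specification with parameters κ, δ, γ would replace the increment generator.

```python
import numpy as np

def simulate_volatility(lam, jump_rate, jump_mean, delta_t, n_steps, rng):
    """Simulate the discretized Levy-driven OU volatility with a
    hypothetical compound-Poisson driver (exponential jumps)."""
    def levy_increment():
        # eta_{z,n}: increment of z over an interval of length lam * delta_t
        k = rng.poisson(jump_rate * lam * delta_t)
        return rng.exponential(jump_mean, size=k).sum()

    sigma2 = jump_rate * jump_mean       # roughly the stationary mean
    sigma2_path = np.empty(n_steps)      # sigma^2(n * delta_t)
    int_vol = np.empty(n_steps)          # integrated volatility sigma_n^2
    for n in range(n_steps):
        eta_z = levy_increment()
        sigma2_new = np.exp(-lam * delta_t) * sigma2 + eta_z
        # sigma_n^2 = {increment of z - change of sigma^2} / lam
        int_vol[n] = (eta_z - (sigma2_new - sigma2)) / lam
        sigma2 = sigma2_new
        sigma2_path[n] = sigma2
    return sigma2_path, int_vol

rng = np.random.default_rng(2)
sigma2_path, int_vol = simulate_volatility(lam=0.1, jump_rate=1.0,
                                           jump_mean=0.5, delta_t=1.0,
                                           n_steps=400, rng=rng)
```

Feeding `int_vol` into the Gaussian observation density above yields synthetic log returns; note that the integrated volatility stays non-negative by construction, as the Lévy process has positive increments.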
In this paper, we employ the proposed method and the PMMH method for the stochastic volatility model. The PMMH-based methods, including the proposed method, can be applied to complex models such as the Lévy-driven stochastic volatility model as long as the probability density of the observation model can be calculated. On the other hand, the PG-based methods are difficult to apply to the stochastic volatility model because they require calculating the probability density of the system model [12]. Following [12], we use the true parameters Θ = {κ, δ, γ, λ} = {0.5, 1.41, 2.83, 0.1}, the number of data N = 4.0 × 10², and a time interval of length ∆ = 1.0 to generate data. In order to estimate the parameters Θ, we use the initial values of the parameters Θ[0] = {0.25, 7.41, 9.83, 1.5}, the number of samples K = 1.5 × 10⁵, the number of burn-in samples K_burn-in = 10⁵, and the number of particles M = 200 in both the proposed REPMMH method and the PMMH method. In the REPMMH method, the number of temperatures R is 64.
We show in Figure 8 the estimated posterior distribution of parameters p(Θ | y_{1:N}) obtained by employing the PMMH method. From Figure 8, we find that a peak of the estimated posterior distribution of the parameter λ, p(λ | y_{1:N}), is located around its true value λ = 0.1. However, the maximum values of the estimated posterior distributions of the other three parameters, κ, δ, and γ, are far from their true values (κ = 0.5, δ = 1.41, γ = 2.83). Thus, the joint posterior distribution of the four parameters is not adequately estimated. It is considered that the target distribution is not reached with a small number of samples because the sampling efficiency of the PMMH method is low. We show in Figure 9 the estimated posterior distributions of parameters p(Θ | y_{1:N}) obtained by employing the REPMMH method. From Figure 9, we find that the true values of the parameters Θ = {κ, δ, γ, λ} = {0.5, 1.41, 2.83, 0.1} are estimated appropriately by using the same number of samples and the same initial values Θ[0] = {0.25, 7.41, 9.83, 1.5} as the PMMH method. The results in Figures 8 and 9 show that our REPMMH method has higher sampling efficiency than the PMMH method.
Moreover, we show in Figure 10 the autocorrelation functions calculated using the samples of the PMMH method and the REPMMH method. For all parameters κ, δ, γ, and λ, the autocorrelation decays faster for the REPMMH samples than for the PMMH samples. As shown in Figure 10, the time constants of the autocorrelation functions for the REPMMH samples are less than 15 for all parameters κ, δ, γ, and λ, while the time constant of the autocorrelation function for the PMMH samples for the parameter γ is more than 3.0 × 10³. As mentioned above, since the computational cost of the REPMMH method is approximately R = 64 times that of the PMMH method, the REPMMH method improves the sampling efficiency relative to the increase in computational cost.

Concluding Remarks
In this paper, we have proposed the replica exchange particle marginal Metropolis-Hastings (REPMMH) method in order to estimate the marginal posterior distribution of parameters p(Θ | y_{1:N}) of a state space model. Our proposed method can be applied to complex models such as the Lévy-driven stochastic volatility model, even if the probability densities of the system models cannot be calculated explicitly. In the proposed method, we introduce exchanges between samples of model parameters Θ at different temperatures and realize an efficient sampling method for the model parameters governing nonlinear dynamical systems.
Using nonlinear dynamical models such as the Izhikevich neuron model and the Lévy-driven stochastic volatility model, we have shown that our proposed REPMMH method can mitigate the initial value dependence of the particle marginal Metropolis-Hastings (PMMH) method. The results have shown that the proposed REPMMH method accurately estimates the marginal posterior distribution of parameters. Moreover, by comparing the autocorrelation functions of the obtained samples, it has also been shown that our proposed REPMMH method can sample more efficiently than the conventional methods. In the replica exchange particle Gibbs with ancestor sampling (REPGAS) method, the next sample of latent variables is obtained under the strong influence of the current sample of latent variables. In the REPMMH method, on the other hand, the correlation of the latent variables between the current and next steps is low, since the REPMMH method only calculates the marginal likelihood at the next step, regardless of the latent variables obtained at the current step. Therefore, it is considered that the REPMMH method can sample parameters more efficiently than the REPGAS method.
Although we conducted the experiments using two specific dynamical models, the Izhikevich neuron model and the Lévy-driven stochastic volatility model, the proposed REPMMH method can be applied to various dynamical systems described by ordinary or partial differential equations. While the proposed method can be applied to any ordinary or partial differential equations that can be represented as state space models, applying the proposed method is difficult when the system models for the dynamical systems or the observation models are completely unknown. In such cases, we consider that combining the proposed method with non-parametric Bayesian methods would be effective. We leave this for future work.

Conflicts of Interest:
The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Abbreviations
The following abbreviations are used in this manuscript: