# Optimization of the Mixture Transition Distribution Model Using the March Package for R

## Abstract

## 1. Introduction

## 2. MTD Model and Its Optimization

## 3. Main Features of the march Package

- Two members of the current population are randomly selected, and their probabilities of selection are proportional to their fitness values.
- A random crossover occurs between the two selected members of the population with a user-selectable probability (default: 50%). This operation produces two child solutions.
- A mutation is applied to the two children by adding Gaussian-distributed noise to each parameter with a user-selectable probability (default: 5%).
- The two children are improved using the GEM algorithm, with a user-selectable number of iterations (default: 2).
- The fitness of each child is computed.
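
The five steps above can be sketched in base R as follows. This is only a toy illustration under stated assumptions, not the code of the march package: the `fitness` function, the `improve` placeholder standing in for the GEM step, the one-point crossover, and the names `ea_step`, `p_cross`, `p_mut`, and `sd_mut` are all hypothetical. The sketch also ignores the constraints that real MTD parameters (probabilities) must satisfy.

```r
# Toy illustration of one EA generation; NOT the march implementation.
set.seed(1)

# Hypothetical fitness (higher is better), maximal at theta = c(0.3, 0.7).
fitness <- function(theta) exp(-sum((theta - c(0.3, 0.7))^2))

# Placeholder for the GEM improvement step (assumption): a few small random
# moves that are kept only when they increase the fitness.
improve <- function(theta, n_iter = 2) {
  for (i in seq_len(n_iter)) {
    candidate <- theta + rnorm(length(theta), sd = 0.01)
    if (fitness(candidate) > fitness(theta)) theta <- candidate
  }
  theta
}

ea_step <- function(pop, p_cross = 0.5, p_mut = 0.05, sd_mut = 0.1) {
  fit <- apply(pop, 1, fitness)
  # 1. Select two parents with probability proportional to fitness.
  idx <- sample(nrow(pop), size = 2, prob = fit / sum(fit))
  children <- pop[idx, , drop = FALSE]
  # 2. With probability p_cross, swap the parameters after a random cut
  #    (one-point crossover, chosen here only for illustration).
  if (runif(1) < p_cross && ncol(pop) > 1) {
    cut <- sample(ncol(pop) - 1, 1)
    tail_cols <- (cut + 1):ncol(pop)
    tmp <- children[1, tail_cols]
    children[1, tail_cols] <- children[2, tail_cols]
    children[2, tail_cols] <- tmp
  }
  # 3. Add Gaussian noise to each parameter with probability p_mut.
  mutate <- matrix(runif(length(children)) < p_mut, nrow = 2)
  if (any(mutate)) {
    children[mutate] <- children[mutate] + rnorm(sum(mutate), sd = sd_mut)
  }
  # 4. Improve each child with a few local iterations.
  children <- t(apply(children, 1, improve))
  # 5. Compute the fitness of each child.
  list(children = children, fitness = apply(children, 1, fitness))
}

pop <- matrix(runif(20), ncol = 2)  # 10 candidate solutions, 2 parameters each
res <- ea_step(pop)
```

Calling `ea_step()` repeatedly and inserting the children back into the population would give a complete, if simplistic, evolutionary loop.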

- The EA algorithm is used to identify a candidate solution in the attraction basin of the global optimum.
- Starting from this candidate solution, the GEM algorithm is used to optimize the model until reaching the global optimum.
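
The two-stage logic can be illustrated on a toy problem. The sketch below is an assumption-laden stand-in rather than the package's procedure: `loglik` is a hypothetical one-parameter objective with a local peak near 0.2 and a global peak near 0.8, a random population replaces the EA, and `optim()` with the L-BFGS-B method replaces the GEM algorithm.

```r
set.seed(42)

# Hypothetical multimodal objective on [0, 1] (higher is better):
# a local peak of height 0.6 near 0.2 and a global peak of height 1 near 0.8.
loglik <- function(theta) {
  0.6 * exp(-((theta - 0.2) / 0.15)^2) + exp(-((theta - 0.8) / 0.15)^2)
}

# Stage 1 (stand-in for the EA): evaluate a random population and keep the
# fittest member as a candidate lying in the basin of the global optimum.
candidates <- runif(50)
start <- candidates[which.max(loglik(candidates))]

# Stage 2 (stand-in for GEM): local optimization started from that candidate.
# optim() minimizes, so the objective is negated.
fit <- optim(start, function(th) -loglik(th),
             method = "L-BFGS-B", lower = 0, upper = 1)

fit$par  # refined estimate, close to the global optimum at 0.8
```

The global stage only needs to land in the right basin; the local stage then supplies the precision that a stochastic search reaches slowly.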

## 4. Combining Estimation Algorithms for the MTD Model

- Optimize the MTD model of interest using the HC algorithm with standard parameter values. If the model is sufficiently simple and is likely to have a single global optimum and no local optima, this algorithm will find the optimum very quickly. Otherwise, it can still give a good idea of the structure of the solution.
- Use the EA algorithm to perform a more complete search of the solution space to identify possible solutions that could have been missed by the HC algorithm because they do not belong to the same basin of attraction.
- Optimize the best solution found by the EA algorithm using the HC algorithm, either within the GEM algorithm or independently of it.
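
To make the role of a hill-climbing step more concrete, here is a generic sketch of hill-climbing on a probability vector. It is not the HC algorithm implemented in march. The multinomial objective, the pairwise transfer of probability mass, the step-halving rule, and the names `hill_climb`, `delta`, and `tol` are all assumptions made for illustration.

```r
# Generic hill-climbing on a probability vector (illustrative only).
# Hypothetical objective: multinomial log-likelihood of observed counts.
counts <- c(50, 30, 20)
loglik <- function(p) sum(counts * log(p))

hill_climb <- function(p, delta = 0.1, tol = 1e-6, max_iter = 1000) {
  best <- loglik(p)
  for (iter in seq_len(max_iter)) {
    improved <- FALSE
    # Try moving `delta` of probability mass from element j to element i,
    # so that the vector always sums to one.
    for (i in seq_along(p)) for (j in seq_along(p)) {
      if (i == j || p[j] - delta <= 0) next
      q <- p
      q[i] <- q[i] + delta
      q[j] <- q[j] - delta
      val <- loglik(q)
      if (val > best) {
        p <- q
        best <- val
        improved <- TRUE
      }
    }
    if (!improved) delta <- delta / 2  # no move helped: refine the step size
    if (delta < tol) break
  }
  list(par = p, value = best)
}

fit <- hill_climb(rep(1 / 3, 3))
round(fit$par, 3)  # close to the observed proportions counts / sum(counts)
```

On a concave objective such as this one, the search reaches the single optimum from any starting point. On a multimodal likelihood the same procedure stops at whichever optimum dominates the starting basin, which is why the strategy above pairs it with a global search.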

## 5. Example

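```r
# Model.1: second-order MTDg model with two covariates, optimized by hill-climbing (HC)
set.seed(1234)
Model.1 <- march.mtd.construct(Employment.2, order = 2, MCovar = c(1, 1), init = "best",
                               mtdg = TRUE, llStop = 0.0001, maxIter = 1000, maxOrder = 2)
print(Model.1)
march.BIC(Model.1)
```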

```r
# Model.2: same model optimized with GEM only (one generation, population of one)
set.seed(1234)
Model.2 <- march.dcmm.construct(Employment.2, orderHC = 1, orderVC = 2, M = 1,
                                gen = 1, popSize = 1, iterBw = 100, stopBw = 0.01,
                                CMCovar = c(1, 1), Cmodel = "mtdg", maxOrder = 2)
print(Model.2)
march.BIC(Model.2)
```

```r
# Model.3: same model optimized with the EA only (iterBw = 0)
set.seed(1234)
Model.3 <- march.dcmm.construct(Employment.2, orderHC = 1, orderVC = 2, M = 1,
                                gen = 20, popSize = 20, pMut = 0.05, pCross = 0.5, iterBw = 0,
                                CMCovar = c(1, 1), Cmodel = "mtdg", maxOrder = 2)
print(Model.3)
march.BIC(Model.3)
```

```r
# Model.3a: HC optimization started from the EA solution (seedModel = Model.3)
set.seed(1234)
Model.3a <- march.mtd.construct(Employment.2, order = 2, MCovar = c(1, 1),
                                mtdg = TRUE, llStop = 0.0001, maxIter = 1000, maxOrder = 2,
                                seedModel = Model.3)
print(Model.3a)
march.BIC(Model.3a)
```

```r
# Model.3b: GEM optimization started from the EA solution (seedModel = Model.3)
set.seed(1234)
Model.3b <- march.dcmm.construct(Employment.2, orderHC = 1, orderVC = 2, M = 1,
                                 gen = 1, popSize = 1, iterBw = 100, stopBw = 0.01,
                                 CMCovar = c(1, 1), Cmodel = "mtdg", maxOrder = 2,
                                 seedModel = Model.3)
print(Model.3b)
march.BIC(Model.3b)
```

```r
# Model.4: EA combined with GEM steps (iterBw = 2)
set.seed(1234)
Model.4 <- march.dcmm.construct(Employment.2, orderHC = 1, orderVC = 2, M = 1,
                                gen = 20, popSize = 20, pMut = 0.05, pCross = 0.5, iterBw = 2,
                                CMCovar = c(1, 1), Cmodel = "mtdg", maxOrder = 2)
print(Model.4)
march.BIC(Model.4)
```

## 6. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A. R Code for the Computation of Other Markovian Models

#### Appendix A.1. Independence Model

```r
Model.indep <- march.indep.construct(Employment.2, maxOrder = 2)
print(Model.indep)
march.BIC(Model.indep)
```

#### Appendix A.2. Homogeneous Markov Chains

```r
Model.mc1 <- march.mc.construct(Employment.2, order = 1, maxOrder = 2)
print(Model.mc1)
march.BIC(Model.mc1)

Model.mc2 <- march.mc.construct(Employment.2, order = 2, maxOrder = 2)
print(Model.mc2)
march.BIC(Model.mc2)
```

#### Appendix A.3. MTD Models without Covariates

```r
set.seed(1234)
Model.mtd2 <- march.mtd.construct(Employment.2, order = 2, maxOrder = 2,
                                  llStop = 0.0001, maxIter = 1000, init = "best")
print(Model.mtd2)
march.BIC(Model.mtd2)

set.seed(1234)
Model.mtdg2 <- march.mtd.construct(Employment.2, order = 2, mtdg = TRUE, maxOrder = 2,
                                   llStop = 0.0001, maxIter = 1000, init = "best")
print(Model.mtdg2)
march.BIC(Model.mtdg2)
```

#### Appendix A.4. Hidden Markov Models

```r
set.seed(1234)
Model.hmm2 <- march.dcmm.construct(Employment.2, orderHC = 1, orderVC = 0, M = 2, gen = 1,
                                   popSize = 1, iterBw = 100, stopBw = 0.01, maxOrder = 2)
print(Model.hmm2)
march.BIC(Model.hmm2)

set.seed(1234)
Model.hmm3 <- march.dcmm.construct(Employment.2, orderHC = 1, orderVC = 0, M = 3, gen = 1,
                                   popSize = 1, iterBw = 100, stopBw = 0.01, maxOrder = 2)
print(Model.hmm3)
march.BIC(Model.hmm3)
```

#### Appendix A.5. Double Chain Markov Models

```r
set.seed(1234)
Model.dcmm2 <- march.dcmm.construct(Employment.2, orderHC = 1, orderVC = 1, M = 2, gen = 1,
                                    popSize = 1, iterBw = 100, stopBw = 0.01,
                                    maxOrder = 2)
print(Model.dcmm2)
march.BIC(Model.dcmm2)
```

**Figure 1.** Solution space of a second-order MTDg model depending on two parameters, $\theta_1$ and $\theta_2$.

**Table 1.** Main characteristics of the different Markovian models computed on the Employment.2 dataset. The first part of the table summarizes the second-order MTDg models with two covariates obtained from different optimization procedures, and the second part of the table summarizes the other Markovian models appearing in Appendix A. Regarding the different optimization procedures, HC means hill-climbing, GEM means general expectation-maximization algorithm, EA means evolutionary algorithm, and exact indicates that there is an exact solution to the maximization of the log-likelihood.

Model Name | Model Structure | Optimization Procedure | Number of Parameters | Log-Likelihood | BIC |
---|---|---|---|---|---|
Model.1 | MTDg2 + cov | HC | 10 | −1813.737 | 3718.846 |
Model.2 | MTDg2 + cov | GEM | 10 | −1813.782 | 3718.935 |
Model.3 | MTDg2 + cov | EA | 6 | −1895.900 | 3846.624 |
Model.3a | MTDg2 + cov | EA + HC | 6 | −1818.598 | 3692.019 |
Model.3b | MTDg2 + cov | EA + GEM | 8 | −1813.955 | 3701.008 |
Model.4 | MTDg2 + cov | EA/GEM | 11 | −1813.735 | 3727.979 |
Model.indep | Independence | exact | 1 | −6433.625 | 12876.390 |
Model.mc1 | MC1 | exact | 2 | −2064.818 | 4147.910 |
Model.mc2 | MC2 | exact | 4 | −2055.297 | 4147.193 |
Model.mtd2 | MTD2 | HC | 3 | −2057.873 | 4143.159 |
Model.mtdg2 | MTDg2 | HC | 4 | −2057.873 | 4152.296 |
Model.hmm2 | HMM2 | GEM | 5 | −2288.941 | 4623.568 |
Model.hmm3 | HMM3 | GEM | 11 | −2265.008 | 4630.526 |
Model.dcmm2 | DCMM2 | GEM | 7 | −2063.297 | 4190.555 |
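
The trade-off between fit and parsimony in Table 1 can be checked with a few lines of base R. The sketch below copies the values of the six MTDg models with covariates from the table and assumes the usual definition BIC = −2 × log-likelihood + k × log(n), where k is the number of parameters and n the number of observations used. Whether `march.BIC` uses exactly this definition is an assumption here. The column names and the `implied_log_n` check are introduced only for illustration.

```r
# Values copied from Table 1 (second-order MTDg models with two covariates).
models <- data.frame(
  name   = c("Model.1", "Model.2", "Model.3", "Model.3a", "Model.3b", "Model.4"),
  k      = c(10, 10, 6, 6, 8, 11),
  loglik = c(-1813.737, -1813.782, -1895.900, -1818.598, -1813.955, -1813.735),
  bic    = c(3718.846, 3718.935, 3846.624, 3692.019, 3701.008, 3727.979),
  stringsAsFactors = FALSE
)

# Under the assumed definition, the penalty per parameter (bic + 2 * loglik) / k
# equals log(n) and should therefore be the same for every model.
models$implied_log_n <- (models$bic + 2 * models$loglik) / models$k

# Rank the models: the smallest BIC is preferred.
ranked <- models[order(models$bic), ]
best <- ranked$name[1]
print(ranked)
```

The implied penalty is nearly identical across the six rows, which is consistent with the assumed formula. The ranking also shows why Model.3a has the lowest BIC despite a slightly lower log-likelihood than Model.1: it uses six parameters instead of ten.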

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Berchtold, A.; Maitre, O.; Emery, K.
Optimization of the Mixture Transition Distribution Model Using the March Package for R. *Symmetry* **2020**, *12*, 2031.
https://doi.org/10.3390/sym12122031