# A Mixture Hidden Markov Model to Mine Students’ University Curricula

## Abstract

## 1. Introduction

## 2. Data Description

## 3. Mixture Hidden Markov Models for Sequence Data

The model can be estimated with the R package `seqHMM` [38,39]; details on the related algorithms are provided in [40]. To avoid the local maxima that are typical of mixture models, we suggest repeating the estimation process several times with random starting values and retaining the solution with the highest log-likelihood.
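The multi-start strategy can be illustrated with a minimal Python sketch. It is not the `seqHMM` implementation: for brevity it fits a plain latent class (Bernoulli mixture) model rather than an MHM model, and the function names, the toy data, and the tuning values (`n_iter`, `n_starts`, the range of the random starting values) are assumptions made for illustration only.

```python
import math
import random


def e_step(data, weights, probs):
    """Return the log-likelihood and the posterior class probabilities."""
    loglik, post = 0.0, []
    for row in data:
        joint = []
        for w, p in zip(weights, probs):
            lik = w
            for y, pj in zip(row, p):
                lik *= pj if y == 1 else 1.0 - pj
            joint.append(lik)
        total = sum(joint)
        loglik += math.log(total)
        post.append([v / total for v in joint])
    return loglik, post


def em_from_random_start(data, n_classes, rng, n_iter=100, eps=1e-6):
    """Run EM once, starting from randomly drawn item probabilities."""
    n_items = len(data[0])
    weights = [1.0 / n_classes] * n_classes
    probs = [[rng.uniform(0.2, 0.8) for _ in range(n_items)]
             for _ in range(n_classes)]
    for _ in range(n_iter):
        _, post = e_step(data, weights, probs)
        for k in range(n_classes):
            n_k = sum(r[k] for r in post)
            weights[k] = n_k / len(data)
            for j in range(n_items):
                num = sum(r[k] * row[j] for r, row in zip(post, data))
                # Clip so that the likelihood stays strictly positive.
                probs[k][j] = min(max(num / max(n_k, 1e-12), eps), 1.0 - eps)
    loglik, _ = e_step(data, weights, probs)
    return loglik, weights, probs


def fit_with_restarts(data, n_classes, n_starts=10, seed=0):
    """Repeat EM from several random starts and keep the best solution."""
    rng = random.Random(seed)
    runs = [em_from_random_start(data, n_classes, rng) for _ in range(n_starts)]
    best = max(runs, key=lambda run: run[0])
    return best, [run[0] for run in runs]


# Toy binary data (hypothetical): two groups with different response rates.
gen = random.Random(1)
toy = [[int(gen.random() < p) for p in (0.9, 0.8, 0.9, 0.7)] for _ in range(60)]
toy += [[int(gen.random() < p) for p in (0.2, 0.1, 0.3, 0.2)] for _ in range(40)]

(best_loglik, best_weights, _), all_logliks = fit_with_restarts(toy, 2, n_starts=5)
print("log-likelihood per start:", [round(v, 3) for v in all_logliks])
print("best:", round(best_loglik, 3), "weights:", [round(w, 3) for w in best_weights])
```

Comparing the log-likelihoods reached from the different starts gives a rough indication of how sensitive the solution is to the starting values: if several starts reach the same maximum, that solution is less likely to be a poor local optimum.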

## 4. Analysis of Student Paths

The analysis is carried out with the R package `seqHMM`. We first illustrate and discuss the results for the HM model (Section 4.2) and then extend the analysis to the MHM model with covariates (Sections 4.3 and 4.4). The discussion focuses on the differences among courses in the tendency to postpone the final tests and on the interpretation of the latent structure of the student population.

#### 4.1. Model Specification

#### 4.2. Hidden Markov Model

#### 4.3. Mixture Hidden Markov Model

#### 4.4. Effect of Concomitant Variables

## 5. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## Abbreviations

Abbreviation | Definition |
---|---|
AIC | Akaike Information Criterion |
BIC | Bayesian Information Criterion |
EM | Expectation-Maximization |
HS | High school |
HM | Hidden Markov |
MHM | Mixture Hidden Markov |

## Appendix A

Symbol | Description |
---|---|
$i=1,\dots ,n$ | individual (student) |
$j=1,\dots ,J$ | channel (exam) |
$t=1,\dots ,{T}_{i}$ | time occasion |
$h=1,\dots ,H$ | discrete state space |
$k=1,\dots ,K$ | latent class |
${y}_{itj}$ | observed binary state for student $i$ at time $t$ on exam $j$ |
${y}_{ij}={({y}_{i1j},\dots ,{y}_{itj},\dots ,{y}_{i{T}_{i}j})}^{\prime}$ | sequence of observed binary states for student $i$ on exam $j$ |
${x}_{i}$ | vector of time-constant characteristics for student $i$ |
${U}_{it}$ | discrete time-dependent latent variable |
${V}_{i}$ | discrete time-constant latent variable |
${u}_{t}$ | hidden state of ${U}_{it}$ at time $t$ |
${v}_{k}$ | support point of ${V}_{i}$ for latent class $k$ |
$p({U}_{i1}={u}_{1})$ | initial probability of starting from hidden state ${u}_{1}$ |
$p({U}_{it}={u}_{t}\mid {U}_{i,t-1}={u}_{t-1})$ | transition probability of moving from hidden state ${u}_{t-1}$ to hidden state ${u}_{t}$ |
${\pi}_{k}=p({V}_{i}={v}_{k})$ | mass probability (or weight) of ${v}_{k}$ |
${\pi}_{k}({x}_{i})=p({V}_{i}={v}_{k}\mid {X}_{i}={x}_{i})$ | subject-specific mass probability (or weight) of ${v}_{k}$ |
$p({y}_{itj}\mid {U}_{it}={u}_{t})$ | conditional probability of observed state given hidden state |
$p({y}_{ij}\mid {U}_{i}=u,{V}_{i}={v}_{k})$ | conditional probability of sequence of observed states given hidden states and latent class |
${\beta}_{0},{\beta}_{1}$ | regression coefficients |
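To show how the quantities in this table combine, the following Python sketch computes the likelihood of one student's multichannel sequence with the forward recursion and then mixes the class-specific likelihoods with the weights ${\pi}_{k}$. It is a sketch under stated assumptions, not the `seqHMM` code: the channels are assumed to be conditionally independent given the hidden state, emission probabilities are taken to be probabilities of observing state 1, and all function names and parameter values are hypothetical.

```python
def forward_likelihood(seq, init, trans, emit):
    """Likelihood of one multichannel binary sequence under an HMM.

    seq[t][j]   : observed binary state y_tj
    init[h]     : initial probability of hidden state h
    trans[g][h] : transition probability from hidden state g to h
    emit[h][j]  : probability that y_tj = 1 given hidden state h
                  (channels assumed independent given the state)
    """
    n_states = len(init)

    def obs_prob(h, obs):
        prob = 1.0
        for j, y in enumerate(obs):
            prob *= emit[h][j] if y == 1 else 1.0 - emit[h][j]
        return prob

    # alpha[h] = p(y_1, ..., y_t, U_t = h), updated one time occasion at a time.
    alpha = [init[h] * obs_prob(h, seq[0]) for h in range(n_states)]
    for obs in seq[1:]:
        alpha = [
            sum(alpha[g] * trans[g][h] for g in range(n_states)) * obs_prob(h, obs)
            for h in range(n_states)
        ]
    return sum(alpha)


def mixture_likelihood(seq, weights, class_params):
    """Sum over latent classes of pi_k times the class-specific likelihood."""
    return sum(w * forward_likelihood(seq, *params)
               for w, params in zip(weights, class_params))


# Hypothetical parameters: 2 hidden states, 2 exams, 2 latent classes.
class_a = ([0.9, 0.1], [[0.95, 0.05], [0.0, 1.0]], [[0.3, 0.1], [0.9, 0.6]])
class_b = ([0.9, 0.1], [[0.90, 0.10], [0.0, 1.0]], [[0.6, 0.4], [1.0, 0.9]])
seq = [(0, 0), (1, 0), (1, 0), (1, 1)]  # y_tj for t = 1..4 and j = 1, 2
print(mixture_likelihood(seq, [0.6, 0.4], [class_a, class_b]))
```

For long sequences the recursion is usually run on the log scale or with rescaling at each step to avoid numerical underflow; the plain version is kept here for readability.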

## References

1. Bakhshinategh, B.; Zaiane, O.R.; ElAtia, S.; Ipperciel, D. Educational data mining applications and tasks: A survey of the last 10 years. Educ. Inf. Technol. **2018**, 23, 537–553.
2. Romero, C.; Ventura, S. Educational data mining: A review of the state of the art. IEEE Trans. Syst. Man Cybern. **2010**, 40, 601–618.
3. Peña-Ayala, A. Educational data mining: A survey and a data mining-based analysis of recent works. Expert Syst. Appl. **2014**, 41, 1432–1462.
4. Daniel, B.K. Big data and data science: A critical review of issues for educational research. Br. J. Educ. Technol. **2019**, 50, 101–113.
5. Romero, C.; Ventura, S. Educational data mining and learning analytics: An updated survey. WIREs Data Min. Knowl. Discov. **2020**, 10, e1355.
6. Campagni, R.; Merlini, D.; Sprugnoli, R.; Verri, M. Data mining models for student careers. Expert Syst. Appl. **2015**, 42, 5508–5521.
7. Bacci, S.; Bartolucci, F.; Grilli, L.; Rampichini, C. Evaluation of student performance through a multidimensional finite mixture IRT model. Multivar. Behav. Res. **2017**, 52, 732–746.
8. Campagni, R.; Merlini, D.; Verri, M. The Influence of First Year Behaviour in the Progressions of University Students. In Computers Supported Education. CSEDU 2017. Communications in Computer and Information Science; Escudeiro, P., Costagliola, G., Zvacek, S., Uhomoibhi, J., McLaren, B., Eds.; Springer: Berlin/Heidelberg, Germany, 2018; Volume 865, pp. 343–362.
9. Berens, J.; Schneider, K.; Gortz, S.; Oster, S.; Burghoff, J. Early Detection of Students at Risk. Predicting Student Dropouts Using Administrative Student Data from German Universities and Machine Learning Methods. J. Educ. Data Min. **2019**, 11, 1–41.
10. Pelaez, K.; Levine, R.; Fan, J.; Guarcello, M.; Laumakis, M. Using a Latent Class Forest to Identify At-Risk Students in Higher Education. J. Educ. Data Min. **2019**, 11, 18–46.
11. Wong, J.C.F.; Yip, T.C.Y. Measuring students’ academic performance through educational data mining. Int. J. Inf. Educ. Technol. **2020**, 10, 797–804.
12. Abbott, A.; Tsay, A. Sequence Analysis and Optimal Matching Methods in Sociology. Sociol. Methods Res. **2000**, 29, 3–33.
13. Gauthier, J.A.; Widmer, E.D.; Bucher, P.; Notredame, C. Multichannel sequence analysis applied to social science data. Sociol. Methodol. **2010**, 40, 1–38.
14. Skrondal, A.; Rabe-Hesketh, S. Generalized Latent Variable Modeling. Multilevel, Longitudinal and Structural Equation Models; Chapman and Hall/CRC: London, UK, 2004.
15. Bartholomew, D.J.; Knott, M.; Moustaki, I. Latent Variable Models and Factor Analysis: A Unified Approach; Wiley: Chichester, UK, 2011.
16. McLachlan, G.; Peel, D. Finite Mixture Models; Wiley: New York, NY, USA, 2000.
17. Hancock, G.R.; Samuelsen, K.M. Advances in Latent Variable Mixture Models; Information Age Publishing: Charlotte, NC, USA, 2008.
18. Muthén, B.O.; Shedden, K. Finite mixture modelling with mixture outcomes using the EM algorithm. Biometrics **1999**, 55, 463–469.
19. Muthén, B.O. Latent variable analysis: Growth mixture modelling and related techniques for longitudinal data. In Handbook of Quantitative Methodology for the Social Sciences; Kaplan, D., Ed.; Sage: Newbury Park, CA, USA, 2004; pp. 345–368.
20. Kreuter, F.; Muthén, B.O. Longitudinal modeling of population heterogeneity: Methodological challenges to the analysis of empirically derived criminal trajectory profiles. In Advances in Latent Variable Mixture Models; Hancock, G.R., Samuelsen, K.M., Eds.; Information Age Publishing, Inc.: Charlotte, NC, USA, 2008; pp. 53–75.
21. Zucchini, W.; MacDonald, I.L. Hidden Markov Models for Time Series: An Introduction Using R; Springer: Berlin/Heidelberg, Germany, 2009.
22. Bartolucci, F.; Farcomeni, A.; Pennoni, F. Latent Markov Models for Longitudinal Data; Chapman & Hall/CRC Press: Boca Raton, FL, USA, 2012.
23. Chi, M.; VanLehn, K.; Litman, D.; Jordan, P. Empirically evaluating the application of reinforcement learning to the induction of effective and adaptive pedagogical strategies. User Model. User-Adapt. Interact. **2011**, 21, 137–180.
24. Köck, M.; Paramythis, A. Activity sequence modelling and dynamic clustering for personalized e-learning. User Model. User-Adapt. Interact. **2011**, 21, 51–97.
25. Tsuruta, S.; Knauf, R.; Doht, S.; Kawabe, T.; Sakurai, Y. An intelligent system for modeling and supporting academic educational processes. In Intelligent and Adaptive Educational-Learning Systems: Achievements and Trends, Smart Innovation, Systems and Technologies; Peña-Ayala, A., Ed.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 469–496.
26. Altman, R.J. Mixed hidden Markov models: An extension of the hidden Markov model to the longitudinal data setting. J. Am. Stat. Assoc. **2007**, 102, 201–210.
27. Maruotti, A. Mixed Hidden Markov Models for Longitudinal Data: An Overview. Int. Stat. Rev. **2011**, 79, 427–454.
28. van de Pol, F.; Langheine, R. Mixed Markov Latent Class Models. Sociol. Methodol. **1990**, 20, 213–247.
29. Vermunt, J.K.; Magidson, J. Latent Class Models in Longitudinal Research. In Handbook of Longitudinal Research: Design, Measurement, and Analysis; Menard, S., Ed.; Elsevier: Burlington, MA, USA, 2008; pp. 373–385.
30. Lazarsfeld, P.F.; Henry, N.W. Latent Structure Analysis; Houghton Mifflin: Boston, MA, USA, 1968.
31. Goodman, L.A. Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika **1974**, 61, 215–231.
32. Bandeen-Roche, K.; Miglioretti, D.L.; Zeger, S.L.; Rathouz, P.J. Latent Variable Regression for Multiple Discrete Outcomes. J. Am. Stat. Assoc. **1997**, 92, 1375–1386.
33. Formann, A.K. Mixture analysis of multivariate categorical data with covariates and missing entries. Comput. Stat. Data Anal. **2007**, 51, 5236–5246.
34. Bacci, S.; Bertaccini, B. Finding the best paths in university curricula of graduates to improve academic guidance services. In Book of Short Papers SIS 2018; Abbruzzo, A., Brentari, E., Chiodi, M., Piacentino, D., Eds.; Pearson: London, UK, 2018; pp. 615–622.
35. Akaike, H. Information theory and an extension of the maximum likelihood principle. In Second International Symposium of Information Theory; Petrov, B.N., Csaki, F., Eds.; Akademiai Kiado: Budapest, Hungary, 1973; pp. 267–281.
36. Schwarz, G. Estimating the dimension of a model. Ann. Stat. **1978**, 6, 461–464.
37. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Stat. Soc. Ser. B **1977**, 39, 1–38.
38. Helske, S.; Helske, J. Mixture hidden Markov models for sequence data: The seqHMM package in R. J. Stat. Softw. **2019**, 88, 1–32.
39. Helske, J.; Helske, S. Mixture Hidden Markov Models for Social Sequence Data and Other Multivariate, Multichannel Categorical Time Series; R Package Version 1.2.0; University of Jyväskylä: Jyväskylä, Finland, 2021; Available online: https://cran.r-project.org/package=seqHMM (accessed on 18 February 2021).
40. Helske, S. The Main Algorithms Used in the seqHMM Package. 2019. Available online: https://cran.r-project.org/package=seqHMM (accessed on 18 February 2021).
41. Bacci, S.; Pandolfi, S.; Pennoni, F. A comparison of some criteria for states selection in the latent Markov model for longitudinal data. Adv. Data Anal. Classif. **2014**, 8, 125–145.

**Figure 1.** Observed state chronological sequences of first-year exams. Legend: exam taken (purple rectangle); exam not yet taken (green rectangle).

**Figure 2.** Observed state sequences of first-year exams, by gender (**top**), HS type (**middle**), and HS final grade (**bottom**). Legend: exam taken (purple rectangle); exam not yet taken (green rectangle).

**Figure 3.** Observed state sequences of first-year courses, by status at the end of the follow-up period (**top**), exams grade point average (**middle**), and average time to the last exam (**bottom**). Legend: exam taken (purple rectangle); exam not yet taken (green rectangle).

Student | Exam | Status | $\mathit{ts}1$ | $\mathit{ts}2$ | $\mathit{ts}3$ | … | $\mathit{ts}24$ | $\mathit{ts}25$ | … | $\mathit{ts}48$ |
---|---|---|---|---|---|---|---|---|---|---|
1 | Accounting | enrolled | 0 | 0 | 1 | 1 | 1 | 1 | … | 1 |
1 | Mathematics | enrolled | 0 | 0 | 0 | 0 | 0 | 0 | … | 0 |
… | … | … | … | … | … | … | … | … | … | … |
2 | Mathematics | graduates | 0 | 1 | 1 | 1 | 1 | NA | … | NA |

Course | Hidden State | Emission Prob. |
---|---|---|
Accounting | State 1 | 0.864 |
Accounting | State 2 | 0.982 |
Mathematics | State 1 | 0.319 |
Mathematics | State 2 | 0.607 |
Private law | State 1 | 0.411 |
Private law | State 2 | 0.774 |
Management | State 1 | 0.714 |
Management | State 2 | 0.987 |
Microeconomics | State 1 | 0.400 |
Microeconomics | State 2 | 0.829 |
Statistics | State 1 | 0.523 |
Statistics | State 2 | 0.907 |

**Table 3.** MHM models for $K=1,\dots ,10$: maximum log-likelihood $\widehat{\ell}$, number of free model parameters, BIC values, relative difference between consecutive BIC values (delta).

K | $\widehat{\ell}$ | # par. | BIC | Delta |
---|---|---|---|---|
1 | −93,482.0 | 58 | 187,497.2 | – |
2 | −88,708.0 | 117 | 178,491.6 | −0.048 |
3 | −86,138.3 | 176 | 173,894.7 | −0.026 |
4 | −84,680.2 | 235 | 171,520.8 | −0.014 |
5 | −82,988.5 | 294 | 168,679.9 | −0.017 |
6 | −82,127.9 | 353 | 167,501.0 | −0.007 |
7 | −81,204.7 | 412 | 166,197.0 | −0.008 |
8 | −80,145.0 | 471 | 164,620.1 | −0.009 |
9 | −79,074.0 | 530 | 163,020.5 | −0.010 |
10 | −78,774.5 | 589 | 162,963.9 | −0.000 |
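The delta column of Table 3 can be recomputed from the BIC values alone, as in the Python sketch below. The sketch also backs out the penalty per parameter implied by each row, assuming the usual definition $\mathrm{BIC}=-2\widehat{\ell}+(\#\,\mathrm{par.})\log n$ [36]; the function and variable names are illustrative.

```python
# Values copied from Table 3: K -> (max log-likelihood, # parameters, BIC).
TABLE3 = {
    1: (-93482.0, 58, 187497.2),
    2: (-88708.0, 117, 178491.6),
    3: (-86138.3, 176, 173894.7),
    4: (-84680.2, 235, 171520.8),
    5: (-82988.5, 294, 168679.9),
    6: (-82127.9, 353, 167501.0),
    7: (-81204.7, 412, 166197.0),
    8: (-80145.0, 471, 164620.1),
    9: (-79074.0, 530, 163020.5),
    10: (-78774.5, 589, 162963.9),
}


def relative_bic_change(table):
    """Relative difference between the BIC of K and that of K - 1."""
    ks = sorted(table)
    return {k: (table[k][2] - table[prev][2]) / table[prev][2]
            for prev, k in zip(ks, ks[1:])}


def implied_penalty(loglik, n_par, bic):
    """Solve BIC = -2 * loglik + n_par * log(n) for log(n)."""
    return (bic + 2.0 * loglik) / n_par


deltas = relative_bic_change(TABLE3)
for k in sorted(deltas):
    penalty = implied_penalty(*TABLE3[k])
    print(f"K={k:2d}  delta={deltas[k]:+.3f}  implied log(n)={penalty:.3f}")
```

The implied penalty is close to 9.19 in every row, which is consistent with the same $\log n$ term being applied to all ten models.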

**Table 4.** MHM model with concomitant variables: Average estimated class membership probabilities, number of students, estimated transition probabilities from state 1 to state 2.

| | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 |
|---|---|---|---|---|---|
| Avg. class prob. (${\overline{\widehat{\pi}}}_{k}\left({x}_{i}\right)$) | 0.279 | 0.169 | 0.148 | 0.172 | 0.231 |
| # of students | 81 | 49 | 43 | 50 | 67 |
| $\widehat{p}({U}_{it}=2\mid {U}_{i,t-1}=1)$ | 0.067 | 0.056 | 0.057 | 0.049 | 0.049 |

**Table 5.** MHM model with concomitant variables: Estimated emission probabilities by latent class (only first-year courses).

Course | Hidden State | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 |
---|---|---|---|---|---|---|
Accounting | State 1 | 0.962 | 0.933 | 0.914 | 0.593 | 0.929 |
Accounting | State 2 | 1.000 | 1.000 | 1.000 | 0.904 | 1.000 |
Mathematics | State 1 | 0.739 | 0.626 | 0.269 | 0.049 | 0.013 |
Mathematics | State 2 | 0.972 | 0.962 | 0.656 | 0.245 | 0.363 |
Private law | State 1 | 0.770 | 0.000 | 0.658 | 0.322 | 0.241 |
Private law | State 2 | 1.000 | 0.461 | 0.998 | 0.767 | 0.725 |
Management | State 1 | 0.760 | 0.734 | 0.746 | 0.504 | 0.802 |
Management | State 2 | 1.000 | 0.981 | 0.981 | 0.958 | 1.000 |
Microeconomics | State 1 | 0.555 | 0.553 | 0.487 | 0.155 | 0.322 |
Microeconomics | State 2 | 0.972 | 0.968 | 0.851 | 0.600 | 0.800 |
Statistics | State 1 | 0.691 | 0.616 | 0.575 | 0.137 | 0.620 |
Statistics | State 2 | 1.000 | 0.974 | 0.934 | 0.280 | 0.994 |

**Table 6.** Performance of students at the end of the follow-up, by latent class: students’ status (proportion), exams grade point average (average values for all students and only for graduates), time to the last exam (average values for all students and only for graduates).

Variable | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 | All |
---|---|---|---|---|---|---|
Status | | | | | | |
Enrolled | 0.037 | 0.102 | 0.093 | 0.560 | 0.299 | 0.207 |
Graduated | 0.963 | 0.898 | 0.907 | 0.440 | 0.672 | 0.786 |
Retired | 0.000 | 0.000 | 0.000 | 0.000 | 0.030 | 0.007 |
Exams grade point avg. (all) | 25.7 | 24.5 | 24.9 | 23.2 | 23.9 | 24.5 |
Exams grade point avg. (grad.) | 25.8 | 24.6 | 25.1 | 23.3 | 24.2 | 24.9 |
Time to the last exam (all) | 1287.8 | 1556.1 | 1409.4 | 1876.9 | 1741.1 | 1557.5 |
Time to the last exam (grad.) | 1281.0 | 1499.5 | 1362.3 | 1797.7 | 1661.2 | 1462.0 |

**Table 7.** Multinomial logit sub-model for latent class membership (reference is class 1): Estimated regression coefficients with standard errors, t statistics, and p-values.

| | Estimate | Std. Error | t-Stat | p-Value |
|---|---|---|---|---|
| Latent class 2 | | | | |
| Intercept | 3.360 | 1.471 | 2.283 | 0.022 |
| Male | 0.124 | 0.403 | 0.308 | 0.758 |
| HS type (ref.: Humanistic or scientific) | | | | |
| Technical | −0.801 | 0.516 | −1.552 | 0.121 |
| Vocational or others | 1.387 | 0.677 | 2.048 | 0.041 |
| HS final grade | −0.049 | 0.018 | −2.807 | 0.005 |
| Latent class 3 | | | | |
| Intercept | 2.542 | 1.513 | 1.681 | 0.093 |
| Male | 0.225 | 0.428 | 0.526 | 0.599 |
| HS type (ref.: Humanistic or scientific) | | | | |
| Technical | 0.769 | 0.442 | 1.739 | 0.082 |
| Vocational or others | 2.301 | 0.681 | 3.381 | 0.001 |
| HS final grade | −0.048 | 0.018 | −2.708 | 0.007 |
| Latent class 4 | | | | |
| Intercept | 5.686 | 1.380 | 4.119 | 0.000 |
| Male | −0.442 | 0.409 | −1.081 | 0.280 |
| HS type (ref.: Humanistic or scientific) | | | | |
| Technical | 0.287 | 0.458 | 0.626 | 0.531 |
| Vocational or others | 1.948 | 0.675 | 2.888 | 0.004 |
| HS final grade | −0.081 | 0.017 | −4.880 | 0.000 |
| Latent class 5 | | | | |
| Intercept | 5.061 | 1.351 | 3.746 | 0.000 |
| Male | −0.990 | 0.385 | −2.571 | 0.010 |
| HS type (ref.: Humanistic or scientific) | | | | |
| Technical | 1.151 | 0.403 | 2.859 | 0.004 |
| Vocational or others | 2.427 | 0.638 | 3.805 | 0.000 |
| HS final grade | −0.070 | 0.016 | −4.333 | 0.000 |
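To make the coefficients of Table 7 easier to read, the Python sketch below converts them into class membership probabilities for a given student profile through the multinomial logit transform. The covariate coding is an assumption (male and HS type entered as 0/1 dummies, HS final grade entered on its raw scale), the linear predictor of the reference class 1 is fixed at zero, and the example profile is hypothetical.

```python
import math

# Coefficients copied from Table 7 (class 1 is the reference class).
# Order: intercept, male, technical HS, vocational/other HS, HS final grade.
COEF = {
    2: (3.360, 0.124, -0.801, 1.387, -0.049),
    3: (2.542, 0.225, 0.769, 2.301, -0.048),
    4: (5.686, -0.442, 0.287, 1.948, -0.081),
    5: (5.061, -0.990, 1.151, 2.427, -0.070),
}


def class_probabilities(male, technical, vocational, hs_grade):
    """Multinomial logit probabilities of classes 1..5 for one profile."""
    x = (1.0, male, technical, vocational, hs_grade)
    eta = {1: 0.0}  # reference class: linear predictor fixed at 0
    for k, beta in COEF.items():
        eta[k] = sum(b * v for b, v in zip(beta, x))
    denom = sum(math.exp(e) for e in eta.values())
    return {k: math.exp(e) / denom for k, e in eta.items()}


# Hypothetical profile: female, humanistic or scientific HS, final grade 80.
probs = class_probabilities(male=0, technical=0, vocational=0, hs_grade=80)
for k in sorted(probs):
    print(f"class {k}: {probs[k]:.3f}")
```

Because the HS final grade coefficient is negative for classes 2 to 5, raising the grade while holding the other covariates fixed increases the probability of class 1 under this coding.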

Variable | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 | All |
---|---|---|---|---|---|---|
Gender (proportions) | | | | | | |
females | 0.469 | 0.388 | 0.395 | 0.480 | 0.642 | 0.486 |
males | 0.531 | 0.612 | 0.605 | 0.520 | 0.358 | 0.514 |
HS type (proportions) | | | | | | |
humanistic or scientific | 0.642 | 0.714 | 0.419 | 0.480 | 0.328 | 0.521 |
technical | 0.309 | 0.122 | 0.349 | 0.220 | 0.373 | 0.283 |
vocational or others | 0.049 | 0.163 | 0.233 | 0.300 | 0.299 | 0.197 |
HS final grade (averages) | 82.1 | 75.9 | 76.8 | 66.3 | 73.5 | 75.6 |

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Bacci, S.; Bertaccini, B.
A Mixture Hidden Markov Model to Mine Students’ University Curricula. *Data* **2022**, *7*, 25.
https://doi.org/10.3390/data7020025
