This study examines the increasing concern regarding teacher job satisfaction, which has a direct impact on retention, instructional quality, and student outcomes. Traditionally, teacher satisfaction has been evaluated through questionnaires, which present limitations in terms of data efficiency and analysis. In this study, machine learning techniques were applied to data from the PISA 2022 teacher questionnaire in Morocco (N = 2998 lower-secondary teachers). Two multiclass classification targets were defined: overall job satisfaction (SATJOB_class) and satisfaction with the teaching profession (SATTEACH_class), each categorised into three balanced classes: low (<−0.5), medium (−0.5 to 0.5), and high (>0.5). The methodology comprised four key stages. First, comprehensive pre-processing was conducted to address missing values, retaining features with fewer than 300 missing entries and applying mode imputation. Second, nine classifiers (logistic regression, K-nearest neighbours, multinomial naïve Bayes, support vector machine, decision tree, random forest, XGBoost, AdaBoost, and a feed-forward artificial neural network (ANN)) were evaluated using identical train/test splits and hyperparameter tuning. Third, model performance was assessed using accuracy, precision, recall, and F1-score. Finally, feature importance was derived from tree-based and permutation methods. The results indicated that XGBoost outperformed the other models for SATJOB_class (accuracy = 0.61, precision = 0.62, recall = 0.61, F1-score = 0.61), followed by random forest, logistic regression, and AdaBoost (accuracy = 0.59 each). For SATTEACH_class, random forest led (accuracy = 0.59), followed closely by XGBoost (0.58), ANN (0.57), and AdaBoost (0.56).
Key predictors of teacher job satisfaction included workload-related variables and school-environment factors, which consistently emerged as the most important features across the best-performing models. The methodology and open-source pipeline provide a reproducible framework for evidence-based interventions to improve teacher retention and instructional quality, offering valuable insights for policymakers and educational administrators.
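
The target definition and missing-value handling described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `to_class` and `keep_and_impute`, the dictionary-of-lists data layout, and the choice to place the boundary values −0.5 and 0.5 in the medium class are assumptions made for illustration only.

```python
from collections import Counter

LOW_CUT, HIGH_CUT = -0.5, 0.5  # class thresholds quoted in the abstract
MAX_MISSING = 300              # features with fewer missing entries are kept


def to_class(score):
    """Map a continuous satisfaction index to low / medium / high.

    Assumption: the boundary values -0.5 and 0.5 fall in the medium class.
    """
    if score < LOW_CUT:
        return "low"
    if score > HIGH_CUT:
        return "high"
    return "medium"


def keep_and_impute(columns, max_missing=MAX_MISSING):
    """Drop features with too many missing values, mode-impute the rest.

    `columns` (hypothetical layout) maps a feature name to a list of values,
    with None marking a missing entry.
    """
    cleaned = {}
    for name, values in columns.items():
        n_missing = sum(v is None for v in values)
        if n_missing >= max_missing:
            continue  # only features with fewer than max_missing gaps are kept
        observed = [v for v in values if v is not None]
        if not observed:
            continue  # nothing to impute from
        mode = Counter(observed).most_common(1)[0][0]
        cleaned[name] = [mode if v is None else v for v in values]
    return cleaned
```

With `max_missing=3`, for example, a feature holding `[1, 1, None, 2]` would be kept and its gap filled with the mode `1`, while a feature with three missing entries would be dropped.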