Algorithms
  • This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
  • Article
  • Open Access

16 December 2025

Net Rural Migration Classification in Colombia Using Supervised Decision Tree Algorithms

1 Facultad de Ingeniería, Universidad Distrital Francisco José de Caldas, Bogotá 110231, Colombia
2 Facultad de Ingeniería, Universidad Militar Nueva Granada, Bogotá 110111, Colombia
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Algorithms in Data Classification (3rd Edition)

Abstract

This study presents a decision-tree-based approach to classifying net rural migration across Colombian departments using sociodemographic and economic variables. In the model formulation, immigration is the movement of people into a destination area to settle there, while emigration is the movement of people out of that area to other places. The target variable was defined as a binary category: positive net migration (when immigration exceeds emigration) or negative net migration. Four classification models were trained and evaluated: Decision Tree, Random Forest, AdaBoost, and XGBoost. Data were preprocessed through cleaning, categorical variable encoding, and class balance assessment. Model performance was evaluated using accuracy, precision, sensitivity, F1 score, and the area under the ROC curve. The results show that Random Forest achieves the highest accuracy, precision, sensitivity, and F1 score in the 10-variable and 15-variable settings, while XGBoost is competitive but not dominant. Furthermore, the importance of the model variables was analyzed to identify key factors influencing migration patterns. This approach supports a more precise understanding of regional migration dynamics in Colombia and can serve as a basis for designing informed public policies.
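The binary target and the threshold-based metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `net_migration_label` and `binary_metrics`, the flow counts, and the predictions are all hypothetical, chosen only to show the arithmetic. The sketch assumes, following the abstract's wording, that a tie between immigration and emigration falls into the negative class. The area under the ROC curve is omitted because it requires predicted scores rather than hard labels.

```python
from typing import Dict, Sequence


def net_migration_label(immigration: int, emigration: int) -> int:
    """Return 1 for positive net migration (immigration > emigration), else 0.

    Assumption: a tie counts as the negative class, since the abstract
    defines "positive" strictly as immigration greater than emigration.
    """
    return 1 if immigration > emigration else 0


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, sensitivity (recall), and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Hypothetical (immigration, emigration) counts, for illustration only.
flows = [(120, 80), (50, 90), (200, 200), (75, 30), (10, 40), (60, 20)]
y_true = [net_migration_label(i, e) for i, e in flows]

# Hypothetical predictions from some classifier, for illustration only.
y_pred = [1, 0, 1, 1, 0, 0]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In practice these quantities would usually come from a library such as scikit-learn; the hand-written version is used here only to make the definitions explicit.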
