Our study tracks the development of 105 Roma children aged 3 to 5 years (median age: 51 months) enrolled in an NGO-aided developmental program. Each child undergoes pre- and post-assessment based on the Developmental Assessment of Young Children (DAYC), a standard tool used to track progress in early childhood development and to detect delays. Data are gathered from three sources (teacher, parent/caregiver and specialist), covering four developmental domains and an adaptive behavior scale. The evaluations show subjective biases; however, in the post-assessment, the teachers’ and parents’ evaluations converge. The test results confirm significant improvement in all areas ([…]), with the highest being in cognitive skills […] and the lowest being in physical development […]. We also apply machine learning methods to impute missing data and to predict the likely future progress of a given student in the program based on the initial input, while also evaluating the influence of environmental factors. Our weighted ensemble regression models are coupled with principal component analysis (PCA) and yield average coefficients of determination […] for the features of interest. In addition, we perform k-means clustering in the plane of cognitive vs. social–emotional progress and consider the classification problem of predicting the group to which a given student would eventually be assigned, with a weighted F1-score of […] and a macro-averaged area under the curve (AUC) of […]. This could be useful in practice for the optimized formation of study groups. We also explore classification as a means of imputing missing categorical data, e.g., the education, employment or marital status of the parents. Our algorithms provide solutions with the F1-score ranging from […] to […] and, respectively, an AUC between […] and 1.
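For readers unfamiliar with two of the metrics quoted above, the following is a minimal sketch of how a coefficient of determination and a support-weighted F1-score are conventionally computed. It is not the authors' code: the function names `r_squared` and `weighted_f1` are hypothetical, and the toy scores and labels are invented purely for illustration and are unrelated to the study's data.

```python
from collections import Counter


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights proportional to class support."""
    support = Counter(y_true)  # number of true samples per class
    total = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        # F1 = 2TP / (2TP + FP + FN); the denominator is positive because n > 0
        f1 = 2 * tp / (2 * tp + fp + fn)
        total += n * f1
    return total / len(y_true)


# Invented toy data (assumption): regression targets and predictions.
toy_scores_true = [1.0, 2.0, 3.0, 4.0]
toy_scores_pred = [1.5, 2.0, 2.5, 4.0]
toy_r2 = r_squared(toy_scores_true, toy_scores_pred)

# Invented toy data (assumption): three hypothetical cluster labels.
toy_groups_true = ["a", "a", "a", "b", "b", "c"]
toy_groups_pred = ["a", "a", "b", "b", "b", "c"]
toy_f1 = weighted_f1(toy_groups_true, toy_groups_pred)
```

On this toy input the coefficient of determination is 0.9 and the weighted F1-score is 5/6. Weighting by class support means that larger groups contribute more to the score, which matters when the clusters are of unequal size.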