Machine Learning Algorithms for Predicting Stunting among Under-Five Children in Papua New Guinea

Preventing stunting is particularly important for healthy development across the life course. In Papua New Guinea (PNG), the prevalence of stunting in children under five years old has consistently not improved. Therefore, the primary objective of this study was to employ multiple machine learning algorithms to identify the most effective model and key predictors for stunting prediction in children in PNG. The study used data from the 2016–2018 Papua New Guinea Demographic Health Survey, including from 3380 children with complete height-for-age data. The least absolute shrinkage and selection operator (LASSO) and random-forest-recursive feature elimination were used for feature selection. Logistic regression, a conditional decision tree, a support vector machine with a radial basis function kernel, and an extreme gradient boosting machine (XGBoost) were employed to construct the prediction model. The performance of the final model was evaluated using accuracy, precision, recall, F1 score, and area under the curve (AUC). The results of the study showed that LASSO-XGBoost has the best performance for predicting stunting in PNG (AUC: 0.765; 95% CI: 0.714–0.819) with accuracy, precision, recall, and F1 scores of 0.728, 0.715, 0.628, and 0.669, respectively. Combined with the SHAP value method, the optimal prediction model identified living in the Highlands Region, the age of the child, being in the richest family, and having a larger or smaller birth size as the top five important characteristics for predicting stunting. Based on the model, the findings support the necessity of preventing stunting early in life. Emphasizing the nutritional status of vulnerable maternal and child populations in PNG is recommended to promote maternal and child health and overall well-being.


Introduction
Stunting has been defined as the lack of height relative to age in children [1] and is the most prevalent form of child malnutrition [2].Stunting occurs mainly during the critical window of 0-24 months [3], which is the most sensitive period of child growth and development.Stunting was found to be especially vulnerable to environmentally modifiable factors [4].This growth deficit continues to accumulate and worsens during early childhood (0-5 years) due to continued exposure to adverse environmental factors such as feeding, infections, and psychosocial factors [5].
The consequences of stunting observed within the first five years of life are far-reaching, encompassing increased morbidity and mortality, impaired cognitive development, poorer academic performance, physical developmental deficits, and diminished economic productivity [6].Despite some studies suggesting the possibility of catch-up growth in stunted children, there is no conclusive evidence to support the full reversal of the early-life effects of stunting [7,8].
As of 2020, approximately 149 million children under the age of five remain affected by stunting worldwide with the overwhelming majority of cases (96.7%) occurring in lowand middle-income countries [9].It is evident that stunting in children poses a significant global health challenge [1].In response, Target 2.2 of the Sustainable Development Goals (SDGs) states that all forms of malnutrition should be eliminated by 2030, which includes stunting in children under five years of age [10].
Despite impressive achievements in reducing stunting in the Western Pacific Region, progress remains slow in some countries [11].Papua New Guinea (PNG) is among the countries where stunting rates among children under five years old have persistently failed to improve, rising from 47.2% in 2000 to 48.4% in 2020.Surprisingly, this trend contradicts that of PNG's rapid economic growth [12].The increase in resources and wealth has not improved the nutritional status of children [13].Consequently, there is a need to address stunting in children under five years of age in PNG as a serious public health issue.
Previous studies from PNG have explored factors associated with stunting, such as regional disparities, wealth indices, maternal education level, and childhood vaccinations [14][15][16][17].However, these studies often relied on limited data and lacked national representativeness, limiting the generalizability of their results to the wider PNG population.A few studies have applied nationally representative data from the 2009-2010 Papua New Guinea Household Income and Expenditure Survey (PNG HIES) [18,19] to examine stunting prevalence variations across different regions in PNG.However, the timeliness of the data restricted their scope, and they only adjusted for a limited number of confounding variables.
Machine learning (ML) has emerged as a powerful data-mining technique that is particularly adept at handling high-dimensional and nonlinear relationships [20,21], surpassing classical statistical models in many aspects.As a result, ML algorithms have found widespread application in the exploration of the social determinants of health (SDHs) [21].The application of algorithms such as decision trees (DTrees), random forests (RFs), support vector machines (SVMs), gradient boosting machines (GBMs), extreme gradient boosting machines (XGBoosts), and neural networks (ANNs) is commonly used in studies exploring the factors associated with stunting in children [22][23][24][25][26][27][28].Evidence from Ethiopia, Tanzania, and Bangladesh [23,[26][27][28] showed that traditional logistic regression (LR) often fails to achieve optimal performance in predicting stunting in children compared to other ML algorithms.Consequently, the application of multiple ML algorithms becomes imperative to identify the best predictive model.
Feature selection (FS), a technique aimed at reducing dimensionality, plays a crucial role in optimizing an algorithm's predictive performance by eliminating redundant, irrelevant, and noisy features [29].FS is usually categorized into filtered, embedded, wrapper, and hybrid methods [30].Embedded methods employ built-in feature selection methods to optimize objective functions or classifiers [31], such as decision trees and the least absolute shrinkage and selection operator (LASSO).Conversely, wrapper methods employ repetitive learning steps and resampling techniques to evaluate feature usefulness and result in enhanced predictive capabilities but at a higher computational cost [32].
Given that the prevalence of stunting in children under five years of age in PNG is still not promising, there is a need for targeted programs and effective interventions.Therefore, the main objective of this study is to apply FS techniques with ML algorithms to train, evaluate, and select the best model for predicting stunting in children under five years of age in PNG based on the nationally representative 2016-2018 Papua New Guinea Demographic Health Survey database (2016-2018 PNG DHS) in addition to obtaining the most important features for predicting stunting.The study's findings will provide evidence for PNG policy makers to plan scientifically sound programs with integrated interventions to prevent child stunting and protect the health of the most vulnerable subgroups of children.This will help accelerate PNG's progress in the SDGs related to children's health.

Data Source
The cross-sectional data used in this study were obtained from the 2016-2018 PNG DHS, which was conducted by the PNG National Statistics Office (NSO).This comprehen-sive national survey covered individuals aged 15-49 years in PNG with the aim to provide current information on key demographic and health indicators.The survey employed a two-stage stratified sampling method to select approximately 19,200 households, with 18,175 women aged 15-49 from the surveyed households eligible for individual interviews.A total of 15,198 women completed the interviews with a response rate of 84%.Child information was collected from mothers or primary caregivers.Structured questionnaires were applied for data collection, and details about the survey can be found in the 2016-2018 PNG DHS final report [33].For households where male participants were selected for interviews, the 2016-2018 PNG DHS conducted height, length, weight, and mid-upper arm circumference (MUAC) measurements for eligible children under five years of age using equipment provided by UNICEF [33].Ultimately, all children under five years of age with complete and valid height-for-age data were included in this study with a total of 3380 children meeting the inclusion criteria.
The 2016-2018 Papua New Guinea Demographic and Health Survey (PNG DHS) received ethical approval from the Institutional Review Board of Inner City Fund International.Additionally, informed consent was obtained from respondents for all interviews.On 17 August 2023, the DHS program approved the use of this dataset for this study.All data were desensitized (anonymized by removing all personal identifiers) before being received by the authors.This study was conducted in accordance with relevant guidelines and regulations regarding the published use of DHS datasets and did not require additional ethical review documentation or informed consent due to the use of open secondary data.Further information about DHS data and ethical standards is available at https://dhsprogram.com/methodology/Protecting-the-Privacy-of-DHS-Survey-Respondents.cfm(accessed on 17 July 2023).

Outcome Variable and Potential Risk Factors
Our outcome variable of interest was stunting, which was coded as a binary variable.According to criteria developed by the WHO in 2006, children with height-for-age z-scores (HAZs) that are 2 standard deviations below the WHO growth standards are recognized as stunted [1] and coded as 1, while all others are coded as 0. The conceptual framework proposed by the United Nations Children's Nutrition Foundation (UNICEF) illustrates that stunting is attributed to complex contextual, underlying, and direct causes [34].Therefore, based on the results of previous studies, this study incorporated potential factors and divided them into four main categories: individual characteristics, maternal characteristics, family characteristics, and community characteristics.Individual characteristics included the child's gender, age, birth size, birth order, duration of breastfeeding, early breastfeeding, and occurrence of diarrhea and fever in the last two weeks.Maternal characteristics included the mother's age (years), employment status, occupation, marital status, education level, age at first birth (years), exposure to mass media, and their partner's age (years), employment status, and education level.Following WHO recommendations [35], early breastfeeding was defined as the initiation of breastfeeding within one hour of delivery.Breastfeeding duration was categorized as never breastfeeding, a breastfeeding duration < 6 months, and a breastfeeding duration ≥ 6 months [36].Exposure to mass media was based on the frequency of women reading newspapers, watching television, and listening to the radio; access to at least one of these media was considered exposure to mass media [37].
The household characteristics encompassed the sex of the househead, the number of children under five years of age, the number of household members, the type of latrine, the source of drinking water, the type of fuel for the kitchen, and the distance to the health facility.Community characteristics covered the place of residence as well as the region.Based on the WHO/UNICEF guidelines [38], the source of drinking water was categorized as unimproved or improved, and the type of latrine was categorized as unfurnished, unimproved, or improved.Based on WHO indoor air quality guidelines [39], kitchen fuel types were categorized as clean or polluting fuels, where clean fuels included electricity and liquefied petroleum gas.Household wealth was a composite index constructed by the 2016-2018 PNG DHS, where a principal component analysis was applied based on the household's consumer goods and housing characteristics, forming the corresponding household wealth quintiles: poorest, poorer, middle, richer, and richest [33].
2.3.Analytic Strategy 2.3.1.Preprocessing Data preprocessing was performed using STATA 17.0 statistical software.We conducted an initial screening of categorical and continuous variables using the χ2 (bivariate) test with the Wilcoxon rank sum test, and variables with a p > 0.05 were excluded.Descriptive analyses were performed in the form of frequencies for categorical variables and means for continuous variables.
To prepare categorical features for machine learning input, the classical one-hot coding method was employed.After the initial screening, multicategorical variables were converted into multiple binary feature vectors using one-hot coding.This approach ensured that the algorithm did not make erroneous assumptions about variable relationships.Furthermore, the missing indicator method (MIM) was utilized to add indicator metrics to categorical variables containing missing values.The brief analysis steps of this study are shown in Figure 1.
Based on the WHO/UNICEF guidelines [38], the source of drinking water was categorized as unimproved or improved, and the type of latrine was categorized as unfurnished, unimproved, or improved.Based on WHO indoor air quality guidelines [39], kitchen fuel types were categorized as clean or polluting fuels, where clean fuels included electricity and liquefied petroleum gas.Household wealth was a composite index constructed by the 2016-2018 PNG DHS, where a principal component analysis was applied based on the household's consumer goods and housing characteristics, forming the corresponding household wealth quintiles: poorest, poorer, middle, richer, and richest [33].

Preprocessing
Data preprocessing was performed using STATA 17.0 statistical software.We conducted an initial screening of categorical and continuous variables using the χ2 (bivariate) test with the Wilcoxon rank sum test, and variables with a p > 0.05 were excluded.Descriptive analyses were performed in the form of frequencies for categorical variables and means for continuous variables.
To prepare categorical features for machine learning input, the classical one-hot coding method was employed.After the initial screening, multicategorical variables were converted into multiple binary feature vectors using one-hot coding.This approach ensured that the algorithm did not make erroneous assumptions about variable relationships.Furthermore, the missing indicator method (MIM) was utilized to add indicator metrics to categorical variables containing missing values.The brief analysis steps of this study are shown in Figure 1.

Feature Selection
All subsequent analyses were conducted using RStudio 4.2.3 statistical software.The samples were randomly split into test and training sets at a ratio of 1:9, and feature selection was performed only in the training sets to prevent leakage of test data.To mitigate the risk of overfitting, the AUC value under 10-fold cross-validation (CV) served as the performance evaluation metric [40].
We selected the embedded LASSO and the wrapped random-forest-recursive feature elimination (RF-RFE) for feature selection.Among them, the LASSO controlled model shrinkage via the penalty parameter (λ).By selecting the λ value that produced the highest AUC value, we identified the features with non-zero regression coefficients, forming the optimal feature subset.Alternatively, the RF-RFE measured feature importance using the Gini-coefficient-based Mean Impurity Reduction (MDI) after fitting the RF model.The process was repeated to recursively eliminate irrelevant features until the combination of features with the highest model AUC value was derived.For this analysis, the ntree was set to 500, and the mtry was set to the recommended √ p [41].

Machine Learning Algorithms and Hyperparameter Tuning
Grid search (GS) is a traversal search for predefined hyperparameter values performed by a given algorithm.While it is suitable for low-dimensional hyperparameter tuning, it incurs high computational costs [42].Bayesian optimization based on Gaussian process regression (BO-GPR) leverages a priori information from Gaussian process regression to rapidly converge to the global optimal solution, making it more adept at handling high-dimensional hyperparameter optimization problems with limited iterations [43].
In this study, AUC values were used as performance evaluation metrics for hyperparameter combinations.The GS and BO-GPR strategies under 10-fold cross-validation were applied for the hyperparameter tuning of the following models.In addition, the logistic regression (LR) model, fitted with the generalized linear function (GLM) as a binomial family, required no hyperparameter tuning due to its inherent simplicity and well-defined structure.
The conditional inference tree (CTree) is a special kind of decision tree [44] which embeds a tree regression model into a well-defined conditional inference process.Therefore, easily interpretable classification results can be produced [45].The CTree included two hyperparameters for controlling the size of the tree's growth, namely the 1-p value (mincriterion) and the maximum depth of the tree (maxdepth) [46].In selecting the predefined hyperparameter values, caution was exercised, and insights from the relevant literature [47][48][49] were taken into consideration.GS was applied to search for the optimal values of maxdepth and mincriterion, which were confined to the following range: maxdepth = [1, 30]  (1) mincriterion = {0.900, 0.950, 0.990} (2) The support vector machine (SVM) is a versatile algorithm used for addressing classification and regression problems.It possesses the capability to linearly classify data while also employing kernel tricks to handle nonlinear data challenges [50].For instance, the radial basis function (RBF) kernel transforms the input space into a high-dimensional feature space, facilitating the modeling of nonlinear data [51].In this study, an SVM with a radial basis function kernel with fewer hyperparameters was used to categorize the data.The hyperparameters requiring tuning were the penalty function (C) and kernel parameters (σ).Following the recommendations of related studies [52][53][54], we applied GS to search for the best C and σ in the following predefined set: 3,5,7,9,11,13,15}  (3) σ = 2 {−15,−13,−11,−9,−7,−5,−3,−1,1,3,5,7,9,11} The extreme gradient boosting machine (XGBoost) is an extensible and integrated algorithm based on gradient boosting decision trees, which is known for its exceptional ability to push the computational power of boosting trees to new limits [55].The performance of XGBoost was highly dependent on optimizing a large number of hyperparameters, which are summarized as follows: the maximum number of boosting iterations (nrounds), the learning rate (eta), the minimum loss reduction (gamma), the minimum weight sum of instances (min child weight), the maximum depth (maxdepth), the subsample percentage (subsample), and the column-sample-by-tree ratio of subsamples (colsample bytree).Following the recommendations of related studies [56][57][58], we employed BO-GPR to search for the hyperparameters of XGBoost in the ranges presented in Table 1.The final performance of the model in the test set was measured with the AUC, accuracy, precision, recall, and F1 score.Acknowledged as the main performance metric, the AUC gave the overall model performance at each possible classification threshold.The confusion matrix was a square matrix including the True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (PN), allowing the extraction of the above-mentioned one-dimensional performance metrics from it [59].
Accuracy, defined as the ratio of the number of correct predictions to the total number of predictions, was the most common measure of overall prediction performance.

Accuracy =
TP + TN TP + TN + FP + PN Precision was defined as the ratio of the number of correct positive predictions to the total number of positive predictions, reflecting the consistency of the predictions with the positive cases in the test set.
Recall, also known as sensitivity, was defined as the ratio of the number of correct positive predictions to the total number of positives, reflecting the effectiveness of the model in predicting positive cases.
The F1 score was the reconciled mean of precision and recall, responding to the association of the predicted outcome with positive cases in the test set [60].
With the default classification threshold, p > 0.50 is categorized as positive.However, this default threshold is often unsuitable for dealing with unbalanced data.To estimate the optimal threshold, we used the closest top-left method to select the point close to the upper-left corner of the ROC curve as the optimal threshold and reported the above one-dimensional performance metrics at the optimal threshold [61].

Descriptive Results
Of the 3380 children under five years of age in this study, 1342 (39.70%) had stunted growth, with the mean age being 29.73 months.Most were boys (53.11%) and received breastfeeding for a duration of ≥6 months (Table 2).Regarding maternal characteristics, most (62.59%)mothers were not employed.Approximately half of the mothers (50.41%) had received a primary education, as had 45.40% of their partners.Household and community characteristics indicated that around 16.45% of the children came from the poorest households, almost half (46.19%) did not have access to improved water sources, the majority (76.36%) resided in rural areas, and about one-third (30.86%) hailed from the Southern Region (Table 2).
The prevalence of stunting was highest in the Highlands Region (58.97%) compared to other regions, and stunting prevalence among children from the poorest households (53.60%) was almost twice as high as that of children from the richest households (25.39%).Furthermore, the results of the χ2 test and Wilcoxon rank sum test showed that variables such as children's birth order, early breastfeeding, occurrence of diarrhea and fever in the last two weeks, and the age and marital status of the mother and her partner were not significantly associated with child stunting, and thus, they were excluded from the follow-up study (Table 2).
The prevalence of stunting among children in PNG varied across provincial division, with the Southern Highlands showing the highest rates, while Manus and the National Capital District were less affected by stunting.The Highlands Region provinces, including Southern Highlands, Enga, Hela, Western Highlands, Jiwaka, Chimbu, and Eastern Highlands, also exhibited higher stunting rates compared to other provinces (Figure 2).

Feature Selection Results
Figure 3 presents the process of feature selection using the LASSO and RF-RFE methods.For LASSO, the model achieved the best AUC value (AUC: 0.669) at λ = 0.0051, resulting in the shrinkage of regression coefficients for 34 features to 0 and representing

Feature Selection Results
Figure 3 presents the process of feature selection using the LASSO and RF-RFE methods.For LASSO, the model achieved the best AUC value (AUC: 0.669) at λ = 0.0051, resulting in the shrinkage of regression coefficients for 34 features to 0 and representing approximately 59.6% of all features.On the other hand, using RF-RFE, the model attained the best AUC value (AUC: 0.672) after removing the first 27 least important features, accounting for about 47.4% of all features.
Figure 3 presents the process of feature selection using the LASSO and RF-RFE meth-ods.For LASSO, the model achieved the best AUC value (AUC: 0.669) at λ = 0.0051, resulting in the shrinkage of regression coefficients for 34 features to 0 and representing approximately 59.6% of all features.On the other hand, using RF-RFE, the model attained the best AUC value (AUC: 0.672) after removing the first 27 least important features, accounting for about 47.4% of all features.

Hyperparameter Tuning Results
Table 3 summarizes the best hyperparameters of CTree, SVM-RBF, and XGBoost models under 10-fold cross-validation using the GS or BO-GPR strategy.With the LASSO optimal feature subset, SVM-RBF demonstrated the best prediction performance in the

Hyperparameter Tuning Results
Table 3 summarizes the best hyperparameters of CTree, SVM-RBF, and XGBoost models under 10-fold cross-validation using the GS or BO-GPR strategy.With the LASSO optimal feature subset, SVM-RBF demonstrated the best prediction performance in the training set (AUC: 0.671).Furthermore, the performance of CTree, SVM-RBF, and XGBoost in the training set improved after FS.

Evaluation of the Prediction Models
Table 4 and Figure 4 summarize the final performance of LR, CTree, SVM-RBF, and XGBoost using the test set.The results indicate that XGBoost, under the LASSO FS method, provided the best prediction performance (AUC: 0.765; 95% CI: 0.714-0.819),and the model's accuracy, precision, recall, and F1 scores at the optimized threshold (0.487) were 0.728, 0.715, 0.628, and 0.669.Moreover, CTree exhibited the worst performance without using the FS method (AUC: 0.695; 95% CI: 0.639-0.750)(Table 4 and Figure 4).The final performance of all models improved after using feature selection, indicating that the FS method effectively eliminated noise or redundant information while preserving crucial features of the original model (59.6% dimensionality reduction for LASSO and 47.4% dimensionality reduction for RF-RFE).The impact of feature selection varied depending on the optimized model: for CTree and XGBoost, the performance was best with LASSO, while for LR and SVM-RBF, the performance was optimized using RF-RFE (see Table 4 and Figure 4).

Model Interpretation
SHapley Additive exPlanations (SHAP) is a feature attribution method based on a game-theoretic framework that helps reveal the decision-making process of complex "blackbox models" such as XGBoost.As mentioned above, we used the SHAP value method to explain the XGBoost prediction model under the LASSO optimal feature subset.

SHAP Summary Plots
The SHAP summary chart sorted the characteristics vertically from highest to lowest based on the mean absolute SHAP values.We selected the top 15 characteristics to illustrate their relative importance in predicting stunting in children (refer to Figure 5).Notably, living in the Highlands Region, the child's age, belonging to the wealthiest family, and having a larger or smaller birth size were identified as the top five most significant factors.
Children 2023, 10, x FOR PEER REVIEW 13 of 18 depending on the optimized model: for CTree and XGBoost, the performance was best with LASSO, while for LR and SVM-RBF, the performance was optimized using RF-RFE (see Table 4 and Figure 4).

Model Interpretation
SHapley Additive exPlanations (SHAP) is a feature attribution method based on a game-theoretic framework that helps reveal the decision-making process of complex "black-box models" such as XGBoost.As mentioned above, we used the SHAP value method to explain the XGBoost prediction model under the LASSO optimal feature subset.

SHAP Summary Plots
The SHAP summary chart sorted the characteristics vertically from highest to lowest based on the mean absolute SHAP values.We selected the top 15 characteristics to illustrate their relative importance in predicting stunting in children (refer to Figure 5).Notably, living in the Highlands Region, the child's age, belonging to the wealthiest family, and having a larger or smaller birth size were identified as the top five most significant factors.Additionally, the SHAP summary chart represents each child's features as points, which are colored according to their feature values, ranging from low (blue) to high (red).For binary feature vectors, red dots indicated the presence of the corresponding feature in the individual child.The SHAP value on the horizontal axis reflects the contribution of the feature to the model output.Higher SHAP values indicate a greater likelihood of stunting.Specifically, children from the Highlands Region, those with smaller birth sizes, or those in the poorest households had SHAP values > 0 for the corresponding characteristics, indicating a higher probability of stunting.In contrast, children from the wealthiest families who were female or of larger birth size had SHAP values < 0 for the corresponding trait, indicating a lower probability of stunting (see Figure 5).

SHAP Dependence Plot of Child's Age
To provide a more intuitive view of the relationship between feature values and the model's expected output, we constructed a dependency plot for child age (a continuous variable) versus SHAP values.The plot included points representing different individual children.The smoothed line of partial regression demonstrated a positive association between child age and SHAP values when children were ≤24 months old.After 24 months Additionally, the SHAP summary chart represents each child's features as points, which are colored according to their feature values, ranging from low (blue) to high (red).For binary feature vectors, red dots indicated the presence of the corresponding feature in the individual child.The SHAP value on the horizontal axis reflects the contribution of the feature to the model output.Higher SHAP values indicate a greater likelihood of stunting.Specifically, children from the Highlands Region, those with smaller birth sizes, or those in the poorest households had SHAP values > 0 for the corresponding characteristics, indicating a higher probability of stunting.In contrast, children from the wealthiest families who were female or of larger birth size had SHAP values < 0 for the corresponding trait, indicating a lower probability of stunting (see Figure 5).

Discussion
Based on the nationally representative PNG DHS 2016-2018 dataset, our study s gests that the LASSO-XGBoost combination had the best performance in predicting st ing among children under five years old in PNG (AUC: 0.767, 95% CI: 0.714-0.819).optimal model identified living in the Highlands Region, the age of the child, being in wealthiest household, and having a larger or smaller birth size as the top five most portant characteristics for predicting stunting in children, reflecting the complexity of causes of stunting.Critical findings of the study include the following.
Firstly, the study found that children residing in the Highlands Region were at a v high risk of stunting, and stunting was also most prevalent in the region (58.97%),wh is similar to the findings of an earlier study using the 2009-2010 PNG HIES [19].F insecurity is one of the key underlying causes of child malnutrition [34].In the Highla Region, children are extremely vulnerable to food insecurity caused by events suc extreme weather and social conflict [62,63].Long-term food deprivation [64] is associa closely with chronic malnutrition.Diets in the Highlands excessively rely on mono-fo (such as starchy foods like sweet potatoes and sago, among others) [63,65], and nutrit ally unbalanced feeding practices could also contribute to linear growth deficits in c dren [14,64].
Secondly, the study highlights the strong association between household wea child age, and stunting.Children from wealthier households face a lower risk of stunt which is potentially due to their better resilience against food insecurity [66], impro access to healthcare facilities [67], and ability to access high-protein foods [68,69].Furt more, the study observed a rapid increase in the risk of stunting in children aged 0 months, which is in line with previous cross-country studies [3].This emphasized the gency of early intervention to prevent stunting from exacerbating the cycle of deprivat

Discussion
Based on the nationally representative PNG DHS 2016-2018 dataset, our study suggests that the LASSO-XGBoost combination had the best performance in predicting stunting among children under five years old in PNG (AUC: 0.767, 95% CI: 0.714-0.819).The optimal model identified living in the Highlands Region, the age of the child, being in the wealthiest household, and having a larger or smaller birth size as the top five most important characteristics for predicting stunting in children, reflecting the complexity of the causes of stunting.Critical findings of the study include the following.
Firstly, the study found that children residing in the Highlands Region were at a very high risk of stunting, and stunting was also most prevalent in the region (58.97%), which is similar to the findings of an earlier study using the 2009-2010 PNG HIES [19].Food insecurity is one of the key underlying causes of child malnutrition [34].In the Highlands Region, children are extremely vulnerable to food insecurity caused by events such as extreme weather and social conflict [62,63].Long-term food deprivation [64] is associated closely with chronic malnutrition.Diets in the Highlands excessively rely on mono-foods (such as starchy foods like sweet potatoes and sago, among others) [63,65], and nutritionally unbalanced feeding practices could also contribute to linear growth deficits in children [14,64].
Secondly, the study highlights the strong association between household wealth, child age, and stunting.Children from wealthier households face a lower risk of stunting, which is potentially due to their better resilience against food insecurity [66], improved access to healthcare facilities [67], and ability to access high-protein foods [68,69].Furthermore, the study observed a rapid increase in the risk of stunting in children aged 0-24 months, which is in line with previous cross-country studies [3].This emphasized the urgency of early intervention to prevent stunting from exacerbating the cycle of deprivation, especially among the most vulnerable groups of children living in poverty [70].
Finally, the study underscores the significance of birth size in determining a child's growth potential.A smaller birth size is associated with a higher risk of stunting in children, while a larger birth size is a protective factor.Maternal malnutrition during pregnancy could be a potential cause of smaller birth sizes, leading to altered fetal and placental growth patterns and contributing to impaired fetal growth [71,72].Existing studies have emphasized the importance of intrauterine health in preventing stunting in children [73].Therefore, it is necessary to focus on and improve the nutritional status of pregnant women and women of childbearing age (15-49 years) in PNG.However, it is important to emphasize that exploring the relationship between birth size and stunting still requires caution due to the subjective assessment of the child's birth size by mothers.
Moreover, being female, mothers' exposure to mass media, mothers' secondary education level, and their partners' higher education level were discovered to be protective factors against stunting.Evidence from the Highlands Margins of PNG suggests that gender heterogeneity in stunting may be attributed to girls' growth strategies, which prioritize growth over maintenance to meet future reproductive potential [74].Meanwhile, mothers and their partners with high levels of education were likely to have better incomes, leading to improved nutrition for their children [75].And mothers exposed to mass media were more likely to acquire knowledge about proper modern healthcare practices and correct inappropriate attitudes [76].
In conclusion, our results show that the combination of ML and FS techniques provides a better classification of stunting.After BO-BPR hyperparameter tuning with 10-fold crossvalidation, the LASSO-XGBoost model achieved the best predictive performance compared to traditional logistic regression (LR) (AUC: 0.765; 95% CI: 0.714-0.819).Particularly, LASSO and RF-RFE facilitated efficient ML learning by removing redundant, noisy information, resulting in a substantial dimensionality reduction of 59.6% and 47.4%, respectively.Thus, we suggest prioritizing the best combination of LASSO and XGBoost when stunting in PNG children is a central concern for prediction.
This study had several important strengths.Firstly, the data were derived from the nationally representative PNG DHS 2016-2018.Secondly, the study used ML algorithms and FS techniques to make better predictions, which have not been widely used in related research in PNG and could provide lessons for researchers conducting research on similar topics in PNG and other Pacific Island countries.Nevertheless, some potential limitations remain.Firstly, the SHAP value method employed in the study provided correlation analysis but could not establish causal inferences; therefore, the interpretability of the results is still limited.Secondly, although we tried to include as comprehensive a set of variables as possible, we could not exclude residual confounding caused by unmeasured variables such as the mother's height and weight.Moreover, some of the children's information was derived from their mother's recollections (for instance, the occurrence of diarrhea and fever in the child in the last two weeks), and there may be a recollection bias.

Conclusions
Based on cross-sectional data from the nationally representative PNG DHS 2016-2018, this study used the ML algorithm with FS techniques to identify the optimal model and crucial factors for predicting stunting in children under five years of age in PNG.The results show that the combination of LASSO and XGBoost had the best predictive performance.Living in the Highlands Region, the child's age, being in the richest household, and having a larger or smaller birth size emerged as the top five important characteristics for predicting stunting.The findings emphasize the importance of early-life interventions to prevent stunting, especially for the most vulnerable groups of children in the marginalized Highlands Region.Therefore, there is an imperative for more robust public health policies and interventions aimed at enhancing maternal nutrition and disseminating accurate knowledge of modern healthcare practices to promote maternal and child health and well-being in PNG.

FOR PEER REVIEW 10 of 18 Figure 2 .
Figure 2. Spatial distribution of stunting rates for children under five years of age by provinciallevel divisions in Papua New Guinea; 2016-2018 PNG DHS.

Figure 2 .
Figure 2. Spatial distribution of stunting rates for children under five years of age by provincial-level divisions in Papua New Guinea; 2016-2018 PNG DHS.

Figure 3 .
Figure 3. Feature selection process for LASSO and RF-RFE.(a) feature selection process for LASSO; (b) feature selection process for RF-RFE.

Figure 3 .
Figure 3. Feature selection process for LASSO and RF-RFE.(a) feature selection process for LASSO; (b) feature selection process for RF-RFE.

Figure 4 .
Figure 4. ROC curve and AUC performance of the prediction models using the test set.

Figure 4 .
Figure 4. ROC curve and AUC performance of the prediction models using the test set.

Figure 5 .
Figure 5. SHAP summary plot with top 15 contributing features for XGBoost models.

Figure 5 .
Figure 5. SHAP summary plot with top 15 contributing features for XGBoost models.

3. 5 . 2 .
SHAP Dependence Plot of Child's Age To provide a more intuitive view of the relationship between feature values and the model's expected output, we constructed a dependency plot for child age (a continuous variable) versus SHAP values.The plot included points representing different individual children.The smoothed line of partial regression demonstrated a positive association between child age and SHAP values when children were ≤24 months old.After 24 months of age, SHAP values stabilized and remained positive for the vast majority of children (Figure 6).Children 2023, 10, x FOR PEER REVIEW 14 oof age, SHAP values stabilized and remained positive for the vast majority of child (Figure6).

Table 2 .
Prevalence of stunting in children under 5 in Papua New Guinea by characteristics; PNG DHS 2016-2018.

Table 3 .
Optimal value of each hyperparameter searched by the optimization strategy.

Table 4 .
Performance summary of the prediction models.

Table 4 .
Performance summary of the prediction models.