Feature Importance Analysis for Postural Deformity Detection System Using Explainable Predictive Modeling Technique

Abstract: This study aimed to analyze feature importance by applying explainable artificial intelligence (XAI) to postural deformity parameters extracted from a computer vision-based posture analysis system (CVPAS). Overall, 140 participants were screened using the CVPAS and enrolled. The main data analyzed were shoulder height difference (SHD), wrist height difference (WHD), and pelvic height difference (PHD) extracted using the CVPAS. Standing X-ray imaging and radiographic assessments were performed. Predictive modeling was implemented with XGBoost, random forest regressor, and logistic regression, using XAI techniques for global and local feature analyses. Correlation analysis was performed between radiographic assessment and AI evaluation for PHD, SHD, and Cobb angle. In predictive modeling, the main global features affecting scoliosis, in order of importance, were PHD (0.18) and ankle height difference (0.06). The outstanding local features were PHD, WHD, and knee height difference (KHD), which predominantly contributed to the increase in the probability of scoliosis, and the prediction probability of scoliosis was 94%. When the PHD was >3 mm, the probability of scoliosis increased sharply to 85.3%. The paired t-test for AI and radiographic assessments showed that the SHD, Cobb angle, and scoliosis probability were significant (p < 0.05). Feature importance analysis applying XAI to postural deformity parameters extracted from a CVPAS is a useful clinical decision support system for the early detection of posture deformities. PHD was a major parameter in both global and local analyses, and 3 mm was a threshold above which the predicted probability of postural deformation increased significantly in the local interpretation of each participant, which leads to the prediction of participant-specific scoliosis.


Introduction
Normal spinal posture is essential for maintaining spine health, biomechanical function, and longevity. However, physiological spinal curvature changes through natural aging or pathological processes with various causes [1]. Early functional changes gradually accelerate, leading to irreversible structural spinal deformation. Functional or structural spinal deformation affects spinal posture balance, causing uneven heights of the shoulders and pelvis, and shifts the axis of normal weight distribution, causing deformation that protrudes unilaterally in the coronal plane of the spine. This imbalance in weight distribution further increases the asymmetric height difference between the shoulders and pelvis, causing structural deformation and pain in the region of the spinal curve where pressure is concentrated [2,3]. While the changes are still functional, the normal physiological spinal curvature can be restored by removing or improving the factors that cause posture asymmetry. To maintain spinal health, it is therefore very important to detect posture imbalance before spinal deformation progresses irreversibly and results in structural spinal deformation. Specifically, early diagnosis of scoliosis should cover not only structural scoliosis but also functional posture changes, and the normal posture should be restored by mitigating the cause of spinal deformation.
Several parameters are related to scoliosis detection. These parameters can be extracted through radiographic assessment and screening. Generally, the Cobb angle, age, weight, spinal vertebral rotation angle, pelvic height difference (PHD), and shoulder height difference (SHD) have been used for scoliosis diagnosis and scoring [4-7]. The presence, degree, and severity of scoliosis can be determined by identifying and calculating characteristic features among these parameters.
The presence, degree, or severity of scoliosis is basically evaluated by measuring the curvature of the spine through X-ray imaging [8]. However, other studies have evaluated structural deformity by methods other than the direct use of X-ray imaging [9,10]. Various predictive investigations of scoliosis have been reported in recent artificial intelligence (AI) research [11-14]. Alharbi et al. studied scoliosis prediction by using deep learning to calculate the Cobb angle (average error of 2.9°) from 800 X-ray images, replacing manual Cobb angle measurement through radiographic assessment [11]. Yang et al. conducted deep learning research on a screening system using non-ionizing radiation. They developed a machine learning algorithm that can identify cases with a curve ≥20° using 3640 unclothed back images [14].
Adolescent idiopathic scoliosis (AIS) is the most common spinal disease in adolescents, with a worldwide prevalence of 0.5-5.2% [2,15,16]. The aforementioned studies used artificial intelligence to develop high levels of prediction accuracy and performance but could not explain the inference process of the predictive models. Thus, it is necessary to apply the explainable artificial intelligence (XAI) technique that can explain the inference of the results like a white box given the black box characteristics of the machine learning model [17].
Spinal deformity diagnosis requires X-ray imaging to identify or monitor the disorder. During this process, the patients are exposed to radiation [8]. However, mild scoliosis can be observed without the use of regular radiation, and moderate-to-severe scoliosis can be detected at an early stage, providing an opportunity for treatment and posture correction using the detection system of postural deformity utilizing non-ionizing radiation. In this process, the aforementioned method using artificial intelligence can be applied. Various artificial intelligence studies have been published to quantitatively detect spinal deformities or scoliosis using only the radiographic images, but this study is characterized as an artificial intelligence study using feature importance analysis based on parameters extracted from a computer vision-based posture analysis system (Table 1). In this study, postural deformity as a result of prediction probability was derived for the early detection of suspected scoliosis through artificial intelligence models based on skeleton point information extracted from the computer vision-based posture analysis system (CVPAS) using non-ionizing radiation. Moreover, contributions of prognostic factors to the results were analyzed by the inference process for the prediction outcomes.

Materials and Methods
In this section, the acquisition of the dataset, the artificial intelligence models used, the explainers, and the evaluation method are described together with a research diagram.

Data Acquisition and Participant Characteristics
First, 140 participants who visited the institutions for scoliosis examination from January 2020 to January 2021 were included. In these participants, the following difference metrics in upper and lower body skeletal points were measured: shoulder height difference, elbow height difference (EHD), wrist height difference (WHD), pelvic height difference, knee height difference (KHD), and ankle height difference (AHD). A computer vision-based posture analysis system (PA3017; Driom, Incheon, South Korea) was used, and scoliosis diagnostic analysis results were obtained (Figure 1). The Cobb angle, shoulder height difference, and pelvic height difference were radiographically measured in standing body X-ray images.

Predictive Models and Model Explainers
Second, three machine learning models with two model explainers for predictive interpretation were used in this study. Extreme gradient boosting (XGBoost) is a decision tree-based ensemble machine learning algorithm using a gradient boosting framework. Gradient descent is used as the boosting method in the ensemble model [18]. Herein, a tree-based decision branch was used in the schematization. Random forest regressor is an ensemble method that learns multiple decision trees. Random forest can rank the importance of parameters (shoulder height difference, pelvic height difference, ankle height difference, etc.) for the predicted scoliosis outcome [19]. Logistic regression was used to create a predictive model for the outcome by expressing the relationship between the dependent variable and the independent variables as a function [20]. Briefly, it explains the dependent variable as a linear combination of independent variables: given input data, the result for the corresponding data is assigned to a specific class, that is, whether scoliosis will appear or not.
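As a minimal sketch of the logistic regression idea described above, the predicted probability can be written as the sigmoid of a linear combination of the input parameters. The coefficients, intercept, example participant, and 0.5 cut-off below are hypothetical illustration values and are not fitted to the study data.

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features: dict, weights: dict, intercept: float) -> float:
    """Probability of the positive class from a linear combination of features."""
    z = intercept + sum(weights[name] * value for name, value in features.items())
    return sigmoid(z)


# Hypothetical coefficients (per mm of height difference); NOT fitted to the study data.
WEIGHTS = {"PHD": 0.9, "SHD": 0.3, "AHD": 0.2}
INTERCEPT = -2.5

# Hypothetical participant (height differences in mm).
participant = {"PHD": 4.0, "SHD": 1.0, "AHD": 0.5}

p = predict_proba(participant, WEIGHTS, INTERCEPT)
label = "scoliosis suspected" if p >= 0.5 else "normal"  # 0.5 cut-off is an assumption
print(f"p = {p:.3f} -> {label}")
```

In practice the coefficients would be estimated from training data, for example with scikit-learn's `LogisticRegression`.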
Shapley additive explanations (SHAP) originate from the game-theoretic method of calculating the contribution of each player to the outcome and can explain the contribution of the prognostic factors that affect the prediction result of a machine learning model [21]. Local interpretable model-agnostic explanations (LIME) is a method that enables local interpretation of the factors that contribute to the results of a machine learning model. By calculating the contribution of individual parameters in the local data space, it provides an interpretation with improved accuracy for each local case [22].
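The game-theoretic idea behind SHAP can be illustrated with an exact Shapley value computation on a toy example. This is only a sketch of the underlying definition (the average marginal contribution of each feature over all orderings), not the algorithm implemented in the SHAP library; the `toy_value` function and its numbers are hypothetical.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    contrib = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            contrib[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in contrib.items()}


def toy_value(coalition):
    """Hypothetical model output when only the features in `coalition` are known."""
    score = 0.10  # baseline output with no features known
    if "PHD" in coalition:
        score += 0.50
    if "AHD" in coalition:
        score += 0.10
    if "PHD" in coalition and "KHD" in coalition:
        score += 0.20  # interaction, shared equally between PHD and KHD
    return score


features = ["PHD", "AHD", "KHD"]
phi = shapley_values(features, toy_value)
for name in features:
    print(f"{name}: {phi[name]:.2f}")
```

The contributions sum to the difference between the output with all features known and the baseline, which is the additivity property that gives the method its name.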

Research Process for Predictive Modeling
Third, XGBoost classifier, random forest regressor, and logistic regression were used for scoliosis predictive modeling (Figure 2). A 5-fold cross-validation method was used to maximize the use of data available for training and model testing. Relationships between features were evaluated using mutual information (MI) metrics. Shapley additive explanations and local interpretable model-agnostic explanations explainers were also used so that the otherwise black-box AI models could be explained. Shapley additive explanations enable global interpretation of the parameters for predicting scoliosis, and local interpretable model-agnostic explanations allow local interpretation for individual participants. Consequently, model visualization, feature analysis for parameters, and scoliosis prediction probability were analyzed. A paired t-test was also performed between the parameters obtained through radiographic assessment (Cobb angle, pelvic height difference, and shoulder height difference) and those obtained using the CVPAS (pelvic height difference, shoulder height difference, and scoliosis outcomes), with significance set at a p-value < 0.05. Python 3.8.3, scikit-learn 0.23.1 for predictive modeling, SHAP 0.36.0, and LIME 0.
Figure 2. Research diagram for scoliosis predictive modeling. Global and local feature analyses were performed through the predictive modeling process using the explainable artificial intelligence technique. Radiographic assessment results and major parameters acquired from computer vision-based posture analysis systems were evaluated, and correlation analysis was also performed.
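The 5-fold cross-validation mentioned in this section can be sketched as an index split in which every participant is used for testing exactly once. Shuffling with a fixed seed and the helper name `k_fold_indices` are assumptions for illustration; the study does not state whether the folds were shuffled or stratified.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists; each sample appears in exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed: assumption, for reproducibility
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 140 participants, as in the study; everything else here is illustrative.
splits = list(k_fold_indices(140, k=5))
for fold_no, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_no}: train={len(train)} test={len(test)}")
```

With scikit-learn, the same kind of split is available through `sklearn.model_selection.KFold`.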

Results
In this section, the results of predictive modeling performance, feature analysis for the parameters, and global and local interpretation are presented.

Radiographic Assessment and Statistical Analysis
First, the results of participant characteristics analysis are shown in Table 2. For the scoliosis curve, thoracic, thoracolumbar, and lumbar types were combined. A Cobb angle of <10° was classified as the normal type. The radiographic assessment results of the participants were analyzed to obtain the Cobb angle (mean 6.16° ± 8.50°), SHD (mean 1.12 ± 3.27 mm), and PHD (mean 2.89 ± 4.22 mm). The values of the parameters for predictive modeling are continuous variables. Moreover, the results for these three variables obtained through radiographic assessment and through the posture analysis system, together with the paired t-test results, are shown in Figure 3. Significant parameters were SHD, Cobb angle, and scoliosis (p < 0.001).
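For readers unfamiliar with the paired t-test used here, the statistic can be sketched as the mean of the per-participant differences divided by its standard error. The two measurement lists below are hypothetical values, not study data, and the critical value applies only to this five-pair example.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("paired samples must have equal length of at least 2")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical paired measurements in mm (radiographic vs. CVPAS); NOT study data.
radiographic = [1.0, 2.0, 3.0, 4.0, 5.0]
cvpas = [0.0, 2.0, 1.0, 4.0, 3.0]

t_stat, dof = paired_t_statistic(radiographic, cvpas)
T_CRIT_DF4 = 2.776  # two-sided 5% critical value of Student's t for 4 degrees of freedom
significant = abs(t_stat) > T_CRIT_DF4
print(f"t = {t_stat:.3f}, df = {dof}, significant at 0.05: {significant}")
```

A p-value would be obtained from the t distribution with n - 1 degrees of freedom, for example with `scipy.stats.ttest_rel`.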

Predictive Modeling Performance
Second, the performance of each machine learning model used for predictive modeling is shown in Table 3. The maximum scores obtained on the training and test splits of the k-fold cross-validation are reported. The mean accuracy, sensitivity, specificity, and area under the curve (AUC) of the three models were 0.79 ± 0.00, 0.78 ± 0.05, 0.80 ± 0.02, and 0.77 ± 0.11, respectively.
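The four performance measures reported above can be sketched from their definitions. The labels and scores below are hypothetical, the 0.5 decision threshold is an assumption, and the AUC is computed as the fraction of positive-negative pairs that are ranked correctly.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall of positives), and specificity (recall of negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true, scores):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = scoliosis) and predicted probabilities; NOT study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)
print(metrics)
```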

Model Visualization for the Predictive Model
Third, the results of the visual analysis of the scoliosis predictive model using the aforementioned parameters are presented in Figure 4. The model architecture was visualized by charting the decision-making process of XGBoost, which can help users better understand the flow of judgment inside the model. The visualization is important not only to describe the performance of the model but also to understand the thresholds of each feature and the output values at each node. In detail, the XGBoost classifier considers ankle height difference when PHD is greater than the threshold value of 1.5 mm and derives the scoliosis probability based on a PHD value of 2.5 mm (the rightmost outer branch and leaf). The figure also indicates that the tree was constructed using PHD, SHD, ankle height difference, weight, height, age, and their respective threshold values for scoliosis prediction.

Feature Analysis for the Parameters and Global Interpretation
Fourth, to identify the relevant features of the model in the dataset, it is necessary to remove less important features that do not significantly contribute to the occurrence of spinal deformity. Key feature selection is the process of selecting the features that contribute the most to the output. Feature importance by Shapley additive explanations gives a score for each feature in the data, and the higher the score, the more important or relevant the feature is to spinal deformity. PHD, ankle height difference, and elbow height difference were identified as factors affecting scoliosis (in this order) in predictive modeling (Figure 5A). Red and blue bars indicate positive and negative effects, respectively. That is, the greater the PHD, the greater the probability of scoliosis diagnosis, and the younger the age, the greater the predictive probability of scoliosis. However, this global importance cannot be applied to each participant. In a specific participant (participant #15) with a predictive probability of scoliosis of 90%, the contributions were ranked in the order of PHD, age, knee height difference, and wrist height difference. That is, the order of factors for this participant (Figure 5B) differs from that in Figure 5A.

Scoliosis Prediction and Local Interpretation
Fifth, scoliosis prediction results and feature importance for each participant are shown in Figure 6. Participant #120 (a 25-year-old woman) was diagnosed with lumbar scoliosis with a Cobb angle of 11° on radiographic assessment. For this participant, the predictive probability of scoliosis was 94%, and large values of PHD, wrist height difference, and knee height difference (green bars: positive effect) were identified as important factors that predominantly increased the probability of scoliosis (Figure 6A). Conversely, in a 20-year-old woman with a Cobb angle of 0° and a healthy spine curve, the predictive probability of scoliosis was 9%, and low values of PHD and wrist height difference (red bars: negative effect) were identified as important factors that contributed to lowering the probability of scoliosis (Figure 6B).
In addition, the distribution of predictions was analyzed using detailed numerical values of PHD, which was one of the dominant parameters affecting scoliosis (Figure 7). If the PHD is <2 mm, the prediction probability is small at 24.6%. However, if the PHD is >3 mm, the probability of diagnosing scoliosis sharply increases to 85.3%.
Finally, mutual information (MI) metrics were evaluated in Table 4. MI is a measure of the statistical dependence between two variables in the same dataset. In Table 4, PHD (0.22) and knee height difference (0.09) are the most dependent factors, whereas wrist height difference (0.00) and ankle height difference (0.00) are independent factors. PHD and knee height difference are both related to the lower extremities. However, age and weight are less dependent. All values relate each factor to the spinal deformity output in the same dataset.
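To make the MI values easier to interpret, the following sketch computes mutual information for discrete variables directly from its definition: MI is zero for independent variables and grows with dependence. The estimator used in the study is not specified, so this discrete version and the binarized example data are assumptions for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences of equal length."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        mi += p_joint * math.log(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi


# Hypothetical binarized data; NOT study data.
scoliosis = [0, 0, 1, 1]   # outcome label
phd_large = [0, 0, 1, 1]   # flag that moves together with the outcome
unrelated = [0, 1, 0, 1]   # flag that carries no information about the outcome

mi_dependent = mutual_information(phd_large, scoliosis)
mi_independent = mutual_information(unrelated, scoliosis)
print(f"dependent feature: {mi_dependent:.3f} nats")
print(f"independent feature: {mi_independent:.3f} nats")
```

For continuous features such as height differences in mm, a dedicated estimator such as scikit-learn's `mutual_info_classif` would normally be used instead of simple counting.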

Discussion
The clinical decision support system (CDSS) is leading the paradigm shift in modern medical diagnosis and treatment [23]. Recently, in various studies for screening scoliosis, machine learning or deep learning methods have been implemented in radiographic image analysis to classify and predict an early diagnosis of changes in the spinal curvature [13,14]. Tajdari et al. reported that CDSS applying mechanical neural network modeling to the spinal model of each patient was very useful for the early detection of AIS [13,14].
With further improvements in CDSS development and utilization, more studies are being conducted for the development and application of parameters such as age, height, weight, gender, gait features, electromyography, and mechanical characteristics of the spine for screening prediction [10,13,24,25].
In this study, postural parameters for scoliosis detection such as PHD, SHD, and WHD were evaluated using an AI prediction algorithm combined with CVPAS and CDSS, and the AI model predicted the scoliosis of each participant and analyzed features related to scoliosis.

Scoliosis Screening System Combined with AI
This study infers the causal relationship between the results of predictive modeling and the parameters. In this respect, it differs from previous AI studies [11,14]: here, AI was combined with physical information provided by a CVPAS, namely participant-specific skeletal points and their differences (Table 5). However, if three-dimensional (3D) image information can be extracted in addition to the Cobb angle measurement and the SHD and PHD obtained from the scoliosis screening system, it will be possible to further improve both geometrical and analytic accuracy. For example, Pasha et al. used 3D spinal alignment (vertebral positions and rotations) data, and Tajdari et al. used participant-specific spinal bone geometry obtained by surface registration [12,13]. When 3D images are used as the input dataset, AI models that process tabular data should be replaced with neural network-based models. In this study, we analyzed postural features such as PHD, SHD, and WHD using the explainable artificial intelligence algorithm and provided explainability for the prediction of each participant's spinal deformity probability using the SHAP and local interpretable model-agnostic explanations methods (Table 5).

Use of Explainable Artificial Intelligence for In-Depth Predictive Modeling
AI models have the limitation of being black boxes, as they do not reveal information on their internal processes and mechanisms [17]. Thus, the interpretability of AI models was introduced [17]. However, there is a trade-off between model interpretability and model accuracy [26]. For example, rule-based learning and logistic regression have lower model accuracy than deep learning but higher model interpretability [26]. Meanwhile, the accuracy of the prediction result may vary depending on the bias of the input dataset because the output is predicted by a model that has been trained according to the characteristics of the input dataset [26]. The accuracy presented in this study was obtained using logistic regression and random forest, as shown in Table 3, and is lower than that of deep learning models, which show higher accuracy. However, we used models with better explanatory capabilities to evaluate the interpretation ability of a CDSS combining CVPAS and AI [26]. Ultimately, developing an AI algorithm through predictive modeling requires an optimal model-specific analysis that can improve both model interpretability and accuracy. Therefore, subsequent studies need an optimized modeling approach adapted to improve both.

Global vs. Local Interpretation for the Parameters
In Figure 5, global feature importance was analyzed using SHAP. Briefly, this approach generalizes over all participants and ranks the parameters by a representative importance value. However, it analyzes the average effect across all participants, and the results do not necessarily apply to each individual participant. To overcome this limitation, the local interpretable model-agnostic explanations (LIME) analysis method was used (Figure 6). LIME enables local interpretation for individual participants and thus the analysis of participant-specific parameters. Ultimately, the predictive probabilities of AI models should be more accurate in each case than their average predictive accuracy. Meanwhile, the use of LIME raises a stability issue: when it is applied repeatedly under the same conditions, it may return different results. We did not conduct further studies to improve the stability of LIME. Note that complementary indices such as the variables stability index (VSI) and coefficients stability index (CSI) can be used to evaluate the stability of the explanations and thereby improve the performance of the prediction system [27].
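The stability issue can be illustrated with a simplified, one-feature analogue of a local surrogate explanation: random perturbations are drawn around one participant, weighted by proximity, and a weighted linear fit gives the local slope. This is a sketch of the general idea and not the LIME library itself; the `black_box` function, sample size, and kernel width are hypothetical.

```python
import math
import random


def black_box(phd: float) -> float:
    """Hypothetical stand-in for a trained model: probability rises around 3 mm."""
    return 1.0 / (1.0 + math.exp(-3.0 * (phd - 3.0)))


def local_slope(x0, seed, n_samples=50, sigma=1.0, kernel_width=0.75):
    """Slope of a proximity-weighted linear fit to the black box around x0."""
    rng = random.Random(seed)
    xs = [x0 + rng.gauss(0.0, sigma) for _ in range(n_samples)]
    ys = [black_box(x) for x in xs]
    ws = [math.exp(-((x - x0) ** 2) / kernel_width ** 2) for x in xs]
    total_w = sum(ws)
    mean_x = sum(w * x for w, x in zip(ws, xs)) / total_w
    mean_y = sum(w * y for w, y in zip(ws, ys)) / total_w
    cov = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    return cov / var


# Same participant (PHD = 3.0 mm) explained five times with different random seeds.
slopes = [local_slope(3.0, seed) for seed in range(5)]
print([round(s, 3) for s in slopes])
print("spread across runs:", round(max(slopes) - min(slopes), 3))
```

Fixing the random seed makes a single run reproducible, but the spread across seeds is what stability indices such as VSI and CSI are intended to quantify.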

Limitations of Dataset
How much data is needed to maximize the accuracy of AI models? This has been a critical subject of much debate. Although we used data from 140 patients in this study, we intuitively know that the prediction accuracy of the model can be improved by using a larger dataset with more variation [28]. One of the reasons for the difficulty in using AI applications based on predictive modeling in clinics is the lack of data [29]. The distribution of the Cobb angle used to predict scoliosis in the study participants was 6.16° ± 8.50° (Table 2). Therefore, the dataset cannot be used as a training dataset to analyze mild and severe scoliosis (Cobb angle > 40°), and the results of this study can be applied to normal and mild scoliosis predictive modeling. In addition, this study used a clinical tabular dataset to reveal the correlation between scoliosis-related parameters by extracting participant characteristics through the CVPAS. Therefore, it was difficult to apply data augmentation by image rotation and transformation, as adopted in deep learning applications using images [30]. However, data augmentation related to the parameters can be considered in the next study. The use of cross-validation to improve accuracy with an insufficient dataset is one example of model optimization.

Conclusions
Analyzing feature importance by applying explainable artificial intelligence (XAI) to postural deformity parameters extracted from a computer vision-based posture analysis system (CVPAS) is a useful prediction method for the early detection of postural deformities.
The pelvic height difference (PHD) was found to be the most influential parameter in both global and local analyses, and 3 mm was the threshold above which the predicted probability of postural deformation increased significantly in the local interpretation of each participant, which leads to the prediction of participant-specific scoliosis. Complementary indices such as the variables stability index (VSI) and coefficients stability index (CSI) can also be evaluated in a further study to improve the stability of explainer models.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/app12020925/s1, the code for this study is attached.