Evaluation of Feature Selection Methods on Psychosocial Education Data Using Additive Ratio Assessment

Abstract: Artificial intelligence, particularly machine learning, is the fastest-growing research trend in educational fields. Machine learning shows impressive performance in many prediction models, including psychosocial education. The capability of machine learning to discover hidden patterns in large datasets encourages researchers to collect data with high-dimensional features. However, not all features are needed by machine learning, and in many cases high-dimensional features decrease its performance. Feature selection is an appropriate approach to reducing the features so that machine learning works efficiently. Various selection methods have been proposed, but research to determine the essential feature subset in psychosocial education has not been established thus far. This research investigated and proposed a method to determine the best feature selection method in the domain of psychosocial education. We used a multi-criteria decision-making (MCDM) approach with Additive Ratio Assessment (ARAS) to rank seven feature selection methods. The proposed model evaluated the best feature selection method using nine criteria drawn from the performance metrics produced by machine learning. The experimental results showed that ARAS is promising for evaluating and recommending the best feature selection method for psychosocial education data using the teacher's psychosocial risk levels dataset.


Introduction
Psychosocial education is multidisciplinary and covers a vast field of study. Therefore, it is not surprising that research in psychosocial education encompasses an abundance of environments and features that are logically expected to be linked to the problem-solving of educational quality improvements. Research from various perspectives, such as personal environment [1], family [2], nutrition [3], and physical activities [4], has been conducted to get an overview of the various psychosocial relationships in education. Accordingly, research linked to psychosocial education is categorized as one of the most active in education. Indeed, the search using the keyword "psychosocial education" in Google Scholar shows 212,000 results of research published between 2017 and 2021.
On the other hand, the success of artificial intelligence and big data influences decision-making perspectives, particularly those based on predictive problems. Big data techniques can effectively handle larger volumes, more complex varieties, and higher-dimensional data [5]. Meanwhile, artificial intelligence, especially machine learning, significantly improves the quality of decision models [6,7]. These two factors encourage researchers to collect more data with massive numbers of features.
Theoretically, the more data that are collected, the more information is obtained, and the more information obtained, the better the resulting prediction. However, the increase in the number of variables and the volume of data leads to data sparsity, especially if the data quality is poor. The increase in sparsity makes it much more difficult to find data representative of the population, which in turn makes it challenging for machine learning to generalize to the domain problem. A vague generalization causes machine learning to lose its ability to adapt to new problems [8,9].
Instead of thrusting all features into machine learning, performing input feature optimization is often more efficient and effective. Feature selection can eliminate all features that are irrelevant to the prediction target. Various feature selection methods have been proposed and proven to impact machine learning performance. With many feature selection methodologies and different approaches in each method, it is natural to ask which method gives the most optimal and effective results in machine learning, especially for the psychosocial education problem.
Hence, this paper proposed a methodology to evaluate the best feature selection method in the domain of psychosocial education. The evaluation was performed using a decision model approach that utilized multi-criteria decision making (MCDM). Furthermore, additive ratio assessment (ARAS) was adopted to evaluate and rank the best feature selection method. The evaluation and ranking used the metrics from the machine learning classification performance on the teacher's psychosocial risk level dataset.

Related Work
Feature selection is one of the critical stages in machine learning modeling, and the relevant feature has implications for better stability, robustness, and generalization of machine learning [10]. The feature selection method can be divided into three approaches [11][12][13]: filtering, wrapper, and embedded method.
Moorthy and Gandhi [14] previously conducted research using the filtering method, optimizing medical data with feature selection techniques for classification problems. They combined analysis of variance (ANOVA) and whale optimization (WO), which gave better results for the SVM and k-NN classifiers than models without ANOVA-WO. Ding and Li [15] conducted a similar study identifying mitochondrial proteins in malaria, combining ANOVA and incremental feature selection (IFS) to find the most optimal features; the proposed model achieved 97.1% accuracy compared to 92.0% for the comparison model. Next, Utama [16] performed feature selection using a mutual information (MI) model to predict airline tweet sentiment, and the feature selection contributed to the classifier's improvement.
Similarly, the wrapper method also gives promising results. Richhariya et al. [17] proposed a Universum support vector machine based on the recursive feature elimination (USVM-RFE) method to diagnose Alzheimer's. Feature selection was performed on the MRI data of brain tissue, and the classification using USVM-RFE showed better results than the one using SVR-RFE. The implementation of RFE was also done in the study [18], where RFE-SVM was used to determine the best feature among the various heart rate variability (HRV) data. The study showed that RFE-SVM could identify the HRV feature and detect the stress level better.
The approach using the embedded method has been widely used. Liu et al. [19] implemented feature selection using the embedded method. The implementation was performed during a cyberattack on the Internet of Things (IoT) data. The accuracy of the proposed method was relatively comparable to that of the comparison model. However, it was better in training speed, 1000 times faster than the overall features model. The implementation of the embedded method as the feature selection was also conducted by Loscalzo et al. [20]. Feature selection was used to remove unneeded input in robotic sensors. The paper showed that the embedding methodology significantly reduces unimportant sensors. Lastly, Liu et al. [21] compared embedding methodology to the others, such as Chi-Square, F-Statistic, and Gini Index. The experiment showed that the weighted Gini Index (WGI) method was better than the other methodologies on the data with limited features.
Given the importance of matching the feature selection method to the data characteristics of the domain problem, selecting the best feature selection method is quite challenging. There are various techniques for selecting feature selection methods, one of which is the decision system model approach. Kou et al. [22] conducted a study to select the best feature subset for a text classification case. The study compared several MCDM models, such as TOPSIS, GRA, WSM, VIKOR, and PROMETHEE. The results showed that PROMETHEE was a better evaluation model for the text-based classification case than the other models. Hasemi et al. [23] proposed the EFS-MCDM method to determine the best features in a computer network dataset. The feature ranking in EFS-MCDM delivered more optimal and efficient results in accuracy, f-score, and algorithm run-time compared to other methods. Similarly, Singh [24] implemented TOPSIS to select features in a network traffic dataset. The research concluded that the classification model with the TOPSIS-based feature subset had the same accuracy yet much lower computation time.
Despite all the studies conducted on selecting existing feature selection methods so far, to the best of the authors' knowledge, there has been no study comparing and evaluating the best feature selection method to be implemented in psychosocial education. Previous studies on psychosocial education only implemented machine learning without extensive analysis of the used features.

Research Contribution
Based on the knowledge gaps identified in previous studies, this paper advances the body of knowledge on feature selection methods with two primary contributions:

1. This paper provides a systematic model for determining the best feature selection method using an adapted additive ratio assessment model [24]. Specifically, the selection of the feature selection method is implemented on a psychosocial education dataset.

2. This paper offers a comprehensive study and evaluation comparing the performance of machine learning under every feature selection method. ARAS uses the performance metrics from machine learning as the criteria for determining the best feature selection method.

Artificial Intelligence Research on Psychosocial Education
Nowadays, research in the education field focuses not only on academic aspects, such as academic achievement, graduation level, academic grading, and teaching methods, but also on non-academic aspects, such as community relationships [25] and psychosocial factors. These non-academic aspects also influence the quality of education [26][27][28].
On the other hand, the flourishing of research in artificial intelligence has made an impressive contribution to the psychosocial education field. Numerous artificial intelligence-based studies have successfully revealed psychosocial phenomena that influence educational development. In a study conducted by Navarro [29], artificial intelligence was successfully used to predict the link between environmental conditions and educators' stress levels. The research collected, through interviews, 4890 data points with 118 features used to predict the educators' stress levels. The extensive amount of data and high-dimensional features in that study indicate that psychosocial research is essential and worth carrying out.

Feature Selection Methods
Real-world problems are often represented by extensive data collections with high-dimensional features. Occasionally, existing features may not directly relate to the target problems that need to be solved [30,31]. Under such circumstances, feature selection becomes critical: selecting the right features makes it possible to improve model performance and the efficiency of the computation process [32,33]. Three approaches are available to select features. The first approach, the filtering method, selects a subset of features based on the characteristics of the features themselves; the best features are obtained from a statistical analysis of each feature against the other features or the target data. Next, the wrapper method uses machine learning to select the best data subsets, reconstructing the feature subset and testing it using statistical modeling. The third approach, the embedded method, follows a similar principle to the wrapper method but evaluates the feature subset by analyzing the performance of machine learning during training.
Next, this section briefly describes the seven feature selection methods evaluated in this paper. There are three filtering methods (analysis of variance, mutual information, and chi-square), one wrapper method (the exhaustive search feature), and three embedded methods (embedding random forest, Lasso, and recursive feature elimination). These methods are compared against a baseline machine learning model that uses all features.

ANOVA
ANOVA is a statistical analysis used to calculate the variance between two or more groups [34]. ANOVA uses the F-ratio to measure the magnitude of the relationship between every feature and the target class; features whose magnitude exceeds the F-ratio threshold are retained, and the others are discarded. In an ANOVA with k classes, the variance among classes is defined as follows [35]:

$$ s_B^2 = \frac{\sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2}{k - 1} $$

where $n_i$ is the number of samples in the i-th class, $\bar{x}_i$ is the mean of the i-th class, and $\bar{x}$ is the mean over all classes. The within-class variance is defined as follows:

$$ s_W^2 = \frac{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2}{N - k} $$

where $N$ is the total number of samples. The F-ratio is then calculated as the ratio of the two variances:

$$ F = \frac{s_B^2}{s_W^2} $$
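As an illustration, the ANOVA filter can be applied with scikit-learn's `SelectKBest` and `f_classif`. This is a minimal sketch on synthetic data; the dataset and the choice of k = 5 are purely illustrative, not the paper's configuration:

```python
# ANOVA F-test filter: keep the k features with the largest F-ratios
# between the feature values and the class labels.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in dataset: 200 samples, 20 features, 5 informative.
X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=0)

selector = SelectKBest(score_func=f_classif, k=5)
X_selected = selector.fit_transform(X, y)

print(X_selected.shape)  # (200, 5): only the top-5 F-ratio features remain
```

`selector.scores_` exposes the per-feature F-ratios, so the same object can also be used to inspect how strongly each feature separates the classes.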

Chi-Square
Chi-square is a statistical method widely used to calculate the correlation between two variables [36][37][38]. Chi-square can be used to select feature subsets in machine learning by calculating the dependency level of each feature on the target data [39,40]. If $n$ is the observed frequency and $\mu$ is the expected frequency, then the Chi-square statistic ($\chi^2$) for a feature f over the classes C is defined as follows:

$$ \chi^2(f) = \sum_{c \in C} \frac{(n_{fc} - \mu_{fc})^2}{\mu_{fc}} $$

where $n_{fc}$ and $\mu_{fc}$ are the observed and expected frequencies of feature f in class c. Features with higher $\chi^2$ values are more dependent on the target class and are retained.
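A comparable sketch for the Chi-square filter, using scikit-learn's `chi2`, which requires non-negative feature values (the toy data and feature count below are hypothetical):

```python
# Chi-square filter: score each non-negative feature against the class
# labels; higher score means stronger dependence on the target.
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(0)
X = rng.integers(0, 10, size=(300, 8)).astype(float)  # chi2 needs X >= 0
y = (X[:, 0] + X[:, 1] > 9).astype(int)               # class depends on features 0 and 1

scores, p_values = chi2(X, y)                 # one score and p-value per feature
selector = SelectKBest(score_func=chi2, k=2).fit(X, y)
print(scores.round(1), selector.get_support())
```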

Mutual Information (MI)
MI is used to calculate the distance between the distributions of random vectors [41,42]. Mutual information measures the divergence between the joint probability distribution P(X, Y) and the product of the marginal distributions P(X)P(Y) [43]. The mutual information between two random vectors X and Y is defined as follows:

$$ I(X; Y) = \sum_{x \in X} \sum_{y \in Y} P(x, y) \log \frac{P(x, y)}{P(x)P(y)} $$

In feature selection problems, mutual information is used to calculate how significant the contribution of a feature is towards the prediction of the target class [44][45][46]. The mutual information for a feature set S with m features that has the largest dependency on the target class C is defined as follows:

$$ \max D(S, C), \quad D = I(\{x_i, i = 1, \dots, m\}; C) $$

Exhaustive Search Feature (EFS)

In the EFS method, performance is obtained by evaluating all possible combinations of the existing features, and the feature subset with the highest performance is selected [47,48]. EFS works by finding the value of Validity(P, S), assessing each candidate feature subset S as a whole solution to a problem P; the result is obtained from Output(P, S), in which the entire subset S is suitable for the problem P. The EFS method uses a brute-force approach to find the best possible feature subset, and due to its exhaustive nature, EFS usually requires large amounts of computational resources.
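The brute-force evaluation that EFS performs can be sketched as a small cross-validated search over every feature combination. This is only feasible here because the toy problem has six features; all names and sizes are illustrative:

```python
# Exhaustive search over all non-empty feature subsets: evaluate every
# combination with cross-validation and keep the best-scoring subset.
from itertools import combinations
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=150, n_features=6,
                           n_informative=3, random_state=1)

best_score, best_subset = -1.0, None
for r in range(1, X.shape[1] + 1):
    for subset in combinations(range(X.shape[1]), r):
        score = cross_val_score(DecisionTreeClassifier(random_state=1),
                                X[:, subset], y, cv=3).mean()
        if score > best_score:
            best_score, best_subset = score, subset

print(best_subset, round(best_score, 3))
```

With n features there are 2^n − 1 candidate subsets, which is why EFS becomes intractable quickly as the feature count grows.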

Embedding Random Forest (ERF)
ERF is an ensemble method that reconstructs the average output of the individual trees [49]. A recursive approach is needed to find the best value from the feature subset during the elimination process, especially for highly correlated features [50]. The high correlation can be evaluated using the mean-decrease-in-impurity approach. The Gini Index is one of the most popular impurity measures, and it is defined as follows:

$$ \mathrm{Gini}(t) = 1 - \sum_{c=1}^{C} p(c \mid t)^2 $$

where $p(c \mid t)$ is the proportion of samples of class c at node t.

Lasso

The least absolute shrinkage and selection operator (Lasso) is one of the shrinkage techniques. Lasso selects variables by minimizing the sum of squared errors subject to a penalty regularization [49,51]. Regression coefficients are shrunk towards zero as the value of the parameter lambda ($\lambda$), which controls the amount of shrinkage, increases [52]. Lasso is defined as follows:

$$ \hat{\beta} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right\} $$

Recursive Feature Elimination (RFE)

RFE is a feature selection method that works iteratively to rank the importance of features [50]. To minimize computational resources, some approaches eliminate features not one by one but by subsets of features [53]. In each iteration, an analysis and elimination process is performed on the feature subsets with low relevance values. The two components of RFE are the number of features to keep and the algorithm used to analyze the performance of the feature subsets. Generally, the RFE iteration proceeds as follows [54]: (1) train a classifier on the current feature subset; (2) compute the importance ranking of each feature; (3) remove the features with the lowest significance.
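The three embedded approaches above can be sketched with scikit-learn's built-in selectors; the estimator choices, `alpha`, and thresholds below are illustrative assumptions, not the paper's exact configuration:

```python
# Embedded selection sketches: random-forest importances (ERF-style),
# L1-penalised Lasso coefficients, and recursive feature elimination (RFE).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectFromModel
from sklearn.linear_model import Lasso

X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=0)

# ERF: keep features whose mean-decrease-in-impurity importance exceeds the mean.
erf = SelectFromModel(RandomForestClassifier(n_estimators=100, random_state=0)).fit(X, y)

# Lasso: keep features whose coefficients survive the L1 shrinkage.
lasso = SelectFromModel(Lasso(alpha=0.01)).fit(X, y)

# RFE: iteratively drop the weakest features until 5 remain.
rfe = RFE(RandomForestClassifier(n_estimators=50, random_state=0),
          n_features_to_select=5).fit(X, y)

print(erf.get_support().sum(), lasso.get_support().sum(), rfe.get_support().sum())
```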

ARAS: Decision System Approach for the Feature Evaluation Method
Additive ratio assessment (ARAS) is one of the MCDM modeling techniques. ARAS is a method that relies on the intuitive principle that the best solution must have the largest ratio. Ranking using the ARAS method is performed by comparing the value of each criterion on each alternative by looking at its weight to obtain the ideal alternative [55,56].
The ARAS method utilizes a utility function value that determines the relative efficiency of feasible alternatives and is directly proportional to the values and weights of the main criteria considered in determining the best alternative. ARAS is based on the argument that complex problems can be understood simply through relative comparisons: the ratio of the sum of normalized and weighted criteria values describes each feasible alternative, and comparing the utility function of each alternative with the optimal utility function value yields the alternative ranking [57].
Like the classical MCDM approach, ARAS focuses on ranking the alternatives against the criteria. Ranking with ARAS is done in several stages [55]. The first stage is forming the decision-making matrix, which consists of m + 1 alternatives (rows 0 to m, where row 0 is the optimal alternative) and n criteria (columns 1 to n). If i indexes the alternatives and j the criteria, the decision-making matrix is denoted as follows:

$$ X = \begin{bmatrix} x_{01} & \cdots & x_{0n} \\ \vdots & \ddots & \vdots \\ x_{m1} & \cdots & x_{mn} \end{bmatrix}, \quad i = 0, \dots, m; \; j = 1, \dots, n $$

The optimal value $x_{0j}$ of a criterion is the best value that can represent the performance on criterion j. In this paper, the optimal criterion value $x_{0j}$ is defined as follows:

$$ x_{0j} = \max_i x_{ij} \ \text{for benefit criteria}, \qquad x_{0j} = \min_i x_{ij} \ \text{for cost criteria} $$

The next stage is normalizing all the values $x_{ij}$ of the matrix X into the normalized decision-making matrix $\bar{X} = [\bar{x}_{ij}]$. Normalization of benefit criteria is done using the following formula:

$$ \bar{x}_{ij} = \frac{x_{ij}}{\sum_{i=0}^{m} x_{ij}} $$

Meanwhile, normalization of cost criteria is done using the following two-stage procedure:

$$ x_{ij}^{*} = \frac{1}{x_{ij}}, \qquad \bar{x}_{ij} = \frac{x_{ij}^{*}}{\sum_{i=0}^{m} x_{ij}^{*}} $$

The next step is defining the normalized-weighted matrix, starting with determining the weights $w_j$. The sum of the weights of all criteria is 1, and each weight $w_j$ is bounded as follows:

$$ \sum_{j=1}^{n} w_j = 1, \qquad 0 < w_j < 1 $$

After that, the normalized-weighted matrix $\hat{X} = [\hat{x}_{ij}]$ is calculated using the following formula:

$$ \hat{x}_{ij} = \bar{x}_{ij} \, w_j, \quad i = 0, \dots, m $$

where $w_j$ is the weight of criterion j and $\bar{x}_{ij}$ is the normalized value of criterion j for alternative i. The next step is to calculate the values of the optimality function:

$$ S_i = \sum_{j=1}^{n} \hat{x}_{ij}, \quad i = 0, \dots, m $$

The final step in the ARAS model is to determine the ranking of the alternatives. If $S_i$ and $S_0$ are the optimality function values of alternative i and of the optimal alternative, then the utility degree $K_i$ of alternative i follows the definition:

$$ K_i = \frac{S_i}{S_0}, \quad i = 0, \dots, m $$

The alternatives are then ranked in decreasing order of $K_i$.
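The ARAS stages above can be condensed into a short sketch. The alternatives, weights, and criteria values below are illustrative, not the paper's actual measurements:

```python
# Minimal ARAS ranking: alternatives are rows, criteria are columns;
# `benefit` flags benefit criteria (True) vs cost criteria (False).
import numpy as np

def aras_rank(X, weights, benefit):
    X = np.asarray(X, dtype=float)
    # Row 0 is the optimal alternative: max for benefit, min for cost criteria.
    x0 = np.where(benefit, X.max(axis=0), X.min(axis=0))
    D = np.vstack([x0, X])
    # Cost criteria use the two-stage normalisation via reciprocals.
    D = np.where(benefit, D, 1.0 / D)
    D = D / D.sum(axis=0)            # column-wise normalisation
    S = (D * weights).sum(axis=1)    # optimality function per alternative
    return S[1:] / S[0]              # utility degree K relative to the optimum

# Two toy criteria: accuracy (benefit) and train time in seconds (cost).
perf = [[0.90, 10.0],   # method A
        [0.85,  2.0],   # method B
        [0.88,  5.0]]   # method C
K = aras_rank(perf, weights=np.array([0.6, 0.4]), benefit=np.array([True, False]))
print(np.argsort(-K))   # ranking of the alternatives, best first
```

Here method B ranks first: its small loss in accuracy is outweighed by its much lower training time once the cost criterion is reciprocal-normalised.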

Experimental Design
In this section, the stages of the proposed methodology are discussed. The proposed method comprises three steps: preprocessing, machine learning, and the decision system. The first step is preprocessing the dataset, which aims to improve the quality of the data; preprocessing makes the dataset more consistent and is considered to improve machine learning performance [58,59]. Data preprocessing concerns cleaning the data, transforming categorical data to numerical form, and normalizing the data.
After preprocessing, the next step is the machine learning phase, which involves feature selection, classification, and performance evaluation. The feature selection method determines the best subset of features from the dataset. In the classification stage, a decision tree classifier is employed to generate performance metrics, i.e., accuracy, precision, recall, f1-score, weighted precision, weighted recall, weighted f1-score, train time, and inference time, using the features selected in the previous stage.
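The machine learning phase can be sketched as follows; the synthetic dataset stands in for a selected feature subset, and the metric list mirrors the nine criteria used later by ARAS:

```python
# Fit a decision tree on a (stand-in) selected feature subset and collect
# the nine performance metrics that serve as ARAS criteria.
import time
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=30, n_classes=3,
                           n_informative=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = DecisionTreeClassifier(random_state=0)
t0 = time.perf_counter(); clf.fit(X_tr, y_tr); train_time = time.perf_counter() - t0
t0 = time.perf_counter(); y_pred = clf.predict(X_te); infer_time = time.perf_counter() - t0

acc = accuracy_score(y_te, y_pred)
p_m, r_m, f_m, _ = precision_recall_fscore_support(y_te, y_pred, average='macro')
p_w, r_w, f_w, _ = precision_recall_fscore_support(y_te, y_pred, average='weighted')

metrics = [acc, p_m, r_m, f_m, p_w, r_w, f_w, train_time, infer_time]
print([round(m, 3) for m in metrics])
```

Running this once per feature selection method yields one row of the ARAS decision matrix per method.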
The next stage is the decision system phase. In this step, the performance metrics are compared to determine the rank of the feature selection methods. ARAS uses the performance metrics as the ranking criteria, which is essential for formulating the best feature selection method. As its final result, ARAS presents the ranking of the feature selection methods. The stages of the proposed methodology are depicted in Figure 1.

The proposed model evaluates the feature selection method using two groups of metrics: model performance and computational performance. Model performance measures machine learning performance using the selected features, while computational performance refers to the computational cost of the training and inference processes. Experiments and evaluations are carried out on seven methods and one baseline model, which uses all features. The schematic detail of the criteria selection of the feature selection method is portrayed in Figure 2.


Dataset Description
The psychosocial education dataset used here refers to the research [29,60] to test the proposed method. It is a public dataset obtained from a psychosocial assessment to identify Colombia's teachers' stress levels. The dataset consists of 4890 instances and 118 features divided into six domains. The complete specification of the dataset can be seen in Table 1.

Dataset Preprocessing
In a machine learning problem, a dataset is needed to demonstrate the effectiveness of the proposed method. Therefore, a high-quality dataset is required to evaluate the proposed model against the existing model. Data preprocessing is a well-known technique to improve dataset quality.
The teacher's psychosocial risk level dataset is valuable and pristine, providing the basis for research on the degree of psychosocial distress among teachers in Colombia. Several studies have been conducted using the same dataset [29,60]. Although the dataset had largely been preprocessed appropriately, it was still necessary for us to perform several preprocessing steps to prepare a suitable dataset for the proposed methods.
The first step involves common preprocessing operations, such as clearing improper data and handling missing values. Then, we divided the data into two subsets following the Pareto distribution rule [61]: 80% of the data was used for training and 20% for testing, with a randomized split to ensure a fair data distribution. The next step is to apply standardization to rescale the distribution of each data subset; after the standardization transformation, each feature has a mean of 0 and a standard deviation of 1. The preprocessed dataset is expected to lead machine learning to an optimal model.
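A minimal sketch of the split-then-standardize procedure, assuming scikit-learn's `StandardScaler` and a synthetic stand-in for the cleaned dataset (the scaler is fitted on the training portion only to avoid leaking test statistics):

```python
# 80/20 split followed by standardisation: each feature in the
# transformed training set has mean 0 and standard deviation 1.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=3.0, size=(1000, 4))   # stand-in for the cleaned dataset

X_train, X_test = train_test_split(X, test_size=0.2, random_state=0)

scaler = StandardScaler().fit(X_train)               # learn mean/std from training data only
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

print(X_train_std.shape, X_test_std.shape)           # (800, 4) (200, 4)
```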

Evaluation of Performance Metrics for Feature Selection Methods
Evaluation is done to measure machine learning performance, which is generally measured using a confusion matrix. A confusion matrix combines the actual and predicted values of the classifier into four counts: True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN). The metrics for accuracy, precision, recall, and f1-score are obtained from the following calculations:

$$ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{Precision} = \frac{TP}{TP + FP} $$

$$ \text{Recall} = \frac{TP}{TP + FN}, \qquad \text{F1} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} $$

The fundamental concept behind the confusion matrix is binary classification [62]. A single comparison is made between two classes in binary classification, while this single comparison becomes irrelevant in multi-class classification [63]. Instead, each class's precision, recall, and f1-score are estimated as micro-averages and macro-averages, calculated using the one-vs-all method [64]. For example, the micro-averaged and macro-averaged precision (PRE) scores over k classes are defined as follows [65]:

$$ \mathrm{PRE}_{micro} = \frac{TP_1 + \cdots + TP_k}{TP_1 + \cdots + TP_k + FP_1 + \cdots + FP_k} $$

$$ \mathrm{PRE}_{macro} = \frac{\mathrm{PRE}_1 + \cdots + \mathrm{PRE}_k}{k} $$
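The micro- and macro-averaged precision definitions can be verified numerically with a small one-vs-all computation (the label vectors are toy values):

```python
# Micro vs macro precision from per-class TP/FP counts, cross-checked
# against scikit-learn's precision_score.
import numpy as np
from sklearn.metrics import precision_score

y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2, 2])

classes = np.unique(y_true)
tp = np.array([np.sum((y_pred == c) & (y_true == c)) for c in classes])
fp = np.array([np.sum((y_pred == c) & (y_true != c)) for c in classes])

pre_micro = tp.sum() / (tp.sum() + fp.sum())   # pool counts, then divide
pre_macro = np.mean(tp / (tp + fp))            # average per-class precisions

assert np.isclose(pre_micro, precision_score(y_true, y_pred, average='micro'))
assert np.isclose(pre_macro, precision_score(y_true, y_pred, average='macro'))
print(round(pre_micro, 3), round(pre_macro, 3))  # 0.625 0.611
```

Micro-averaging weights every prediction equally, while macro-averaging weights every class equally, which is why the two values differ here.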

Results and Discussion
This section reviews the performance evaluation of the proposed method. The discussion is organized in two parts: the performance of each feature selection method on the psychosocial education dataset, and the implementation of ARAS in selecting the best feature selection method. Analysis and evaluation are also conducted by comparing the ranking against a single criterion; in this case, accuracy is used as the comparison.

Performance Analysis of the Feature Selection Method
This section discusses the performance measures of the feature selection methods. Feature selection reduces the dimensionality by eliminating the least important features and retaining the important ones; by reducing the dimensions, the model and computational performance are expected to increase. While the baseline used all 118 features, the other methods used only the feature subsets selected by their respective algorithms. Table 2 shows the selected features for each method. The performance of each feature selection method is measured to obtain the parameters that will serve as the criteria for ARAS in the next stage. The measurements consist of model performance metrics (accuracy, precision, recall, f1-score, weighted precision, weighted recall, and weighted f1-score) and computational performance metrics (train time and inference time). From the series of experiments conducted, it is interesting that the baseline model requires the longest training time (34.3910 s) compared to the feature selection methods. This is expected because the baseline model uses all the features in the psychosocial education dataset. However, the baseline model also produced lower accuracy than other models that used far fewer selected features. Details of the performance metrics for each feature selection method can be seen in Table 3.

Evaluation Feature Selection Method Using ARAS
At this stage, the best feature selection method is chosen. ARAS determines the ranking using the performance metrics of each feature selection method. The first step is to initialize the decision-making matrix of alternatives and their respective criteria. Each feature selection method is assigned as an alternative, and the performance metrics, i.e., accuracy (A), precision (P), recall (R), f1-score (FS), weighted precision (WP), weighted recall (WR), weighted f1-score (WFS), train time (TT), and inference time (IT), are assigned as criteria x_n. Based on the analysis, criteria x_1 to x_7 are benefits, while x_8 and x_9 are costs. In addition, the weight (w) of criterion x_1 is set to 0.2 and of criteria x_2 to x_9 to 0.1, with the weights summing to 1. Criterion x_1 receives a higher weight because, in real problems, accuracy is one of the most important performance metrics and is widely used as a benchmark for machine learning [66,67]. The complete formation of the initial decision-making matrix, with each criterion's weight and optimization direction, is shown in Table 4.
After the initial decision matrix is completed, the next step is to normalize it, starting with finding the optimal alternative A_0. The max operator is used for benefit criteria and the min operator for cost criteria, following Equation (10). After obtaining the values x_0j, all criteria in the matrix are normalized using Equation (12) for benefits and Equation (13) for costs. The normalized decision matrix X is shown in detail in Table 5; for example, the normalized value of criterion x_1 for the baseline is x_1(Baseline) = 0.1115. After the normalized matrix X is obtained, the next step is weighted normalization, multiplying each criterion's weight by the normalized matrix according to Equation (16). The results of the weighted normalization are presented in detail in Table 6. Next, the optimality value S_i is calculated, where S_i is the value of the ideal function of alternative i, and the utility degree K_i is obtained using Equations (17) and (18) by dividing S_i by S_0. For example, S_0 is computed as follows: S_0 = 0.0224 + 0.0113 + 0.0112 + 0.0112 + 0.0112 + 0.0112 + 0.0112 + 0.0201 + 0.0112 = 0.1210. In detail, the calculation of the optimal value K is presented in Table 7, and based on the K_i results, the final rankings are shown in Table 8. The decision results using ARAS show that the ERF method is top-ranked, and the baseline is at the lowest rank. ERF, with 11 features, gives better results than the baseline, which uses 118 features. This shows that selecting the best feature subset is still relevant to machine learning problems.
We also compared the ARAS ranking with a single machine learning measurement, accuracy. The results tend to be the same: ARAS gives ERF > Lasso > MI > Chi-square > Anova > RFE > EFS > Baseline, while accuracy alone gives ERF > Lasso > MI > Chi-square > Anova > Baseline > RFE > EFS. This happens because the overall performance produced by the feature selection methods is mostly stable, so there are no models with cross-dominating criteria. To examine the dominating performance results, Figure 3 shows the comparative performance of every model.
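How close the two orderings are can be quantified with Kendall's rank correlation; the snippet below uses the two orders reported above (the tau computation is our illustration, not part of the paper's method).

```python
def kendall_tau(order_a, order_b):
    """Kendall rank correlation between two orderings of the same items."""
    pos_a = {m: i for i, m in enumerate(order_a)}
    pos_b = {m: i for i, m in enumerate(order_b)}
    items = list(order_a)
    concordant = discordant = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            da = pos_a[items[i]] - pos_a[items[j]]
            db = pos_b[items[i]] - pos_b[items[j]]
            if da * db > 0:
                concordant += 1
            else:
                discordant += 1
    return (concordant - discordant) / (concordant + discordant)

aras_order = ["ERF", "Lasso", "MI", "Chi-square", "Anova",
              "RFE", "EFS", "Baseline"]
accuracy_order = ["ERF", "Lasso", "MI", "Chi-square", "Anova",
                  "Baseline", "RFE", "EFS"]
# Only the relative order of Baseline, RFE, and EFS differs.
tau = kendall_tau(aras_order, accuracy_order)  # -> 24/28, about 0.857
```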
The experiment shows that the model's performance analysis is accomplished in the machine learning phase. By selecting specific metrics, the goal of the performance evaluation can be defined; for example, the accuracy metric can be used as a benchmark to find the most accurate model. Nevertheless, a decision model that measures and evaluates the overall performance metrics of feature selection methods is still necessary.
Finally, the goal of the proposed method is to show that it can resolve the problem formulation. Theoretically, this methodology is relevant and worth proposing. ARAS can perform a fair mapping when ranking feature selection methods in the psychosocial education domain, especially for identifying the stress levels of teachers in Colombia. However, this methodology has not fully demonstrated the significance of the performance evaluation on the current dataset, where several dominant criteria ultimately dictate the ranking results. More experiments are necessary to provide a robust comparison and conclusion, and further experiments on similar datasets might provide better results.

Conclusions
ARAS has proven effective and can be implemented as an evaluation model to determine the best feature selection method on the psychosocial education dataset. The evaluation used performance metrics to rank the feature selection methods. From the evaluation that has been accomplished, the determination of weights and optimization values plays an essential role in the ARAS model, and assigning subjective weights affects the overall ARAS ranking.
Regarding future research directions, we recommend further investigation of the proposed method on different datasets under conditions where the criteria contradict one another and none predominates. The problems associated with imbalanced datasets, which produce uneven and contradictory performance metrics, can be challenging; such cases would measure the extent of ARAS's ability to provide an optimal ranking.