Explainable Machine Learning Models for Identification of Food-Related Lifestyle Factors in Chicken Meat Consumption Case in Northern Greece

Abstract: A consumer's decision-making process regarding the purchase of chicken meat is multi-faceted, influenced by various food-related, personal, and environmental factors that interact with one another. The mediating effect of food lifestyle, which bridges the gap between consumer food values and the environment, further shapes consumer behavior towards meat purchase and consumption. This research draws on the concept of Food-Related Lifestyle (FRL) and aims to identify and explain the factors associated with chicken meat consumption in Northern Greece using a machine learning pipeline. To achieve this, the Boruta algorithm and four widely recognized classifiers were employed, achieving a binary classification accuracy of up to 78.26%. The study primarily focuses on determining the items from the FRL tool that carry significant weight in the classification output, thereby providing valuable insights. Additionally, the research aims to interpret the significance of these selected factors in the decision-making process using the SHAP model. Specifically, it turns out that the freshness, safety, and nutritional value of chicken meat are essential considerations for consumers in their eating habits. In addition, external factors such as health crises and price fluctuations can have a significant impact on consumer choices related to chicken meat consumption. The findings contribute to a more nuanced understanding of consumer preferences, enabling the food industry to align its offerings and marketing efforts with consumer needs and desires. Ultimately, this work demonstrates the potential of AI in shaping the future of the food industry and informs strategies for effective decision-making.


Introduction
The consumption of red meat, especially in Western societies, has undergone changes in the last decade and is influenced by various factors, such as socio-economic conditions, religious beliefs, nutritional scandals, ethical concerns, and tradition [1]. However, the growing prevalence of lifestyle-related diseases, such as obesity and diabetes, has led consumers toward chicken meat as a healthier option, owing to its lower fat and cholesterol content [2]. Thus, chicken meat is differentiated from other meats, since it is considered healthier, cheaper, more convenient, and free of religious restrictions on dietary choice [3–10].
A consumer's decision to purchase chicken meat is a complex process determined by a variety of food-related, personal, and environmental factors interacting with each other [11,12]. In addition to the internal and external factors that affect meat consumption, a consumer's behavior towards meat purchase and consumption can be shaped by food lifestyle, which has been found to mediate between the consumer's food values and the environment [3]. As a consequence, the concept of Food-Related Lifestyle (FRL) has been introduced to describe the possible areas of lifestyle trends that might frame consumer behavior, going beyond the actions of the individual [13,14]. The FRL tool tries to identify changes in food lifestyle and consists of five components: (a) ways of shopping, (b) quality aspects, (c) cooking methods, (d) consumption situations, and (e) purchase motives [14]. The suitability and validity of the FRL instrument have been confirmed through its application to consumers' meat-related behavior in several European countries [2,13,15,16]. In relation to chicken meat consumption, some studies have used the FRL framework to segment consumers according to their sociodemographic variables and their purchasing preferences, in order to obtain different consumer profiles [2,13].
Moreover, the COVID-19 pandemic has caused changes in meat consumption patterns. Overall, international meat prices decreased in 2020, since some of the leading meat-consuming and meat-importing countries temporarily limited meat demand owing to the impact of the coronavirus that emerged in 2019, which led to logistical obstacles, increased livestock feeding costs, reduced food services, and limited household spending due to lower incomes [17]. The latter has forced consumers to restrict their intake of meat products to cheaper options such as poultry and pork. According to "Agricultural Outlook 2021-2030", consumers prefer poultry meat because of its lower price, its product stability and adaptability to market requirements, and its higher protein content with lower fat. Thus, in developing countries poultry consumption reflects a lower price, while in high-income countries it reflects a healthier and more convenient food choice. In Greece, meat consumption was estimated to have decreased by 7% in 2020 compared to 2019, due to the health crisis, which led to reduced demand for meat, mainly in the catering sector (restaurants, hotels, and mass catering units) (Meatnews, 2022). In particular, for poultry meat in the period 2019-2020, domestic consumption decreased by 0.6%, while in 2021 it remained at approximately the same level as in the previous year. At the same time, annual per capita consumption increased compared to 2018, amounted to approximately 23 kg of poultry meat, and remained stable in the period 2019-2021. Also, according to the ICAR-CRIF S.A. study (Meatnews, 2022), poultry meat has ranked first in consumers' preferences since 2018. Therefore, in this study we try to identify the influences on consumers that affect chicken meat consumption during the pandemic period, using the FRL instrument in combination with Artificial Intelligence (AI).
In recent years, the use of AI in the food industry has increased significantly. AI is transforming the food industry by offering innovative solutions to challenges in food production, distribution, and consumption [18–20]. From precision agriculture that utilizes sensors and machine learning to optimize crop yields, to personalized nutrition that tailors diets to individual needs and preferences, AI is improving the efficiency, sustainability, and healthfulness of our food systems [21,22]. By harnessing the power of big data and machine learning, AI is enabling a more transparent, traceable, and equitable food system that benefits consumers, producers, and the environment alike. Thus, one area where AI has the potential to make a significant contribution is the identification of consumer profiles for chicken meat based on food-related lifestyles. By analyzing large datasets related to consumer behavior, preferences, and attitudes toward food, AI algorithms can identify patterns and correlations that reveal important insights about consumer segments. These insights can be used by producers and marketers to tailor their products and messaging to specific target groups. Hence, the scope of this study is (i) to develop an explainable machine learning (ML) methodology for the identification of Food-Related Lifestyle factors in the chicken meat consumption case in Northern Greece and (ii) to interpret the contribution of the selected factors in decision-making.

Study Area and Data Collection
The primary data were collected after the lifting of the restrictive measures due to the COVID-19 pandemic, between June and September 2020, in Northern Greece. The sample of 689 consumers was selected randomly during their purchases in food stores such as supermarkets, butchers, specialized butchers, and open markets, and during their meals in fast-food outlets and restaurants, at various times of the day and on all days of the week in the urban complex of the city of Thessaloniki. Each questionnaire took about fifteen to twenty minutes to answer. The questionnaire was structured in 4 sections: (1) consumers' habits in chicken meat consumption; (2) consumers' quality and safety beliefs towards chicken meat; (3) food-related lifestyle trends towards food and the pandemic situation; and (4) consumer and household characteristics. Furthermore, participation in the survey was voluntary and anonymous, and no sensitive personal information was requested. In addition, participants were informed about the purpose of the study and the use of the data, and they were assured that their answers would remain confidential and would be used exclusively in the context of the research. Finally, the consumers verbally stated that they consumed and bought chicken meat by themselves, and it was considered that they answered the questions of the questionnaire honestly. Thus, the consumers in the present study are considered to be the end users who form the final link of the chicken meat chain.

Data Description
The variables employed in our study were sociodemographic variables, used to describe the status of the consumers, and the questions of the Food-Related Lifestyle (FRL) instrument [2,14–16], which were employed in the proposed ML model. Specifically, a reduced version of the FRL model was preferred so that the respondents would not be overloaded with questions (the full version has 69). The FRL instrument of the study, which was designed to segment and profile consumers, included 20 questions regarding aspects of food-lifestyle culture as well as the consumers' food perceptions during the pandemic situation (Appendix A). The questions of the FRL tool were measured on a 5-point Likert scale ranging from 5 "strongly agree" to 1 "strongly disagree", with a neutral midpoint at 3 "neither agree nor disagree". As the outcome, we used the variable chicken meat purchase intention.

Problem Definition
In this study, we worked on a binary classification problem for the identification of FRL factors in the chicken meat consumption case in Northern Greece. Specifically, the participants of the study were divided into two classes (Figure 1): (1) rare chicken consumers, who buy chicken meat one or two times in 3 months (n = 319), and (2) chicken consumers, who buy chicken meat from at least 2 times a month up to every week (n = 370).
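To make the class balance concrete, the following minimal sketch computes the share of each class from the two reported counts. The majority-class baseline is our own addition for context; it is computed on the full sample rather than on the test split used later, so it is only indicative.

```python
# Class sizes as reported in the text above.
class_counts = {"rare chicken consumers": 319, "chicken consumers": 370}

total = sum(class_counts.values())  # 689 participants in total
class_share = {label: 100.0 * n / total for label, n in class_counts.items()}

# Accuracy of a trivial model that always predicts the larger class.
# Assumption: computed on the full sample, not on the paper's test split.
majority_baseline = max(class_share.values())

for label, share in class_share.items():
    print(f"{label}: {share:.1f}%")
print(f"majority-class baseline: {majority_baseline:.1f}%")
```

Under these counts the two classes are roughly balanced, so accuracy is a reasonable headline metric for this problem.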

Machine Learning Workflow
An explainable multi-stage ML pipeline was proposed in order to identify which FRL factors are important for the classification of potential customers when it comes to the chicken market in Northern Greece. The proposed ML methodology can be seen in Figure 2.

Feature Engineering
In order to deal with the missing values, we used the most frequent values (mode strategy, available on https://scikit-learn.org/stable/modules/generated/sklearn.impute.SimpleImputer.html, accessed on 25 June 2023). Additionally, we used the StandardScaler function from the scikit-learn package as a standardization technique [23]. The Boruta algorithm was applied to rank the various features in order of importance [24,25]. It is a supervised learning algorithm that provides a feature importance ranking and is effective in problems where the dataset for model building comprises several variables. Specifically, the Boruta algorithm ranks features by assigning importance scores based on their performance compared to shuffled counterparts. Features with higher importance scores are ranked higher, indicating their relative significance in the dataset for subsequent modeling or analysis.
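The three steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the pipeline used in the study: `impute_mode` and `standardize` mimic the mode strategy and z-score scaling, and `shadow_round` shows a single Boruta-style comparison against shuffled ("shadow") copies. As a deliberate simplification, the importance score here is the absolute Pearson correlation with the target, whereas Boruta itself relies on Random Forest importances and repeats the comparison over many iterations with a statistical test. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def impute_mode(column):
    """Replace None entries with the most frequent observed value.

    Ties are resolved toward the smallest value.
    """
    counts = Counter(v for v in column if v is not None)
    top = max(counts.values())
    mode = min(v for v, c in counts.items() if c == top)
    return [mode if v is None else v for v in column]


def standardize(column):
    """Rescale a column to zero mean and unit (population) variance."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]


def abs_corr(xs, ys):
    """Absolute Pearson correlation (stand-in importance, an assumption)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return abs(cov / (sx * sy))


def shadow_round(features, target, rng):
    """One Boruta-style round: does each feature beat the best shadow?"""
    shadows = []
    for column in features.values():
        shuffled = list(column)
        rng.shuffle(shuffled)  # shuffling breaks any link to the target
        shadows.append(shuffled)
    best_shadow = max(abs_corr(s, target) for s in shadows)
    return {name: abs_corr(col, target) > best_shadow
            for name, col in features.items()}


# Toy Likert-style column with missing answers (hypothetical data).
likert = [5, 4, None, 4, 2, None, 4]
filled = impute_mode(likert)
scaled = standardize(filled)

# Toy selection problem: one informative and one uninformative feature.
target = [0, 1] * 20
features = {"informative": list(target), "noise": [0, 0, 1, 1] * 10}
hits = shadow_round(features, target, random.Random(0))
print(filled, hits)
```

A feature that never outperforms the best shadow across repeated rounds would be ranked low, which is the intuition behind the ranking described above.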

Learning Process
Four well-known ML classifiers were evaluated for their suitability [26]. To ensure that the ML models perform well on the employed data, as well as to avoid bias error and overfitting, hyperparameter selection was implemented. Specifically, we used Random Forest (RF), which is an ensemble learning algorithm known for its fast execution speed and increased model performance. Additionally, Support Vector Machine (SVM) algorithms were employed, leveraging their effectiveness in high-dimensional spaces. Another algorithm that was employed was Logistic Regression (LR). This model uses a generalized linear model to predict a binary outcome based on prior observations of a dataset. Neural networks (NNs) were also tested, as they can recognize relationships between complex data using a series of algorithms that mimic the operations of a human brain. Different activation functions (including tanh, sigmoid, and ReLU) were tested for their suitability as part of the hyperparameter optimization phase.
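For reference, the three activation functions named above can be written out as follows. This is only a sketch of the functions themselves; the dictionary `ACTIVATIONS` is a hypothetical way of listing them as candidates for a hyperparameter search and does not reflect the actual search space of the study. The sigmoid is also the link function underlying Logistic Regression.

```python
import math


def tanh(z):
    """Hyperbolic tangent: squashes inputs into (-1, 1)."""
    return math.tanh(z)


def sigmoid(z):
    """Logistic function: squashes inputs into (0, 1).

    Written in a numerically stable form to avoid overflow for large |z|.
    """
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def relu(z):
    """Rectified linear unit: passes positives, zeroes out negatives."""
    return max(0.0, z)


# Hypothetical candidate set for a hyperparameter search.
ACTIVATIONS = {"tanh": tanh, "sigmoid": sigmoid, "relu": relu}

for name, fn in ACTIVATIONS.items():
    print(name, [round(fn(z), 3) for z in (-2.0, 0.0, 2.0)])
```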

Evaluation
For the evaluation of the proposed classifiers, a stochastic 70-30% random data split was applied to generate the training and testing subsets, respectively. Specifically, the learning was performed on the stratified version of the training sets, and the final performance was estimated as the accuracy on the testing sets. Furthermore, the performance of the classifiers was also evaluated in terms of recall (or sensitivity), f1-score, and precision as additional evaluation criteria. Additionally, confusion matrices were utilized in order to evaluate the classifiers' performance. A confusion matrix is a means of evaluating a classifier's performance, acting as a summary of prediction results on a classification problem. It comprises the counts of correct (true) and incorrect (false) predictions, broken down by class.
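The relationship between a confusion matrix and the four metrics can be sketched as below. The counts in the example (90 true positives, 23 false positives, 22 false negatives, 72 true negatives) are an assumption on our part: they were chosen because they reproduce the RF percentages reported in the Results section, but they are not copied from Table 3, and "chicken consumers" is assumed to be the positive class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and f1-score from the four counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Illustrative counts (assumption, see the note above); they sum to 207,
# which is about 30% of the 689 participants.
example = metrics(tp=90, fp=23, fn=22, tn=72)
for name, value in example.items():
    print(f"{name}: {100 * value:.2f}%")
```

Reporting precision and recall alongside accuracy shows whether errors are concentrated in one of the two classes, which accuracy alone would hide.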

Explainability
Aiming to better understand the rationale behind the decision-making process of the trained AI models, we additionally (i) investigated how much each of the features contributed to the final decision and (ii) estimated each feature's importance. To that end, we resorted to SHapley Additive exPlanations (SHAP), which are based on the Shapley values of game theory [27–29]. SHAP is a mathematical method that explains the predictions of machine learning models by calculating how much each feature contributes to the final prediction. This enhances our understanding of the internal decision-making rationale of the trained AI models, especially with respect to the mechanism by which the selected food-related lifestyle parameters are combined to produce decisions regarding potential customer prediction.
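The game-theoretic idea behind SHAP can be illustrated with an exact Shapley value computation on a toy model. This is a sketch under assumptions: the model `toy_model`, the input, and the all-zero baseline are hypothetical, and "removing" a feature is simulated by replacing it with its baseline value. Practical SHAP implementations rely on faster, model-specific or sampling-based estimators rather than this exhaustive enumeration, which scales exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values for a coalition value function `value`."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                gain = value(coalition | {i}) - value(coalition)
                phi[i] += weight * gain
    return phi


def make_value(model, x, baseline):
    """Value of a coalition: model output with absent features at baseline."""
    def value(coalition):
        mixed = [x[j] if j in coalition else baseline[j]
                 for j in range(len(x))]
        return model(mixed)
    return value


def toy_model(x):
    """Hypothetical model with one additive term and one interaction."""
    return 2.0 * x[0] + x[1] * x[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(make_value(toy_model, x, baseline), n_features=3)
print(phi)
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is shared equally between the two features involved.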

Results
This section presents the descriptive statistics of the participating consumers and the FRL variables, the testing performance metrics of the comparative analysis, the selected features for the best-performing ML model, and the interpretation of the output of the best-performing ML classifier.

Prediction Performance
Figure 3 shows the testing accuracy (%) of the competing ML models with respect to the number of selected features in the 2-class problem. Initially, the features of the employed data were ranked using the Boruta algorithm. The ML models were then trained on feature subsets of increasing dimensionality (with a step of 1), and the testing classification accuracies were calculated until the full feature set had been tested. It turned out that the majority of the ML models had an upward trend over the whole feature dimensionality range, followed by a steady testing performance in most cases. Specifically, the Random Forest model showed an upward trend with respect to the first selected features, with fluctuations across multiple features, reaching a maximum of 78.26% at 14 features, which is the overall best performance. The SVM model achieved the second-best accuracy (75.36%), also following the pattern of fluctuating, non-steadily increasing performance across different numbers of selected features. In contrast with the other models, Logistic Regression fell short in this task, recording low testing accuracies below 64%.
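The sweep over feature subsets of increasing size can be sketched as a simple loop. The function `sweep_feature_counts`, the feature names, and the accuracy values below are hypothetical placeholders; in the study, the evaluation step would correspond to training a classifier on the top-k ranked features and measuring its testing accuracy.

```python
def sweep_feature_counts(ranked_features, evaluate):
    """Evaluate the top-k feature subsets for k = 1..len(ranked_features).

    `evaluate` is any callable that maps a list of feature names to a score.
    Returns the full curve plus the smallest k that attains the best score.
    """
    curve = []
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        curve.append((k, evaluate(subset)))
    best_k, best_score = max(curve, key=lambda item: item[1])
    return curve, best_k, best_score


# Hypothetical ranking and made-up scores, for illustration only.
ranked = ["f1", "f2", "f3", "f4", "f5"]
toy_scores = [0.55, 0.60, 0.66, 0.64, 0.66]


def toy_evaluate(subset):
    """Stand-in for 'train on this subset and return testing accuracy'."""
    return toy_scores[len(subset) - 1]


curve, best_k, best_score = sweep_feature_counts(ranked, toy_evaluate)
print(curve, best_k, best_score)
```

Because `max` returns the first maximum it encounters, ties are resolved in favor of the smaller feature subset.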
Table 3 summarizes the best performance metrics, such as accuracy, recall, precision, and f1-score, of the employed ML classifiers. It also presents the number of features and the confusion matrices. The best performance was achieved by the RF classifier. Specifically, RF achieved 78.26% accuracy, 80.36% recall, 79.65% precision, and 80.00% f1-score. In contrast, the LR classifier achieved the lowest performance scores (61.35% accuracy, 68.75% recall, 63.12% precision, and 65.81% f1-score). Table 4 shows the ranking of the selected factors based on the Boruta feature-selection algorithm for the best-performing ML model in our approach.

Explainability
The impact of the FRL items on the output of the best-performing model (RF) is illustrated in Figure 4. The figure shows the relationships, both negative and positive, between the FRL items and the purchase preference for chicken meat, which is the target variable. A positive relationship indicates that an increase in the value of a specific feature is associated with an increase in the model's output, while a negative relationship indicates the opposite. Figure 4a presents the SHAP summary values of the FRL items, arranged in descending order from top to bottom based on their impact. The color coding represents the individual observation's high (red) or low (blue) value for each FRL item. The item FRL7 has a high and positive impact on the purchase preference for chicken meat, meaning that as participants' scores on this item increase, the model's output tends to lean towards classifying them as chicken consumers. In other words, there is a positive association between FRL7 and the likelihood of purchasing chicken meat. Conversely, FRL3 exhibits a negative association with the purchase of chicken meat. Figure 4b depicts the average impact of each FRL item on the magnitude of the model's output. It can be observed that FRL7, FRL20, and FRL13 contribute significantly to the model's output, while items such as FRL17, FRL15, FRL18, and FRL12 have a moderate to low impact.
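The feature-importance view in Figure 4b corresponds to averaging the absolute SHAP values of each feature over all observations. A minimal sketch of that aggregation is given below; the SHAP values and the item names are made up for illustration and are not the values obtained in the study.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Rank features by the mean absolute SHAP value across observations.

    `shap_rows` holds one list of per-feature SHAP values per observation.
    """
    n = len(shap_rows)
    scores = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical SHAP values for three observations and three items.
names = ["item_a", "item_b", "item_c"]
rows = [
    [0.2, -0.1, 0.0],
    [-0.4, 0.1, 0.05],
    [0.3, -0.1, 0.1],
]
ranking = mean_abs_shap(rows, names)
print(ranking)
```

Taking absolute values before averaging matters: a feature that pushes some predictions up and others down would otherwise appear unimportant, even though the summary plot in Figure 4a would show a wide spread for it.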

Discussion
The primary objective of the current study is to create a machine learning pipeline that can provide explanations for the identification of Food-Related Lifestyle factors in the context of chicken meat consumption in Northern Greece. The task involves classifying participants into two groups: rare chicken consumers and chicken consumers. To achieve this, four widely recognized classifiers were utilized. The accuracy of the binary classification task reached 78.26%. The main focus was to determine which items from an established tool carry more weight in the classification output and provide valuable insights. In addition, the secondary goals of the research included interpreting the significance of the selected factors in the decision-making process.
We employed the Boruta algorithm, which utilizes the RF classifier, as our chosen feature-selection technique. We opted for this algorithm due to its reliability and automation, enabling us to enhance model performance, decrease dimensionality, and extract valuable insights from datasets with numerous dimensions. Moreover, Boruta assesses feature importance by considering the performance of the selected model, granting us flexibility in choosing the most appropriate model for our specific objective. In our particular task, the five highest-ranked features were FRL10 "I check the expiration dates of food", FRL16 "After the pandemic I prefer not eating out", FRL13 "I prefer to buy products firstly for their nutritional value and then for their taste", FRL7 "I like to read the labels of the food products that I buy so as to know what they contain", and FRL9 "I check the prices and compare them".
In order to comprehend the influence of FRL items on chicken meat purchase intention, we employed the SHAP model for interpreting the model output. The reason behind selecting this model was its ability to determine the contribution of each feature to the prediction by examining various feature combinations and their corresponding outcomes. This approach guarantees a fair and consistent distribution of importance among the features. The selection of item FRL10 "I check the expiration dates of food" as the most important item reflects the role of food involvement in consumers' chicken meat choice. The expiration date may reflect the perceived quality in terms of meat freshness and the level of safety in chicken meat consumption [6,30], so consumers may consider the expiration date as an indirect way of evaluating chicken meat quality. Thus, the belief in the freshness and safety of chicken meat through the expiration date is considered to be a strong food attribute that leads to eating habits. Additionally, the above result appears to be consistent with the findings of other studies, such as those from Stamatopoulou and Tzimitra-Kalogianni 2022 [31]; Katiyo et al., 2020 [32]; Strašek, 2010 [33]; and Hall and Sandilands, 2007 [34]. Also, FRL16 "After the pandemic I prefer not eating out" reflects the perception found by Wang et al. [35] that consumers' eating behavior has been influenced by their concerns due to the health crisis.
Moreover, the items FRL13 "I prefer to buy products firstly for their nutritional value and then for their taste" and FRL7 "I like to read the labels of the food products that I buy so as to know what they contain" strongly reveal consumers' food values toward chicken meat consumption. According to Verbeke et al. [36], the nutritional value of meat is a very important value that determines the final choice of consumers. The above food values are in agreement with the relevant literature [6,8–10,37–42]. Furthermore, these results, which reveal consumers' interest in label information for healthy meat, underline the importance of consumers' food involvement in their purchasing decisions [43]. Finally, the item FRL9 "I check the prices and compare them" indicates consumers' food involvement that may affect their final choice. The "value for money" of chicken meat reflects a consumer food value that may contribute to the overall final purchasing decision, since there is a link between food values and food choices, according to Krystallis et al. [44]. In accordance with these results, several studies have shown that consumers' perceptions of chicken meat prices can influence chicken meat choice [3,4,7–9,30,31,39,41,42,45,46].
At this point, we should mention that SHAP analysis serves as a valuable tool to identify the overall impact of features on our results, providing insight into their collective influence. In contrast, our feature-selection algorithm focuses on pinpointing the most relevant features for a more refined analysis, ensuring the extraction of critical information. Hence, it is important to note that while Figure 4 highlights the SHAP features with the most significant impact on model output, the discussion section delves deeper into the features identified through our specific feature-selection algorithm. This approach allows us to provide a comprehensive understanding of our findings by considering both the broader perspective and the features critical to our analysis.
One limitation of our research lies in the utilization of a modified Food-Related Lifestyle (FRL) instrument, which uses a reduced number of questions instead of the full set. This reduction, however, did not impact the current findings. According to Buitrago-Vera et al. [15], the FRL model may be adapted to each research context and may include different numbers of elements, with each dimension able to provide factors that describe the food lifestyles of consumers. Thus, in line with relevant studies [14–16], the FRL model of the present study included twenty (20) elements (questions) corresponding to all five (5) food dimensions that the FRL model usually contains. Moreover, this survey focused only on consumers in the Thessaloniki agglomeration and did not include the attitudes and perceptions of others involved in the chicken meat supply chain. Post-purchase quality control was not included, because the study addressed chicken meat in general rather than a specific type of chicken meat. In future work, we plan to address this limitation by employing the complete FRL tool, encompassing all relevant questions, and extending our research to multiple regions of Greece. To ensure a representative sample, we will adopt a stratified selection approach when recruiting participants for the study. Furthermore, in our upcoming work, we aim to enhance the interpretability of our machine learning pipeline by incorporating graphical models as explanation methods. These graphical models (Graph Neural Networks (GNNs) or Bayesian Networks) will provide visual representations of the relationships and dependencies among various factors, contributing to a more comprehensive understanding of the decision-making process in the classification of Food-Related Lifestyle factors. By leveraging these advanced techniques, we anticipate achieving a deeper level of insight and explanation in our research.

Conclusions
In conclusion, our study successfully developed a machine learning pipeline that not only achieved a binary classification task with up to 78.26% accuracy, but also focused on providing explanations for identifying Food-Related Lifestyle factors in chicken meat consumption within Northern Greece. The interpretability of our machine learning pipeline offers a more comprehensive understanding of the decision-making process involved in the classification of Food-Related Lifestyle factors. Overall, this research underscores the potential impact of AI on the food industry, emphasizing the significance of identifying consumer profiles for effective product development and targeted marketing strategies. Through our efforts, we anticipate achieving deeper insights and explanations, which can ultimately contribute to informed decision-making processes in the food industry.
Appendix A
Table A1. Description of the 20 FRL items.

FRL1   I always make a list, before I go shopping for food
FRL2   I like shopping for food for me or my family
FRL3   I like shopping and tasting gourmet foods
FRL4   Eating out with my friends or with my family is an important part of my social life
FRL5   Eating is an enjoyment
FRL6   I try to schedule the weekly menu, so as not to waste time and money
FRL7   I like to read the labels of the food products that I buy to know what they contain
FRL8   I like to cook for myself, for my family and my friends
FRL9   I check the prices and compare them
FRL10  I check the expiration dates of food
FRL11  I read recipes and experiment in cooking
FRL12  Members of my family like to involve in cooking
FRL13  I prefer to buy products firstly for their nutritional value and then for their taste
FRL14  I prefer to buy natural products without preservatives
FRL15  At home, I eat take away food, at least once a month
FRL16  After the pandemic, I prefer not to eat out
FRL17  I find cooking tiring
FRL18  After the pandemic, I pay attention to the places from where I buy food (cleanliness, without overcrowding)
FRL19  After the coronavirus pandemic, I do not trust the takeaway food
FRL20  I use the internet to inform me and to entertain me

BioMedInformatics 2023, 3

Figure 1. Class percentages of the employed participants in this study.

Figure 3. Learning curves of employed ML classifiers.

Figure 4. This figure depicts (a) the SHAP summary plot and (b) the SHAP feature importance for the RF trained on the features selected by the proposed Boruta FS algorithm.

Table 1 presents the frequencies of the employed participants in this study. As observed, 61.8% are women, and 38.2% are men. The most populous age groups are 36-45 y (26.1%) and 46-55 y (36.6%). Furthermore, 37.1% of the participants have EUR 1001-1500 in income per month.

Table 2 summarizes the descriptive statistics (mean ± SD) of the employed dataset in our study per group.

Table 2. Mean and standard deviation of the employed 20-item FRL tool.

Table 3. Best performance metrics of the employed ML models.

Table 4. Ranking of the employed features for the best-performing ML model.