1. Introduction
The average consumption of organic products in Western Europe is 3–5%. In Germany, consumption is 5%; in England and Austria, it is 3% [1]; and in Hungary, it has reached 2%. According to the report “World Organic Agriculture, Statistics and Trends”, 2007 Edition [2], Australia leads all continents in organically cultivated area, with approximately 11.8 million hectares. Europe is in second place with 6.9 million hectares, followed by Latin America with 5.8 million (Willer et al. [3]).
In the European Union, organic agriculture represents 3.7% of the total agricultural area, even though an increase of 25% was recorded between 1993 and 1998. Italy has the largest number of organic farms at 31% of the total in the European Union, followed by Austria, Spain, and Germany [4].
In 2001, the per capita expenditure on eco products was highest in Denmark, at EUR 58.08. Of the products sold in foreign markets, cereals were the most sought after, with a share of 54%. Oilseeds accounted for 22% of exports, fruits and mushrooms 20%, processed dairy products 1%, honey 0.94%, and other organic products 2.06% [5].
Environmental Benefits of Organic Farming and Products
Organic farming is associated with numerous environmental benefits, such as the following:
Long-term sustainability: organic farming results in a healthier planet over a longer period of time.
Soil quality: crop rotation, the use of organic fertilizers, and the minimal tillage of agricultural land are practiced to create more stable systems by improving soil quality.
Reduction in water pollution: because water is polluted in many agricultural areas, natural fertilizers are used in place of synthetic chemicals.
Slowing down climate change: organic farming can contribute to mitigating the greenhouse effect and global warming through its ability to sequester carbon in the soil, as suggested by Jiajia Zhang et al. [6].
Protecting biodiversity: organic farming also plays an important role in preserving biodiversity.
Choosing a food product is an important decision, both for our health and for nature. As mentioned above, organic products have multiple benefits, not only for ourselves, but also for the environment.
Romanian stores offer a wide range of organic products, from fruits and vegetables to cereals, eggs, and dairy products. Out of concern for the health of customers and the environment, the range of organic products available at Romanian stores is constantly enriched. These foods are based only on natural ingredients and do not contain genetically modified organisms, fertilizers and pesticides, or growth stimulants. Each store has areas dedicated to these types of products, which can be easily identified by the label and specific packaging.
2. Literature Review
Malhotra et al. [7] suggested a positive correlation between favorable attitudes toward organic products and increased purchase intention.
Analyzing consumers’ decision to purchase organic products helps to better understand their environmental behavior toward green products.
Ungurenu et al. [8] posit that sustainable consumption is one way to manage environmental problems through consumer participation. Factors influencing consumer behavior toward organic products, identified in the study by Barbu et al. [9], include social norms, orientation toward the natural environment, the perceived green image of the company, the characteristics of organic products, the perceived risks and inconveniences of purchasing organic products, the perceived benefits of purchasing organic products [10], institutional trust, consumer characteristics, demographics, and demographic trust. The novelty of this work lies in its theme: consumer behavior toward purchasing organic products. The purpose of Rosidah Rosidah’s study [11] was to analyze Generation Z’s perception and knowledge of organic products and their consumption behavior.
In their study, Shabani Shojaei et al. [12] examined the effect of greenwashing on consumers’ purchasing behavior for organic products. The results confirmed that both perceived greenwashing and perceived risk have a negative influence on consumer attitudes, and that purchasing behavior is positively influenced by attitude and willingness to pay more. The study also highlighted the importance of mitigating greenwashing and risk perceptions associated with organic products, given their indirect negative impact on purchase intention and behavior.
“Green” or eco-friendly products have also been the subject of research conducted by R K Pradeep Reddy et al. [13] on consumer perceptions and preferences in India. Their study explores consumers’ eco ideas, eco-friendly behaviors, and eco-friendly products. The respondents demonstrated strong eco-friendly values due to the high perceived environmental cost to customers, highlighting the importance of creating marketing communication campaigns that promote eco-friendly products. Kianpour et al. [14] showed that the main reasons for purchasing a green product are the desire to protect the environment, consumer education and awareness, laws and regulations, and, most importantly, promotion methods. These findings could help companies, authorities, governments, manufacturers, and retailers understand what motivates customers to buy green products and convince them to do so. Stimulating and increasing customers’ intention to buy green products can help solve environmental problems and reduce the environmental side effects of packaging and the recycling of plastic. Jinhee and Haley [15] conducted a survey of overseas green consumers in the USA according to their personal, social, and environmental motivations. They found that green consumers, like Europeans, are strongly environmentally motivated because they have advanced knowledge about pro-environmental products, are less skeptical of pro-environmental advertising, and have more positive attitudes.
Sörqvist et al. [16] highlighted the importance of eco-label memory, which can influence the perception of consumption and subsequent consumer decisions. Shehawy [17] studied behavior among people familiar with green consumption in two different cultural contexts, Saudi Arabia and the UK.
The Future of AI in Our Social Life
The internet revolution, according to Gkikas and Theodoridis [18], led to the use of big data and the rise of AI, which brought predictive analytics, including machine learning and data mining solutions.
Analyzing and understanding consumer purchasing behavior using artificial intelligence is having a positive impact on marketing [19]. Marketers study online user behavior to increase the effectiveness of their marketing plans and strategies.
In recent years, artificial intelligence and marketing have begun to work closely together and have changed marketing considerably, spurring innovation across industries.
The practice of predicting future behavior by analyzing customer responses dates back to 1998. With the help of artificial intelligence, companies are building on this by learning from past campaigns, presenting more content that generates positive interactions, and aligning with customer preferences.
Artificial intelligence (AI) allows companies to predict consumer preferences and behavior, resulting in a personalized customer experience [20].
AI technology is having an impact on marketing and consumer engagement strategies. When adopting AI, the focus is on its predictive capabilities to meet the ever-changing needs of customers.
3. Materials and Methods
The purpose of this study was to determine the motivation of customers to buy eco-friendly products using machine learning techniques. Surveys were administered between November and December 2024 to 245 organic consumers in Maramureș County, Romania. At the start of the study, 16 variables were considered important categorical factors affecting consumers’ motivations for buying eco-friendly products, as presented in Table 1.
Seven variables were found to be most effective by Random Forest, as follows: motivation, product type, criteria of selection, salary, encouraged to buy eco-friendly products, information about eco-friendly products, and barriers to the buying process.
Among the four feature selection techniques compared, Random Forest achieved the highest accuracy and was therefore selected. The SHAP method was applied to identify the impact of driving factors on consumers’ motivation to buy eco-friendly products, as per Houda Lamane et al. [21]. All analyses were conducted using the Python 3.13 programming language.
The main motivations for consumers’ eco-friendly product buying behavior were grouped into three categories: Health Care, Environmental Protection, and Superior Quality. The results of the SHAP analysis indicated that considering oneself well informed about eco-friendly products is the most important factor in the Environmental Protection category.
The main hypothesis of the study is given below.
H1: The variables Type, Criteria, Salary, Encourage, Information, and Barrier have significant effects on the motivation of consumers to purchase eco-friendly products.
3.1. Data Preprocessing and Modeling
Among the sixteen variables, the six most important predictors (Type, Criteria, Salary, Encourage, Information, and Barrier) were selected using the Random Forest feature selection method. To apply the LightGBM learning algorithm, the dataset was split in an 80:20 ratio into training and test sets. Performance was averaged over 50 repetitions to avoid biased results. The performance of the model was evaluated with respect to accuracy, F1-score, sensitivity, specificity, and AUC. Finally, to interpret the results regarding the main motivations for consumers to buy eco-friendly products, the SHAP method was used with the categories of Health Care, Environmental Protection, and Superior Quality.
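The evaluation protocol described above (an 80:20 split whose performance is averaged over 50 repetitions) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study’s code: the names `repeated_holdout` and `majority_baseline`, the synthetic labels, and the fixed seed are all hypothetical, the splits are not stratified, and a majority-class predictor stands in for LightGBM so that the sketch has no third-party dependencies.

```python
import random
from collections import Counter
from statistics import mean


def repeated_holdout(rows, labels, fit_predict, n_repeats=50, test_ratio=0.2, seed=0):
    """Return the mean test accuracy over repeated random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    n_test = round(len(rows) * test_ratio)
    scores = []
    for _ in range(n_repeats):
        rng.shuffle(indices)  # a fresh random split on every repetition
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        predictions = fit_predict(
            [rows[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [rows[i] for i in test_idx],
        )
        hits = sum(pred == labels[i] for pred, i in zip(predictions, test_idx))
        scores.append(hits / n_test)
    return mean(scores), scores


def majority_baseline(train_rows, train_labels, test_rows):
    """Placeholder model (assumption): always predict the most frequent training label."""
    top_label = Counter(train_labels).most_common(1)[0][0]
    return [top_label] * len(test_rows)


# Synthetic stand-in for the 245 survey responses; these are NOT the study's data.
labels = (["Health Care"] * 175
          + ["Superior Quality"] * 40
          + ["Environmental Protection"] * 30)
rows = [{"respondent": i} for i in range(len(labels))]

avg_accuracy, per_split = repeated_holdout(rows, labels, majority_baseline)
print(f"test rows per split: {round(len(rows) * 0.2)}")
print(f"mean accuracy over {len(per_split)} splits: {avg_accuracy:.3f}")
```

Any real classifier can be passed as `fit_predict` as long as it takes training rows, training labels, and test rows and returns one predicted label per test row.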
Shapley Value (SHAP)
The SHAP concept was originally developed to estimate the importance of an individual player in a collaborative team (Lundberg and Lee [22]; Shapley [23]; Pérez and Bajorath [24]). According to Lamane et al. [21], the importance of features in conventional ML only indicates the degree of influence of input variables on model output and does not reveal how input variables influence model output (Lund et al. [25]; Wang, Peng and Liang [26]). In this sense, data science research carried out by Lamane et al. [21] has advanced techniques to compute feature importance by incorporating “game theory” into SHAP (Strumbelj and Kononenko [27]) to interpret the output of any black-box model, such as most ML models, by considering its inputs. This additive, consistent, and locally accurate approach is part of the explainable branch of artificial intelligence (AI) and describes the performance of any ML model (Lamane et al. [21]; Lundberg and Lee [22]).
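To make the “player in a collaborative team” idea concrete, the sketch below computes exact Shapley values for a three-player game by averaging each player’s marginal contribution over all orderings. The feature names are borrowed from the study for readability only; the coalition payoffs in `PAYOFF` and the function name `shapley_values` are hypothetical, and this brute-force enumeration is an illustration of the definition rather than the approximation used by SHAP software.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: mean marginal contribution over all player orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            with_player = coalition | {player}
            totals[player] += value(with_player) - value(coalition)
            coalition = with_player
    return {p: totals[p] / len(orderings) for p in players}


# Hypothetical payoffs: the model output when only these features are "present".
PAYOFF = {
    frozenset(): 0.00,
    frozenset({"Information"}): 0.30,
    frozenset({"Salary"}): 0.10,
    frozenset({"Barrier"}): 0.10,
    frozenset({"Information", "Salary"}): 0.45,
    frozenset({"Information", "Barrier"}): 0.50,
    frozenset({"Salary", "Barrier"}): 0.20,
    frozenset({"Information", "Salary", "Barrier"}): 0.70,
}

phi = shapley_values(["Information", "Salary", "Barrier"], PAYOFF.__getitem__)
for feature, contribution in phi.items():
    print(f"{feature:<12} {contribution:.4f}")

# Additivity: the contributions sum to the payoff of the full coalition.
print(f"sum of contributions: {sum(phi.values()):.2f}")
```

The final line illustrates the additive property mentioned above: the individual attributions add up to the total payoff being explained.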
4. Results
Before starting the statistical analysis, four different feature selection methods were applied to the dataset to reduce the number of variables and determine the most effective (predicting) factors.
Descriptive statistics for all categories of variables and explanations are given in Table 2.
In our analysis, as can be seen from Figure 1, the Random Forest feature selection method achieved the greatest accuracy (0.6622) among all methods.
Table 2 indicates the precision, recall, F1-score, sensitivity, and specificity metrics obtained by the LightGBM learning algorithm. The accuracy of the model shown in Figure 1 was calculated as 0.7755.
The results of the multi-categorical SHAP analysis are given separately for all categories of consumers’ eco-friendly product buying motivations: Figure 2 for Environmental Protection, Figure 3 for Health Care, and Figure 4 for Superior Quality.
The SHAP summary plot for the Environmental Protection category given in Figure 2 indicates that the most influential predictors of model output are related to the level of consumer information, product type, and salary range.
Specifically, being well informed or poorly informed about sustainability issues exerts strong and opposing effects on predictions, highlighting the importance of knowledge in shaping motivation toward environmental protection. Product type variables, particularly food and drinks, as well as personal care products, also play a notable role, suggesting that environmental considerations are more salient in everyday consumption categories. Salary levels under 2500 RON and between 2500 and 5000 RON are associated with significant SHAP contributions, which implies that socioeconomic factors influence the likelihood of environmentally protective attitudes or behaviors. Furthermore, criteria such as ingredient origin and original certification, along with barriers like high price or limited availability, demonstrate that both trust in eco-labels and the accessibility of sustainable products are critical in shaping decisions. Overall, the model emphasizes that information access, affordability, and the credibility of sustainability signals are the dominant drivers of environmentally motivated choices.
The SHAP summary plot for the Health Care category is given in Figure 3.
Figure 3 demonstrates that certification and ingredient-related criteria exert the strongest influence on the model’s predictions, underscoring the centrality of product credibility and transparency in shaping health-related motivations. Consumers who are well informed, as well as those perceiving price reductions or increased availability, also contribute significantly to prediction outcomes, suggesting that awareness, affordability, and accessibility collectively enhance health-related decision-making. Salary levels between 2500 and 5000 RON and under 2500 RON emerge as meaningful socioeconomic differentiators, reflecting the role of income in health care motivations. Product types, particularly food and drinks and personal care products, are highly relevant, indicating that health concerns are most salient in everyday consumption domains. Conversely, barriers such as high price, lack of trust in eco-labels, and limited availability negatively influence predictions, highlighting the challenges faced by consumers in adopting health-oriented behaviors. Overall, the findings reveal that trust-building mechanisms (certification and ingredient transparency), consumer education, and economic accessibility are the dominant factors driving motivations in the health care domain.
The SHAP summary plot for the Superior Quality category is given in Figure 4.
Figure 4 reveals that economic considerations, particularly perceptions of high price and the incentive of price reductions, are the most influential determinants of model predictions, highlighting the centrality of affordability in shaping quality-related motivations. Salary ranges, especially 2500–5000 RON and under 2500 RON, further emphasize the role of socioeconomic status in determining sensitivity to superior product quality.
Trust-related barriers, such as lack of confidence in eco-labels and limited product availability, negatively influence outcomes, suggesting that consumer skepticism and access constraints hinder perceptions of superior quality.
Information-related variables, including being partially or well informed, along with ingredient origin and certification, also exert a significant impact, pointing to the importance of credible knowledge and transparent product attributes in shaping quality judgments.
While product categories such as personal care and food items contribute moderately, structural supports such as sustainability campaigns or consumer recommendations appear less influential in this domain.
Collectively, these findings indicate that perceptions of superior quality are primarily driven by a balance of economic accessibility, trust in labeling systems, and informational transparency.
The ROC curve calculated for all categories of motivation is given in Figure 5.
The multi-class ROC curve given in Figure 5 illustrates notable variation in the model’s discriminative ability across categories.
The model performs best in predicting Superior Quality (AUC = 0.88), indicating a strong capacity to distinguish positive from negative cases in this class.
Predictions for Health Care are moderately accurate (AUC = 0.71), suggesting a fair but less robust level of discrimination.
By contrast, Environmental Protection shows weak predictive performance (AUC = 0.57), only slightly above random classification, which highlights substantial limitations in capturing the underlying patterns for this category.
Overall, the ROC analysis suggests that while the model is highly effective in identifying superior quality motivations, it struggles to generalize with the same accuracy in health care and especially environmental protection, underscoring the need for further refinement or targeted feature engineering in the weaker domains.
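A per-class AUC of the kind reported above can be read as the probability that a randomly chosen case of the class receives a higher score than a randomly chosen case of any other class, with ties counted as one half. The sketch below computes this one-vs-rest quantity directly from that definition. The six labels and the scores in `proba` are invented for illustration and do not reproduce the AUC values in Figure 5; `auc_one_vs_rest` is a hypothetical helper name.

```python
def auc_one_vs_rest(scores, labels, positive):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical true classes and predicted class scores for six respondents.
labels = ["EP", "HC", "HC", "SQ", "HC", "SQ"]
proba = {
    "EP": [0.2, 0.3, 0.1, 0.2, 0.2, 0.1],
    "HC": [0.5, 0.6, 0.7, 0.3, 0.6, 0.1],
    "SQ": [0.3, 0.1, 0.2, 0.5, 0.2, 0.8],
}

aucs = {cls: auc_one_vs_rest(scores, labels, cls) for cls, scores in proba.items()}
for cls, auc in aucs.items():
    print(f"{cls}: AUC = {auc:.2f}")
```

Under this reading, a value near 0.5 means the scores for that class barely separate its cases from the rest, which is how the Environmental Protection result above is interpreted.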
5. Discussion
The confusion matrix given in Figure 6 highlights clear discrepancies in the model’s classification performance across categories.
The Health Care class dominates correct predictions, with 34 instances accurately classified, reflecting the model’s stronger ability to identify this category.
However, the Environmental Protection and Superior Quality classes suffer from misclassification. Specifically, most Environmental Protection cases (four out of six) and Superior Quality cases (five out of eight) are incorrectly predicted as Health Care, indicating a systematic bias toward this majority or more easily separable class.
Only one instance of Environmental Protection and three instances of Superior Quality were classified correctly, underscoring the model’s limited discriminatory power in these categories.
The classification report given in Table 3 provides further evidence of uneven model performance across categories.
The Health Care class achieves the highest scores, with a precision of 0.79, recall of 0.97, and F1-score of 0.87, indicating that the model is both accurate and highly sensitive in identifying this category [27,28].
In contrast, the Environmental Protection class performs poorly, with a precision of 0.50, recall of only 0.17, and an F1-score of 0.25, suggesting frequent misclassifications and substantial difficulty in correctly capturing true instances.
The Superior Quality class shows moderate precision (0.75), comparable to that reported by Jashwant Kumar et al. [29], but weak recall (0.38), yielding an F1-score of 0.50; this imbalance implies that while predictions for this class are often correct when made, many true cases are overlooked. The macro average (F1 = 0.54) reflects the disparity between classes, while the weighted average (F1 = 0.73) is elevated by the model’s strong performance on the majority Health Care class.
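The per-class scores and the two averages can be recomputed from a confusion matrix, which makes the gap between the macro and weighted F1 easy to trace. In the sketch below, the diagonal counts and the cases misclassified as Health Care are taken from the text; the remaining cells are an assumption, inferred so that the reported precision values and a 49-case test set (20% of 245) hold. The helper name `classification_report` is illustrative and is not the scikit-learn function of the same name.

```python
CLASSES = ["Environmental Protection", "Health Care", "Superior Quality"]

# Rows = actual class, columns = predicted class, both in CLASSES order.
# Assumption: off-diagonal cells other than "predicted as Health Care" are inferred.
CONFUSION = [
    [1, 4, 1],
    [1, 34, 0],
    [0, 5, 3],
]


def classification_report(matrix):
    """Per-class precision, recall, F1, and support from a square confusion matrix."""
    report = {}
    for k, name in enumerate(CLASSES):
        tp = matrix[k][k]
        predicted = sum(row[k] for row in matrix)  # column total
        actual = sum(matrix[k])                    # row total
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[name] = {"precision": precision, "recall": recall,
                        "f1": f1, "support": actual}
    return report


report = classification_report(CONFUSION)
total = sum(r["support"] for r in report.values())
macro_f1 = sum(r["f1"] for r in report.values()) / len(report)
weighted_f1 = sum(r["f1"] * r["support"] for r in report.values()) / total

for name, r in report.items():
    print(f"{name:<25} precision={r['precision']:.2f} "
          f"recall={r['recall']:.2f} f1={r['f1']:.2f} n={r['support']}")
print(f"macro F1 = {macro_f1:.2f}, weighted F1 = {weighted_f1:.2f}")
```

The macro average weights the three classes equally, whereas the weighted average scales each F1 by its support, so the 35 Health Care cases dominate it.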
Overall, these results confirm that the model [30] exhibits a strong bias toward the dominant class, with limited sensitivity to minority categories, highlighting the need for methods such as class balancing, feature refinement, or model tuning to improve performance equity.
Sensitivity and Specificity Results
The sensitivity and specificity results reveal important asymmetries in the model’s classification capabilities, as presented in Table 4.
For Environmental Protection, the sensitivity is very low (0.17), indicating that most true cases are missed. The specificity is very high (0.98), suggesting that the model rarely misclassifies other categories as environmental protection.
Conversely, Health Care exhibits extremely high sensitivity (0.97) but very low specificity (0.36), meaning that the model is highly effective at identifying health care cases but tends to over-predict this category at the expense of the others. Superior Quality shows low sensitivity (0.38) coupled with high specificity (0.98), reflecting a similar pattern to Environmental Protection, where the model struggles to capture true positives but is reliable in avoiding false positives.
Overall, these findings confirm a systematic bias toward the Health Care category, with a strong detection of cases but at the cost of misrepresenting minority categories, underscoring the need for improved balance in predictive performance.
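Sensitivity and specificity for each class follow from treating that class as “positive” and all others as “negative”. The sketch below applies this one-vs-rest reading to a confusion matrix that is consistent with the figures reported in the text; cells not stated explicitly in the text are inferred and should be treated as an assumption, and `sensitivity_specificity` is a hypothetical helper name.

```python
CLASSES = ["Environmental Protection", "Health Care", "Superior Quality"]

# Rows = actual class, columns = predicted class (partly inferred; see lead-in).
CONFUSION = [
    [1, 4, 1],
    [1, 34, 0],
    [0, 5, 3],
]


def sensitivity_specificity(matrix, k):
    """One-vs-rest sensitivity TP/(TP+FN) and specificity TN/(TN+FP) for class k."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                  # class k cases predicted as another class
    fp = sum(row[k] for row in matrix) - tp   # other classes predicted as class k
    tn = total - tp - fn - fp
    return tp / (tp + fn), tn / (tn + fp)


results = {name: sensitivity_specificity(CONFUSION, k)
           for k, name in enumerate(CLASSES)}
for name, (sens, spec) in results.items():
    print(f"{name:<25} sensitivity={sens:.2f} specificity={spec:.2f}")
```

The low Health Care specificity comes from the nine minority-class cases predicted as Health Care out of fourteen non-Health Care cases.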
The overall accuracy of 77.6% indicates that the model correctly classifies roughly three out of four instances across all categories. While this level of accuracy appears satisfactory at first glance, it must be interpreted with caution given the class imbalance revealed in the confusion matrix and performance metrics.
The high accuracy is largely driven by the model’s strong performance on the dominant Health Care class, which inflates the overall score, whereas the minority classes (Environmental Protection and Superior Quality) exhibit substantially weaker recall and F1-scores.
Therefore, although the model demonstrates acceptable global accuracy, its practical utility remains limited due to inconsistent performance across categories, highlighting the importance of complementing accuracy with balanced metrics when evaluating multiclass models.
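One balanced metric that can complement accuracy is balanced accuracy, the unweighted mean of per-class recall. The sketch below contrasts the two using the correct-prediction counts given in the text. The Health Care support of 35 is an assumption, inferred from the reported recall of 0.97 and a 49-case test set; balanced accuracy itself is offered as an illustration and is not a metric reported in the study.

```python
# Correctly classified test cases per class, as stated in the text.
correct = {"Environmental Protection": 1, "Health Care": 34, "Superior Quality": 3}

# Test cases per class; the Health Care count is inferred (assumption).
support = {"Environmental Protection": 6, "Health Care": 35, "Superior Quality": 8}

# Overall accuracy: every case counts equally, so the majority class dominates.
accuracy = sum(correct.values()) / sum(support.values())

# Balanced accuracy: every class counts equally, regardless of its size.
recalls = {name: correct[name] / support[name] for name in support}
balanced_accuracy = sum(recalls.values()) / len(recalls)

print(f"accuracy          = {accuracy:.4f}")
print(f"balanced accuracy = {balanced_accuracy:.4f}")
```

Under these counts, the balanced figure falls well below the overall accuracy, which is the pattern the paragraph above describes.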
6. Conclusions
The results of the SHAP method indicate that considering oneself well informed about eco-friendly products is the most important factor for Environmental Protection.
For Health Care, the most effective factor is regarding original certification as the most important criterion when buying eco-friendly products.
For Superior Quality, the most effective factor is regarding high price as the main barrier to purchasing eco-friendly products.
On the other hand, consumers are losing trust in concepts such as “natural” and “organic”. Therefore, companies are advised to use original documents from independent and recognized organizations in their product certification processes and to make these documents clearly verifiable. Furthermore, eco-labels used on products should be based on concrete criteria, and this information should be made accessible online. High prices are one of the most significant barriers to purchasing environmentally friendly products. Companies should strengthen their local supply chains to reduce costs, leverage economies of scale in the use of recycled or renewable materials, and develop special pricing or campaigns for low-income groups. Many consumers are unaware of what true “eco-certification” means. Companies should explain their environmentally friendly production processes to consumers through social media campaigns, interactive web content, or educational videos, thereby increasing consumer knowledge and confidence. The low availability of sustainable products limits consumer choices. In this regard, companies may be advised to expand their distribution networks, make environmentally friendly products more visible on e-commerce platforms, and create special “green shelf” applications in their retail chains.
This study aimed to present a model that applies artificial intelligence to help improve customer satisfaction by analyzing customer engagement, demographic data, and behavior. Machine learning helps marketers predict which potential customers to prioritize, improving the effectiveness of the strategy. Locally, small traditional producers need more active promotion, including social media involvement. The launch of the Made in Romania program supports Romanian products.
By drawing on AI, it was possible to uncover what lies at the intersection of technological innovation, ethics, and societal impact on our daily life, and to obtain richer results. The challenges of using AI techniques are instructive, with an impact on approaches to AI, the future of AI, understanding AI, and progress in AI (see Figure 7).
The challenge in artificial intelligence is in building a future in which artificial intelligence systems contribute positively to society while respecting humanity’s values and preferences.
Opportunities using artificial intelligence include the following:
Efficiency in multiple areas, such as medicine, business, and customer loyalty.
Increased productivity.
Increased innovation and creativity.
Automation and business growth.
Higher return on investment.
Stronger customer relationships.
Improved brand loyalty.
Positive impacts on consumers.
The pillars of developing artificial intelligence regarding its contribution to research are presented in Figure 7.