4. Results
Table 3 displays the measurement quality criteria for the conceptual model's constructs—performance expectancy (PE), effort expectancy (EE), social influence (SI), and behavioural intention (BI)—as well as Perceptions of Digital Sustainability Requisites (PDS), excluding the moderating variable (AI awareness).
Table 3 indicates that all item loadings for the measured constructs exceed the commonly accepted reliability threshold of 0.70, suggesting that each item reliably reflects its corresponding latent variable.
Performance Expectancy (PE) demonstrates strong loadings across all five indicators (PE1 to PE5), ranging from 0.791 to 0.876, confirming that each item contributes substantially to the construct it is intended to measure. The construct also exhibits excellent convergent validity, with an Average Variance Extracted (AVE) of 0.717, while the Composite Reliability (CR = 0.927) and Cronbach's alpha (α = 0.921) values confirm a high degree of internal consistency.
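For reference, AVE and CR are conventionally computed from the standardised indicator loadings λ_i of a construct with n indicators, following the standard Fornell–Larcker formulation:

$$\mathrm{AVE} = \frac{\sum_{i=1}^{n} \lambda_i^{2}}{n}, \qquad \mathrm{CR} = \frac{\left(\sum_{i=1}^{n} \lambda_i\right)^{2}}{\left(\sum_{i=1}^{n} \lambda_i\right)^{2} + \sum_{i=1}^{n}\left(1 - \lambda_i^{2}\right)}$$

An AVE above 0.50 thus indicates that a construct explains more than half of the variance in its own indicators.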
Similarly, Effort Expectancy (EE) displays item loadings ranging from 0.774 to 0.871, all surpassing the 0.70 reliability threshold and supporting strong indicator reliability. The AVE for EE is 0.701, indicating that over 70% of the variance in its observed variables is explained by the underlying construct. Internal consistency is likewise high, as evidenced by a Cronbach's alpha of 0.909 and a composite reliability of 0.921. Together, these findings confirm the soundness of EE as a construct within the measurement model.
The construct of Social Influence (SI) demonstrates slightly lower but still acceptable item loadings, ranging from 0.744 to 0.869. Its AVE of 0.665 remains above the minimum recommended level of 0.50, indicating sufficient convergent validity, while a Cronbach's alpha of 0.908 and a CR of 0.899 confirm that the items consistently measure the intended latent concept.
Behavioural Intention (BI) exhibits exceptionally high indicator loadings, ranging from 0.952 to 0.984, suggesting outstanding reliability at the item level. With an AVE of 0.946, this construct captures nearly all the variance in its indicators, and the CR (0.981) and Cronbach's alpha (0.972) values are remarkably high, further establishing its robustness and reliability within the model.
In addition to the core constructs, the extended model incorporates Perceptions of Digital Sustainability Requisites (PDS) as a key outcome variable. All three items (PDS1 to PDS3) exhibit very high factor loadings, ranging from 0.853 to 0.921, well above the 0.70 reliability threshold. The AVE for this construct is 0.808, indicating that over 80% of the variance in the observed indicators is accounted for by the underlying construct, and internal consistency is confirmed by a Cronbach's alpha of 0.890 and a composite reliability of 0.927.
Therefore, these findings provide strong evidence for the empirical validity of PDS and highlight its significance in connecting behavioural intention with the sustainable and responsible integration of AI technologies in educational settings. The results show that the constructs in the measurement model meet or surpass the criteria for indicator reliability, convergent validity, and internal consistency. Furthermore, these findings confirm the psychometric robustness of the model and support its use for further structural model assessment and hypothesis testing.
Table 4 presents the factor cross-loading analysis used to assess discriminant validity within the PLS-SEM framework. This analysis examines whether each indicator demonstrates its strongest association with its theoretically assigned construct relative to all other constructs. Consistent with established methodological standards, discriminant validity is confirmed when each indicator's loading is substantially higher on its corresponding latent construct than on any other construct in the analysis [54].
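Formally, the cross-loading criterion requires each indicator i assigned to construct k to load more strongly on k than on any other construct j:

$$\lambda_{ik} > \lambda_{ij} \quad \text{for all } j \neq k$$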
Table 4 provides strong evidence for discriminant validity across all measured constructs. The Performance Expectancy (PE) indicators (PE1–PE5) all load above 0.79 on their own construct and notably lower on the others; each item demonstrates its highest loading on the PE construct, confirming the distinctiveness of this latent variable within the Structural Equation Modelling (SEM) framework.
Similarly, the Effort Expectancy (EE) indicators (EE1–EE5) display loadings greater than 0.77 on the EE construct, while their cross-loadings on other constructs remain comparatively lower. Each EE item is most strongly associated with its own construct, and this pattern is consistent across all EE indicators, reinforcing discriminant validity.
Likewise, the Social Influence (SI) indicators (SI1–SI5) show loadings above 0.74 on the SI construct, with smaller cross-loadings on other latent variables. Each SI item loads highest on its intended construct, confirming the distinct measurement of social influence in the model.
Additionally, the Behavioural Intention (BI) indicators (BI1–BI3) exhibit exceptionally high loadings, ranging from 0.952 to 0.984, on the BI construct, and substantially lower values across the remaining constructs. This strong loading pattern affirms excellent construct separation and supports the reliability of BI as a core outcome variable in the model.
Finally, the extended model introduces the construct Perceptions of Digital Sustainability Requisites (PDS), which also demonstrates robust discriminant validity. The PDS indicators (PDS1–PDS3) exhibit high loadings of 0.853 to 0.921 on their own construct, while their loadings on PE, EE, SI, and BI remain considerably lower.
For example, PDS1 loads 0.892 on the PDS construct, compared with values below 0.53 on the other constructs. These findings indicate that each item is uniquely aligned with the PDS factor, validating its distinct role in the model and affirming its theoretical contribution to assessing the sustainable adoption of AI-NLP tools (such as Copilot/ChatGPT) in education.
Overall, items across the constructs PE, EE, SI, PDS, and BI load strongly on their respective latent variables and lower on all others, confirming discriminant validity for every measured construct and demonstrating the soundness of the measurement model. To ensure a comprehensive assessment of discriminant validity, the findings were cross-validated using both the Fornell–Larcker criterion and the Heterotrait–Monotrait (HTMT) ratio.
As Table 5 shows, the square root of the AVE for each construct (diagonal values) exceeds its correlations with all other constructs (off-diagonal values), thereby confirming discriminant validity within the measurement model.
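In other words, the Fornell–Larcker criterion requires, for every construct k and every other construct j:

$$\sqrt{\mathrm{AVE}_k} > \left| r_{kj} \right| \quad \text{for all } j \neq k$$

so that each construct shares more variance with its own indicators than it shares with any other construct.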
The square root of the AVE for Effort Expectancy (EE) equals 0.837, greater than its highest correlation with any other construct, namely Performance Expectancy (PE) at r = 0.597. Likewise, the square root of the AVE for Performance Expectancy (PE) equals 0.846, exceeding its correlation with Behavioural Intention (BI) (r = 0.698), thereby supporting the distinctiveness of the construct.
Similarly, Social Influence (SI) demonstrates a square root of AVE of 0.815, which surpasses its strongest correlation with BI (r = 0.464), confirming its uniqueness as a construct in the model. Behavioural Intention (BI) also meets the Fornell–Larcker criterion, with a square root of AVE of 0.973, substantially higher than its correlations with other constructs, further affirming discriminant validity.
The extended model's construct, Perceptions of Digital Sustainability Requisites (PDS), also satisfies the Fornell–Larcker criterion: its square root of AVE is 0.899, clearly greater than its correlations with PE (r = 0.508), EE (r = 0.492), SI (r = 0.473), and BI (r = 0.487). This substantial gap between the AVE square root and the inter-construct correlations confirms that PDS is empirically distinct and conceptually relevant within the measurement model, especially for assessing the sustainable integration of AI-NLP tools (such as Copilot/ChatGPT) in educational settings.
In addition to the Fornell–Larcker criterion, the Heterotrait–Monotrait (HTMT) ratio, an established method for assessing discriminant validity in structural equation modelling (Henseler et al., 2015), was employed; Table 6 presents the results. Under this criterion, values below the conservative threshold of 0.85 (or, under a more lenient standard, 0.90) indicate sufficient discriminant validity.
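For reference, the HTMT statistic for a pair of constructs i and j is the mean of the heterotrait correlations (between indicators of different constructs) relative to the geometric mean of the average monotrait correlations (among indicators of the same construct):

$$\mathrm{HTMT}_{ij} = \frac{\bar{r}_{ij}}{\sqrt{\bar{r}_{ii}\,\bar{r}_{jj}}}$$

where $\bar{r}_{ij}$ is the average correlation between the indicators of constructs i and j, and $\bar{r}_{ii}$ and $\bar{r}_{jj}$ are the average correlations among the indicators within each construct.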
All construct pairs exhibit HTMT values well below this threshold, confirming that each construct is statistically distinct from the others in the model. The values for key construct pairs, such as Performance Expectancy ⟷ Behavioural Intention (0.754), Effort Expectancy ⟷ Behavioural Intention (0.635), and Social Influence ⟷ Effort Expectancy (0.443, the lowest value and thus the strongest discriminant separation), are all within acceptable bounds. Likewise, all HTMT values involving PDS fall well below the 0.85 threshold: PDS ⟷ Behavioural Intention (0.512), PDS ⟷ Effort Expectancy (0.529), PDS ⟷ Performance Expectancy (0.545), and PDS ⟷ Social Influence (0.498). These results reinforce that each construct represents a unique theoretical dimension within the model and further confirm the robustness of its discriminant validity.
As Table 7 shows, the structural model analysis reports path coefficients (β), t-values, p-values, and R² values. These estimates were obtained using the bootstrapping procedure to evaluate both the direct effects (H1–H3 and H7) and the moderation effects (H4–H6) within the extended Unified Theory of Acceptance and Use of Technology (UTAUT) model.
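In PLS-SEM bootstrapping, the significance of each path is typically assessed with a t-statistic formed by dividing the estimated coefficient by its bootstrap standard error (the standard deviation of the estimate across the resamples):

$$t = \frac{\hat{\beta}}{\mathrm{SE}_{\mathrm{boot}}(\hat{\beta})}$$

with the corresponding p-value derived from this statistic.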
The analysis confirms that the three core predictors—PE, EE, and SI—have positive and statistically significant effects on BI, and that Behavioural Intention (BI), in turn, significantly influences learners' Perceptions of Digital Sustainability Requisites (PDS) in the context of using AI-NLP tools (such as Copilot and ChatGPT) in educational settings.
Notably, the moderating effects of AI awareness on the relationships of PE, EE, and SI with BI (H4–H6) were not statistically significant (p > 0.05), despite positive β values. This indicates that students' awareness of AI-NLP tools may not significantly alter how these core constructs influence their intention to adopt such tools.
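For clarity, the moderation hypotheses correspond to latent interaction terms in the structural equation for BI; a standard specification (with AA denoting AI awareness, and with the exact estimation approach, e.g., product-indicator or two-stage, depending on the software used) is:

$$\mathrm{BI} = \beta_1 \mathrm{PE} + \beta_2 \mathrm{EE} + \beta_3 \mathrm{SI} + \beta_4 \mathrm{AA} + \beta_5 (\mathrm{PE} \times \mathrm{AA}) + \beta_6 (\mathrm{EE} \times \mathrm{AA}) + \beta_7 (\mathrm{SI} \times \mathrm{AA}) + \zeta$$

Non-significant estimates of β5–β7 indicate that AA does not materially alter the strength of the PE, EE, and SI effects on BI.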
As a possible explanation, although students report general awareness of AI-NLP tools, this awareness may be too superficial, or too uniformly distributed across the sample, to create meaningful variance in behavioural responses [56]. Additionally, AI awareness might conceptually overlap with the core predictors—particularly PE—thereby diminishing its distinct moderating effect. These findings emphasise the need for more targeted AI literacy initiatives that improve not only familiarity with AI systems but also critical understanding of them, especially within medical education contexts.
Future studies should use longitudinal tracking to examine how students' AI literacy evolves and to assess whether deeper conceptual understanding, beyond surface-level awareness, more effectively enhances or moderates adoption behaviour.
These results provide strong empirical support for hypotheses H1 through H3 and H7, indicating substantial explanatory power. In particular, the analysis confirms a strong and significant positive effect of BI on students' PDS (H7: β = 0.72, p < 0.01, R² = 0.51), indicating that students who are more willing to adopt AI tools such as ChatGPT and Copilot are also more likely to recognise the importance of responsible and sustainable AI integration in education. In addition, all VIF values fell well below the recommended threshold, confirming the absence of multicollinearity among the predictors.
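For reference, collinearity is conventionally diagnosed with the variance inflation factor, where $R_j^2$ is the proportion of variance in predictor j explained by the remaining predictors:

$$\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}$$

Values near 1 indicate negligible collinearity; thresholds of 3.3 or 5 are commonly applied in PLS-SEM.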
The strong BI–PDS relationship highlights the role of behavioural intention not only in technology adoption but also in advancing broader sustainability goals. It further suggests that promoting the informed and intentional use of AI among students may be a critical pathway to achieving digital sustainability aligned with the Sustainable Development Goals (SDG 4 and SDG 9). These insights may inform institutional policies on embedding digital sustainability within AI adoption frameworks in higher education.
5. Discussion
This study contributes to the emerging literature on artificial intelligence in medical education by examining how learners perceive AI-powered Natural Language Processing (NLP) tools (e.g., ChatGPT and Copilot) within the context of digital sustainability. Based on the Unified Theory of Acceptance and Use of Technology (UTAUT), the findings offer both theoretical and practical insights into the behavioural drivers behind AI-NLP adoption among medical students.
The results indicate that all three core predictors—performance expectancy (PE), effort expectancy (EE), and social influence (SI)—significantly and positively influence learners' behavioural intention (BI) to adopt AI-NLP tools. Among these, PE emerged as the most influential, suggesting that students prioritise tools that demonstrably improve educational outcomes. Additionally, respondents indicated that AI-NLP tools are effective for enhancing academic performance, engagement, and task achievement, findings that align with earlier studies on technology adoption in educational settings [11,38,39].
Effort expectancy (EE) also played a critical role, highlighting that ease of use remains a significant factor in acceptance. Students with higher levels of digital literacy and fewer perceived barriers were more receptive to integrating AI tools into their learning routines, consistent with prior research [28,32,57]. These results confirm that accessible, user-friendly interfaces are not just desirable but essential for meaningful student engagement.
Social influence (SI) refers to the impact of peers, such as friends and colleagues, on students' adoption of AI-NLP tools. This influence operates through three main mechanisms: shaping personal beliefs, promoting conformity, and motivating behaviour. The results suggest that AI tools are widely accepted and commonly used among participants, which increases the likelihood that students will adopt them. This aligns with previous findings [18,32,57,58].
A particularly novel aspect of this study is the significant association between students' behavioural intention and their perceptions of digital sustainability (PDS). This extends the UTAUT model by showing that learners' motivations are driven not only by utility but also by ethical and environmental considerations. Moreover, students with higher adoption intentions also reported stronger alignment with sustainable digital practices, indicating that AI integration in education is increasingly viewed through the lens of social responsibility. These findings resonate with research emphasising the connection between digital adoption, environmental responsibility, and educational equity [40,59,60,64].
Contrary to theoretical expectations, AI awareness did not significantly moderate the relationships between the UTAUT predictors (PE, EE, SI) and BI. One reasonable explanation is that medical students already exhibit a high level of functional familiarity with AI-NLP tools, which may create a ceiling effect: this widespread baseline knowledge reduces variance in responses, thereby diminishing the moderating role of AI awareness.
Furthermore, much of this awareness may be operational (e.g., how to use the tools) rather than critical (e.g., understanding their ethical, social, or sustainability implications). As prior studies suggest, general familiarity alone is insufficient to alter behavioural intention unless coupled with essential digital literacy [14,47,61,62].
Collectively, the results suggest that the adoption of AI-NLP tools in medical education is driven primarily by perceived usefulness and ease of use, with students’ self-reported AI awareness playing a marginal role. This underscores the need to shift from passive exposure to structured AI literacy initiatives that cultivate a deeper understanding of both technical functionalities and broader ethical and sustainability considerations.
Beyond educational benefits, AI-NLP tools can enhance medical practice by improving diagnostic accuracy, streamlining clinical documentation, and reducing administrative burdens, contributing to more efficient and sustainable healthcare systems. Thus, integrating AI into medical training is not merely a technological upgrade but a strategic imperative for preparing future healthcare professionals to innovate responsibly.
Overall, to ensure meaningful adoption, medical educators, researchers, and policymakers should embed AI and digital sustainability literacy into formal curricula. Additionally, modules and structured programs must equip learners with not only technical proficiency but also the ethical discernment and sustainability awareness needed to harness AI’s potential in clinical and academic settings.
6. Conclusions
Artificial Intelligence (AI) is transforming medical education by enhancing students’ digital competencies and supporting Sustainable Development Goal (SDG) 4, which aims to promote inclusive, equitable, and quality education for all. Moreover, AI-powered Natural Language Processing (AI-NLP) tools, such as ChatGPT and Copilot, facilitate personalised learning, provide real-time feedback, and organise knowledge management processes.
Additionally, to prepare responsible and future-ready healthcare professionals, medical schools should embed AI literacy and digital sustainability into their curricula. This can be achieved through structured academic programs, interdisciplinary modules, and hands-on learning experiences that develop both technical skills and awareness.
Notably, policymakers in higher education, especially those in medical schools, should not view AI merely as a technological tool but as a strategic driver of ethical, sustainable, and high-quality medical education. Likewise, AI integration with broader educational and environmental objectives can enable medical schools to prepare a new generation of digitally skilled, ethically grounded, and socially responsible healthcare professionals.
Furthermore, future studies should explore how medical learners’ AI awareness evolves and how it shapes their behavioural intentions. Longitudinal studies would provide valuable insights into these dynamics. Cross-cultural and institutional comparisons are also recommended to evaluate the generalisability of current findings and uncover context-specific variations. Moreover, mixed-methods approaches, such as qualitative interviews and observational studies, can offer a deeper understanding of how AI-NLP tools affect teaching and learning experiences.
Additionally, research is necessary to evaluate the digital sustainability outcomes of AI adoption in medical education, including reductions in resource consumption and improvements in digital efficiency. Furthermore, incorporating the perspectives of faculty and administrators will be essential in developing a holistic and sustainable model for AI implementation in healthcare education.
Policymakers should support AI integration by establishing regulatory frameworks, sustainable funding mechanisms, and infrastructure policies that address key challenges, such as energy consumption, data privacy, and the digital divide. These efforts will help ensure that AI adoption promotes both educational excellence and environmental responsibility.
7. Future Study Opportunities and Limitations
While this study provides important insights into medical learners’ behavioural intentions to adopt AI-powered NLP tools and their perceptions of digital sustainability, several limitations should be acknowledged.
First, the use of a cross-sectional survey design limits the ability to draw causal inferences between constructs. As such, future research should adopt longitudinal designs to explore how these relationships evolve and to verify potential causal pathways between AI adoption behaviours and sustainability perceptions.
Second, the study sample was restricted to medical students in Saudi Arabia. Although this focus offers valuable regional insights, it may limit the generalisability of the findings. Future research should incorporate cross-cultural and cross-disciplinary comparisons to assess whether adoption patterns and sustainability attitudes differ across educational systems, professional contexts, and cultural settings.
Third, the reliance on self-reported measures of digital sustainability presents a methodological limitation. Future studies should consider mixed methods approaches, integrating qualitative interviews and objective data sources (e.g., energy usage analytics, institutional sustainability reports, or digital footprint assessments) to strengthen the validity and contextual depth of the findings.
Furthermore, although the study employed validated scales adapted from prior research, future studies should further assess the internal consistency and construct validity of these instruments across broader and more diverse samples. Expanding the number of items for key constructs, such as perceptions of digital sustainability and AI awareness, may enhance measurement depth and improve the robustness of model estimations. This would help ensure that theoretical constructs are captured with greater nuance, particularly within rapidly evolving educational environments shaped by AI tools.
Finally, while student perspectives are central to understanding the adoption of AI in education, a comprehensive understanding also requires examining the views of other stakeholders. Future research should include faculty experiences, administrative considerations, and institutional policy frameworks to develop a more holistic and sustainable model for AI integration in medical education.
Additionally, to enhance the practical application of this research, policymakers in higher education are advised to develop national strategies that integrate AI literacy and digital sustainability into their policies. Medical schools are encouraged to implement structured curricular frameworks that incorporate awareness of AI and sustainability-focused digital practices, aligning with SDG 4 and SDG 9.