1. Introduction
AI is a broad field within computer science that focusses on the development of systems capable of performing tasks that normally require human intelligence, such as problem solving, reasoning, learning and decision making [1,2]. AI encompasses various subfields, including machine learning, natural language processing and computer vision, which enable machines to analyse data, recognise patterns and generate answers autonomously. One of the most transformative advances within AI is generative AI, which refers to AI models that are specifically designed to create new content such as text, images, audio and even code, rather than simply processing or analysing existing data [3,4]. Unlike traditional AI systems, which primarily assist with classification, prediction and automation, generative AI utilises deep learning techniques, such as large language models, to generate human-like responses and adapt to different input contexts [5,6,7].
The rise of generative AI has created new opportunities for students to engage with course content, automate repetitive tasks and enhance creativity [4]. These tools have become prevalent in academic settings, enabling personalised tutoring, writing assistance and real-time feedback for students [6,8]. In contrast to traditional AI applications that focus on tasks such as medical diagnostics, robotic process automation or data analysis, generative AI actively creates new content rather than merely interpreting or classifying existing information [9].
In nursing education, generative AI can be particularly valuable as it complements traditional pedagogical approaches and provides AI-powered simulations, adaptive learning tools and virtual assistants that improve clinical reasoning and decision making [6,10]. For example, AI-supported simulations can replicate complex patient scenarios and allow students to practise decision making in a risk-free environment. This transformation is changing the way health science students access, process and apply knowledge, preparing them for the changing demands of the healthcare industry [2,6,8,11]. However, the integration of generative AI tools into nursing education is still in its infancy and requires a thorough assessment of its benefits and limitations [3,12]. Nursing is an evidence-based profession that requires critical thinking, problem solving and hands-on clinical experience. As AI is increasingly integrated into healthcare, supporting patient monitoring, diagnostics and decision making, it is imperative that health science students develop AI skills [3,10]. AI-driven clinical decision support systems and predictive analytics are already being integrated into modern healthcare, making AI skills essential for future nurses [13].
Several studies show that health science students are increasingly using generative AI tools for academic purposes, such as research, data analysis, writing assistance, grammar checking, creation of study materials and simulated patient interactions [4,6]. However, privacy, ethical use and academic integrity concerns remain, requiring further research and institutional guidance [8]. A growing body of research also suggests that student readiness to use generative AI varies due to gaps in technological literacy, lack of time and institutional policies [10,11,14]. The literature suggests that generative AI can enhance nursing education by improving access to academic resources, increasing engagement through AI-powered tutoring systems and providing real-time feedback to refine academic writing [4,7,15]. In addition, AI-supported writing tools have been introduced in nursing education where students are encouraged to critically evaluate AI-generated content rather than passively accept it [6,16].
Building on this, several recent studies have explored the diverse and evolving applications of generative AI in higher education. These include its pedagogical potential for personalised learning, academic productivity and interdisciplinary engagement, as outlined in a thematic review by Yusuf et al. [17], and its integration into immersive digital environments such as virtual simulations and intelligent tutoring systems, particularly in medical and health science education [18]. Others have explored the benefits of generative AI for specific groups of learners. Ronksley-Pavia et al. [19] discuss its potential for supporting neurodiverse learners, albeit with the caveat of overconfidence and the need for human supervision. Ethical and affective aspects have also been emphasised. For example, Schilke and Reimann [20] found that increased transparency can paradoxically reduce trust in AI systems, while Duran et al. [21] used sentiment analysis to demonstrate that students’ ethical concerns strongly influence their attitudes towards the use of AI in education. Qutishat et al. [22] looked specifically at the context of nursing, emphasising both the educational benefits of AI-enhanced simulations and the institutional and financial constraints on their wider application. From a policy and governance perspective, Andersen et al. [23] argue for field-specific policies that reflect the variability of research phases across disciplines, and Lünich et al. [24] emphasise the tensions between public and student perceptions of data use and transparency in higher education. Finally, Fine Licht [25] questions moral prohibitions on generative AI in higher education and instead argues in favour of ethical education based on responsibility and practicality.
Despite these advantages, some challenges remain. Ethical concerns related to allegations of plagiarism and overreliance on AI-generated content have been widely discussed [8,11,26]. Ronksley-Pavia et al. [19] point out that, while AI tools provide personalised support, they can also encourage passive consumption and raise concerns about the accuracy of AI-generated information and overreliance on automation. Studies suggest that generative AI tools can create “hallucinations”, i.e., factually incorrect or misleading content that can impair students’ learning and professional judgement [6,10]. In addition, AI models often struggle with bias, as they are trained on datasets that may not represent diverse patient populations, which can lead to biased outcomes in healthcare decision making [4,11,27]. Another major challenge is ensuring that health science students possess generative AI skills, or AI competency. Many students lack formal training in AI, which can lead to misuse or underutilisation of these tools [6]. This knowledge gap extends to teachers as well: some educators express concerns about their ability to effectively integrate AI into curricula [10,28,29]. Financial constraints further complicate the adoption of AI, as educational institutions must devote significant resources to purchasing and maintaining AI technologies, training teachers and developing AI-related courses [8,11,30,31].
Recent literature also questions blanket restrictions on generative AI in education. Fine Licht [9] argues that restrictive measures are often based on insufficiently reasoned moral arguments and recommends instead a balanced, responsible integration of generative AI into curricula. Similarly, Andersen et al. [23] emphasise the need for field-specific, flexible guidelines that are guided by disciplinary norms and evolving research practices, rather than ethical codes that are identical for all disciplines.
On the path towards the Sustainable Development Goals, the integration of generative AI should be both future-orientated and ethically responsible. Sustainable adoption involves not only providing equal access and promoting digital literacy but also helping students develop the critical judgement needed to use these tools wisely, especially in the health sciences, where ethical competence is as important as technical skills. Empirical evidence from a recent study by Gonzalez-Garcia et al. [32] provides further insight into this dynamic. Their findings show that the majority of nursing students regularly use ChatGPT-4 (OpenAI) for academic purposes, particularly for clarifying concepts and writing assignments. However, the study also highlights a worrying lack of critical judgement among students, with many relying on AI-generated results without checking their accuracy. The authors emphasise the need for curricular measures that promote ethical awareness, critical thinking and responsible use of AI in nursing education [32].
Aim of the Study
The integration of generative AI into nursing education is inevitable, but its application remains controversial and underexplored. While AI enhances learning opportunities, it also raises concerns about ethical use, overdependence and misinformation. The aim of this study was to investigate the perceptions and challenges of health science students in relation to the use of AI in their studies. Specifically, the study aimed to: (1) examine the patterns of generative AI use among health sciences students; (2) assess students’ self-reported competence and confidence in using AI for academic tasks; (3) identify the perceived benefits and limitations of generative AI in nursing education; (4) explore ethical concerns and issues of academic integrity related to AI; and (5) assess students’ attitudes towards the role of AI in their learning and future professional practice.
These aims are reflected in the following research questions:
- How frequently and for what academic purposes do health sciences students use generative AI tools in their studies?
- What are the attitudes of health science students towards the ethical, pedagogical and trust-related aspects of using generative AI in higher education?
- What contextual and institutional factors influence students’ willingness to use generative AI and its responsible integration into learning processes?
2. Materials and Methods
In this study, a descriptive cross-sectional design was used to investigate the perceptions and challenges of health science students in relation to the use of generative AI [33].
2.1. Setting and Sample
The study was conducted in Slovenia at the Faculty of Health Sciences (University of Primorska), a higher education institution specialising in health sciences. It involved students enrolled in four health-related study programmes: Applied Kinesiology, Physiotherapy, Nursing and Dietetics. In the 2024/2025 academic year, a total of 619 students were enrolled (first-time enrolments) across all three academic years of these programmes: 132 in Applied Kinesiology, 147 in Physiotherapy, 231 in Nursing and 109 in Dietetics (Table 1).
A convenience sample comprised 397 students, which corresponds to approximately 64.1% of the eligible population (n = 619). Of the remaining students, 54 did not access the online questionnaire at all, while 146 students opened the link to the questionnaire but did not complete it and 22 responses were excluded due to incompleteness. Only fully completed questionnaires were included in the final analysis. An a priori power analysis was conducted using G*Power 3.1 software to determine the minimum sample size required. Assuming a medium effect size (Cohen’s d = 0.5), a one-sided t-test for independent samples, α = 0.05 and a statistical power of 0.95, the analysis resulted in a minimum required sample size of 176 participants (88 per group). The sample size of 397 participants collected exceeded this threshold, ensuring sufficient statistical power for the hypothesis tests. In addition, a sample size estimate was performed to assess representativeness for descriptive conclusions. Assuming a conservative expected response distribution of 50%, a 95% confidence level and a 5% margin of error, the required minimum sample size was calculated to be 238 participants. The sample obtained thus provided sufficient coverage to ensure the robustness of the descriptive statistics.
In addition, the sample size was sufficient for the psychometric validation procedures, including exploratory factor analysis (EFA) and assessment of internal consistency and reliability. Methodological guidelines recommend a respondent-to-item ratio of at least 5:1 to 10:1 for EFA [34], which corresponds to a required sample size of 150 to 300 participants for a 30-item scale. The achieved sample size of 397 participants provided a solid empirical basis for scale development. The reliability assessed with Cronbach’s alpha is also considered stable for samples with more than 200 participants [35].
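The power and sample-size calculations reported above were performed with G*Power; for transparency, a minimal sketch of equivalent calculations in Python is shown below. This is an illustration only: statsmodels is used for the t-test power analysis and the standard finite-population formula for the descriptive estimate, with all numeric inputs taken as the assumptions stated in the text.

```python
from math import ceil
from statsmodels.stats.power import TTestIndPower

# Independent-samples t-test: d = 0.5, one-sided, alpha = 0.05, power = 0.95
n_per_group = TTestIndPower().solve_power(
    effect_size=0.5, alpha=0.05, power=0.95, alternative="larger"
)
print(ceil(n_per_group))  # ~88 per group, 176 in total

# Descriptive estimate: 50% expected proportion, 95% confidence, 5% margin of error,
# with finite-population correction for the N = 619 eligible students
z, p, e, N = 1.96, 0.5, 0.05, 619
n_infinite = z**2 * p * (1 - p) / e**2              # ~384 for an unlimited population
n_adjusted = n_infinite / (1 + (n_infinite - 1) / N)
print(ceil(n_adjusted))  # ~238 participants
```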
2.2. Questionnaire Design
The initial version of the instrument was developed based on a review of current literature examining students’ attitudes, usage behaviour and concerns about the use of AI in higher education. Five key constructs identified in several empirical studies formed the conceptual basis for the development of the items: perceived benefits [36,37], integration into study routines [38], ethical responsibility [39], concerns and doubts about AI [40] and privacy and data protection [41].
The original version of the instrument consisted of 30 items, which were formulated as declarative statements and rated on a five-point Likert scale from 1 (strongly disagree) to 5 (strongly agree). Higher mean scores reflect a more positive or committed attitude towards AI in education, depending on the specific subscale: stronger perception of benefits, stronger integration into academic practice, stronger ethical responsibility and stronger awareness of or concern about risks related to privacy or misuse. Negatively worded items were reverse coded to ensure consistency of interpretation across the scale.
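As an illustration of the reverse coding step, a minimal sketch is given below; the column name is hypothetical, and on a five-point scale an item is reversed as 6 minus the raw score.

```python
import pandas as pd

# Hypothetical responses to a negatively worded item on a 1-5 Likert scale
df = pd.DataFrame({"item_27": [1, 2, 5, 4, 3]})

reverse_coded_items = ["item_27"]                      # negatively worded items (hypothetical list)
df[reverse_coded_items] = 6 - df[reverse_coded_items]  # 1<->5, 2<->4, 3 unchanged
```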
Content validity was determined based on the judgement of a panel of six experts (three women and three men aged between 37 and 53 years, with an average age of 44.5 years), all of whom had a professional background in health sciences and experience in educational research or curriculum development. Items whose item-level content validity index (I-CVI) was below 0.83 were revised or excluded, in accordance with established thresholds for expert agreement [42]. Scale-level content validity (S-CVI/Ave) was high (0.96) and both Fleiss’ kappa (κ = 0.67) and Kendall’s W (W = 0.57, p < 0.001) confirmed considerable inter-rater agreement.
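For readers unfamiliar with the indices, a short sketch of how I-CVI and S-CVI/Ave are typically computed from expert relevance ratings is shown below. The data are hypothetical, and ratings of 3 or 4 on a four-point relevance scale are treated as “relevant”, following the usual I-CVI convention.

```python
import numpy as np

# Hypothetical relevance ratings: 6 experts x 30 items on a 1-4 scale
rng = np.random.default_rng(0)
ratings = rng.integers(1, 5, size=(6, 30))

relevant = (ratings >= 3).astype(int)        # dichotomise: relevant vs. not relevant
i_cvi = relevant.mean(axis=0)                # item-level CVI: proportion of experts rating the item relevant
s_cvi_ave = i_cvi.mean()                     # scale-level CVI (averaging method)
flagged_items = np.where(i_cvi < 0.83)[0]    # items to revise or exclude at the threshold used above
```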
An EFA was first conducted to assess the construct validity of the preliminary 30-item version of the instrument. The extraction method used was principal axis factoring with Promax rotation and Kaiser normalisation, which was chosen because of the expected correlation between the factors and the theoretical basis of the scale. The suitability of the data for factor analysis was confirmed using the Kaiser–Meyer–Olkin (KMO) measure = 0.899 and Bartlett’s test of sphericity (χ² = 3641.637, p < 0.001). Factors were retained based on a combination of the Kaiser criterion (eigenvalues > 1), inspection of the scree plot and the results of a parallel analysis, which overall supported a five-factor solution (Table S1 and Figure S1). Three items were removed due to low communalities, cross-loadings and lack of theoretical coherence, resulting in a final instrument with 27 items.
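A minimal sketch of this EFA workflow in Python is shown below, using the factor_analyzer package as an open-source stand-in for the software actually used; the item response matrix and its file name are assumptions.

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer
from factor_analyzer.factor_analyzer import calculate_kmo, calculate_bartlett_sphericity

items = pd.read_csv("saaihe_30_items.csv")            # hypothetical 30-item response matrix

kmo_per_item, kmo_total = calculate_kmo(items)        # sampling adequacy (reported: 0.899)
chi2, p_value = calculate_bartlett_sphericity(items)  # Bartlett's test of sphericity

# Principal-factor extraction with oblique (Promax) rotation, five factors retained
efa = FactorAnalyzer(n_factors=5, rotation="promax", method="principal")
efa.fit(items)

loadings = pd.DataFrame(efa.loadings_, index=items.columns)
communalities = efa.get_communalities()               # low values flag candidate items for removal
eigenvalues, _ = efa.get_eigenvalues()                # for the Kaiser criterion and scree inspection
```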
A confirmatory factor analysis was then conducted on the same dataset (N = 619) using the robust weighted least squares estimation method (WLSMV). All factor loadings were above 0.40, indicating an adequate representation of the items. The model fit indices indicated an acceptable fit of the five-factor model: CFI = 0.934, TLI = 0.896, RMSEA = 0.058 and SRMR = 0.033. Although the TLI did not exceed the recommended threshold of 0.95, it remained close to the acceptable value, especially considering the exploratory nature of the scale. Before calculating Cronbach’s alpha, the assumptions were checked and fulfilled: inter-item correlations ranged from 0.31 to 0.68, indicating adequate item homogeneity without redundancy, and all corrected item-total correlations were above the threshold of 0.30. The Cronbach’s alpha for the 27-item scale was 0.90, with individual subscale values ranging from 0.73 to 0.88, indicating strong internal consistency. Seven items (in F4—Concerns and doubts about the use of AI and F5—Privacy and data protection in the use of AI) were reverse coded to ensure conceptual accuracy and reduce acquiescence bias.
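The internal-consistency checks described here can be reproduced with standard open-source tools; a minimal sketch is shown below (pingouin for Cronbach’s alpha, pandas for the corrected item-total correlations). The 27-item data frame "items27" is an assumption.

```python
import pingouin as pg

# "items27" is the assumed data frame of the 27 retained items
alpha, ci = pg.cronbach_alpha(data=items27)           # reported overall alpha: 0.90

# Corrected item-total correlations: each item vs. the sum of the remaining items
item_total = {
    col: items27[col].corr(items27.drop(columns=col).sum(axis=1))
    for col in items27.columns
}
weak_items = [col for col, r in item_total.items() if r < 0.30]  # threshold used above
```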
Due to practical constraints, the same sample was used for both the EFA and the CFA. Although this approach may limit the generalisability of the model, it was deemed acceptable for this initial scale development; it is acknowledged as a limitation, and further validation of the instrument on an independent sample is warranted [43]. To reflect the content and purpose of the scale, the final version was named Student Attitudes Toward Artificial Intelligence in Higher Education (SAAIHE). SAAIHE is a 27-item self-report instrument designed to measure higher education students’ perceptions, behavioural tendencies, ethical considerations and concerns regarding the use of generative AI tools in their academic work. It consists of five empirically supported subscales. Higher total scores indicate a more positive and responsible attitude towards the integration of AI in higher education.
2.3. Data Collection and Data Analysis
The data was collected via an online questionnaire provided on the secure academic platform 1ka.si. The survey was available between November 2024 and February 2025. Once the survey was approved, the Office of Student Affairs distributed an email invitation with the survey link to all eligible students. To increase the response rate, reminder emails were sent approximately every 14 days throughout the data collection period.
The data was analysed using IBM SPSS Statistics version 29.0 and Jamovi (The Jamovi Project) version 2.6.26 (open statistics platform). Descriptive statistics were used to summarise the demographic characteristics and responses to the items. The distribution of the data was assessed using the Kolmogorov–Smirnov test, which showed that the variables were not normally distributed. Therefore, non-parametric tests were used in the subsequent analyses. Wilcoxon signed-rank tests were used to analyse whether students’ attitudes differed significantly from the neutral reference median. The Mann–Whitney U test was used to assess differences between genders, while Kruskal–Wallis tests were used to examine differences between degree programmes, years of study, frequency and duration of AI tool use and students’ self-rated skill and confidence. Pairwise comparisons were performed using the Dwass–Steel–Critchlow–Fligner post hoc test. Two separate multiple linear regression analyses were conducted to identify predictors of the perceived usefulness of AI tools: the first model included frequency and duration of use, while the second included self-rated skill and confidence. All statistical tests were two-tailed, with a significance level of 0.05.
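For readers who wish to reproduce this workflow with open-source tools, a condensed sketch is given below, using SciPy and scikit-posthocs in place of SPSS/Jamovi; the data frame "df" and its column names are assumptions.

```python
from scipy import stats
import scikit_posthocs as sp

# "df" is an assumed data frame with subscale scores (e.g., "F1") and grouping variables

# Normality check (Kolmogorov-Smirnov against a fitted normal distribution)
stats.kstest(df["F1"], "norm", args=(df["F1"].mean(), df["F1"].std()))

# One-sample Wilcoxon signed-rank test against the neutral midpoint of 3
stats.wilcoxon(df["F1"] - 3)

# Mann-Whitney U test for gender differences
stats.mannwhitneyu(df.loc[df["gender"] == "male", "F1"],
                   df.loc[df["gender"] == "female", "F1"])

# Kruskal-Wallis test across frequency-of-use groups
stats.kruskal(*[g["F1"].values for _, g in df.groupby("use_frequency")])

# Dwass-Steel-Critchlow-Fligner pairwise post hoc comparisons
sp.posthoc_dscf(df, val_col="F1", group_col="use_frequency")
```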
2.4. Ethical Considerations
Ethical approval for this study was obtained from the Commission of the University of Primorska for Ethics in Human Subjects Research (approval number: 4264-16-3/2022). The Faculty of Health Sciences of the University of Primorska also granted permission for the dissemination of the survey via student emails. The survey was in accordance with the EU General Data Protection Regulation (GDPR) on data protection. Participation was completely voluntary and anonymous. Informed consent was obtained electronically before the survey began and an introductory statement outlining the purpose of the study and ethical considerations was provided at the beginning of the questionnaire. Only the lead investigator had access to the raw data. Participants also had the right to skip any questions they did not wish to answer and could withdraw from the study at any time without consequence.
3. Results
To investigate the extent and direction of students’ attitudes towards the use of AI in their academic learning, we conducted a Wilcoxon signed-rank test. The aim of the analysis was to determine whether the observed median of each factor deviated significantly from a neutral reference point (median = 3), which represents an ambivalent or undecided attitude towards AI (Table 2).
Median subscale scores ranged from 3.30 to 4.00, indicating generally moderate to positive perception of AI use among students, including lower levels of concern and greater perceptions of ethical responsibility and privacy awareness. The results of the Wilcoxon signed-rank test showed that all five factors, as well as the overall SAAIHE scale, differed significantly from the reference median in either a positive or negative direction. The value of Perceived usefulness and support of AI in learning (F1) was significantly higher than the reference value (W = 19,401.5, z = 7.84, p < 0.001), with a large effect size (r = 0.61). Similar results were observed for Integration of AI into the study routine (F2; W = 18,386.5, z = 6.44, p < 0.001, r = 0.50) and for Ethics and responsibility in the use of AI (F3; W = 24,495.0, z = 12.09, p < 0.001), which showed a very large effect size (r = 0.93). The results also showed significantly lower scores for Concerns and doubts about the use of AI (F4) and Privacy and data protection in the use of AI (F5), with W = 4937.5, z = −6.64, p < 0.001, r = 0.53 and W = 2719.0, z = −8.19, p < 0.001, r = 0.68, respectively. As these constructs were reverse coded, lower scores indicate a stronger rejection of negative statements and therefore reflect lower levels of concern.
Students’ ratings on the SAAIHE scale showed a statistically significant positive attitude towards the use of generative AI in higher education (W = 21,008, z = 7.49, p < 0.001, r = 0.57), indicating that, on average, students have a positive attitude towards the use of AI in health sciences education.
Two separate multiple linear regression analyses were conducted to examine the extent to which (1) frequency and duration of AI tool use and (2) students’ self-rated ability and confidence in using AI tools independently predicted perceived usefulness of generative AI in academic tasks (Table 3). The results showed that both frequency and duration of AI tool use significantly predicted perceived usefulness. Specifically, more frequent use was associated with higher perceived usefulness, while longer use also had a positive relationship with perceived usefulness. This model explained 27.8% of the variance in perceived usefulness (adjusted R² = 0.272, F(2, 222) = 42.78, p < 0.001). The second model focussed on the students’ self-rated skill and confidence in using AI tools. Both predictors contributed significantly to the model. Students with higher self-rated skill tended to perceive AI tools as more useful. Likewise, lower confidence in using AI tools was associated with lower perceived usefulness. This model explained 24.5% of the variance in perceived usefulness (adjusted R² = 0.238, F(2, 230) = 37.23, p < 0.001).
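As a minimal sketch of the two regression models described above (statsmodels OLS; predictor and outcome column names are hypothetical):

```python
import statsmodels.formula.api as smf

# Model 1: frequency and duration of AI tool use predicting perceived usefulness (F1)
model1 = smf.ols("F1 ~ use_frequency + use_duration", data=df).fit()

# Model 2: self-rated skill and confidence predicting perceived usefulness (F1)
model2 = smf.ols("F1 ~ self_rated_skill + confidence", data=df).fit()

print(model1.rsquared_adj, model2.rsquared_adj)   # reported values: 0.272 and 0.238
```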
A Mann–Whitney U test was conducted to examine gender differences in students’ perceptions of the use of generative AI (Table 4). The results showed that male students reported significantly higher scores for Perceived usefulness and support of AI in learning (F1) (Mdn = 3.70 for males, 3.40 for females; U = 3163.50, z = −2.41, p = 0.016). In contrast, female students reported significantly higher scores for Privacy and data protection in the use of AI (F5) (Mdn = 2.00 for females, 3.00 for males; U = 2766.00, z = −3.49, p < 0.001) and for the overall SAAIHE scale (Mdn = 3.26 for females, 3.44 for males; U = 3023.00, z = −2.76, p = 0.006). No statistically significant gender differences were found for Integration of AI into the study routine (F2) (Mdn = 3.43 for females, 3.71 for males; U = 3369.00, z = −1.89, p = 0.058), Ethics and responsibility in the use of AI (F3) (Mdn = 4.00 for females, 4.25 for males; U = 3839.50, z = −0.71, p = 0.479) or Concerns and doubts about the use of AI (F4) (Mdn = 2.50 for females, 2.75 for males; U = 3482.50, z = −1.61, p = 0.107).
Furthermore, a Kruskal–Wallis H test was conducted to investigate whether students’ perceptions of the different aspects of generative AI in education differ according to how often and how long they have used AI tools in their studies, as well as their self-rated skill level and confidence in using such tools independently (Table 5).
The results showed statistically significant differences between the groups with different frequencies of AI use in the following factors: Perceived usefulness and support of AI in learning (F1) (H(4) = 54.05, p < 0.001), Integration of AI into the study routine (F2) (H(4) = 83.03, p < 0.001), Concerns and doubts about the use of AI (F4) (H(4) = 10.17, p = 0.038) and overall attitude towards AI (SAAIHE) (H(4) = 66.49, p < 0.001). No significant differences were found for Ethics and responsibility in the use of AI (F3) or Privacy and data protection in the use of AI (F5) (p > 0.05). Post hoc comparisons using the Dwass–Steel–Critchlow–Fligner test showed that students who used the AI tools more frequently (daily or weekly) reported significantly more positive perceptions than those who rarely or never used them (p-values between <0.001 and 0.043).
With regard to the duration of use of AI tools, the results showed significant differences in students’ perceptions of the Perceived usefulness and support of AI in learning (F1) (H(3) = 18.01, p < 0.001), Integration of AI into the study routine (F2) (H(3) = 18.37, p < 0.001) and the SAAIHE scale (H(3) = 17.03, p < 0.001). In addition, the post hoc test showed that students who had been using AI tools for more than a year expressed significantly more positive perceptions in all three constructs than those who had been using AI for less than 3 months (p < 0.05). In particular, for Perceived usefulness and support of AI in learning (F1), students with more than 1 year of experience had significantly higher mean ranks than those with less than 3 months of experience (p = 0.044) and those with 3–6 months of experience (p < 0.001). For Integration of AI into the study routine (F2), the participants with the longest period of use also gave more positive ratings than the group with a period of use of <3 months (p < 0.001) and those who had used AI for 6–12 months (p = 0.049). On the SAAIHE scale, students with more than 1 year of AI use also scored significantly higher than those with less than 3 months of experience (p < 0.001). No significant differences were found for the other subscales (Ethics and responsibility in the use of AI (F3), Concerns and doubts about the use of AI (F4) and Privacy and data protection in the use of AI (F5)).
With regard to students’ self-rated skill in using AI tools, significant group differences were found in the Perceived usefulness and support of AI in learning (F1) (H(3) = 26.85, p < 0.001), the Integration of AI into the study routine (F2) (H(3) = 49.03, p < 0.001), Concerns and doubts about the use of AI (F4) (H(3) = 22.57, p < 0.001), Privacy and data protection in the use of AI (F5) (H(3) = 9.52, p = 0.023) and the Student Attitudes Toward Artificial Intelligence in Higher Education (SAAIHE) scale (H(3) = 42.11, p < 0.001). No significant difference was found for Ethics and responsibility in the use of AI (F3) (H(3) = 4.27, p = 0.234). Post hoc comparisons showed that students with lower self-rated skills reported significantly lower scores for Perceived usefulness and support of AI in learning (F1), Integration of AI into the study routine (F2) and overall positive attitude towards AI (SAAIHE) than more advanced users (p-values between <0.001 and 0.004). Students with lower self-rated skills also expressed significantly more Concerns and doubts about the use of AI (F4) than intermediate and advanced users (p = 0.009 to 0.032) and showed a lower perception of Privacy and data protection when using AI (F5) compared to experienced users (p = 0.038).
Significant differences were observed for all perception factors and the SAAIHE scale based on students’ confidence in the autonomous use of AI tools for academic purposes (all p < 0.05). Post hoc comparisons showed that students who reported being “very confident” or “fairly confident” scored significantly higher in Perceived usefulness and support of AI in learning (F1) and Integration of AI into the study routine (F2) than those who were “somewhat uncertain” or “not at all confident” (p-values < 0.001). The most pronounced differences were observed between the “fairly confident” and “not at all confident” groups, particularly for the SAAIHE scale (p < 0.001), Perceived usefulness and support of AI in learning (F1) (p < 0.001) and the Integration of AI into the study routine (F2) (p < 0.001). In addition, significantly fewer concerns and doubts about the use of AI (F4) were reported by students with higher levels of confidence (e.g., “neutral” vs. “not at all confident”, p = 0.009), suggesting that greater confidence is associated with lower concerns about the use of AI.
A Kruskal–Wallis test also showed a significant difference in Ethics and responsibility in the use of AI (F3) across years of study, H(2) = 8.32, p = 0.016. The mean rank was 109.87 for first-year students, 121.73 for second-year students and 90.75 for third-year students. The post hoc test revealed that second-year students (mean rank = 121.73) had significantly higher ranks than third-year students (mean rank = 90.75), p = 0.016. No other pairwise comparisons were statistically significant. For all other factors and the SAAIHE scale, there were no significant differences in students’ perceptions between years of study.
4. Discussion
This study explored the perceptions and challenges of health sciences students regarding the use of AI in higher education, focussing on the responsible and sustainable integration of new technologies into academic practice, guided by a five-factor model. With the increasing use of AI technologies in healthcare and academia, it is crucial to understand students’ attitudes, usage behaviours and ethical considerations. The SAAIHE scale revealed five key factors: perceived usefulness and support, integration into everyday study, ethics and responsibility, concerns and doubts and privacy and data protection.
Overall, the results indicate a moderately to strongly positive perception of generative AI. Students saw AI as a valuable tool for education, particularly in terms of increasing efficiency, improving academic writing and providing rapid feedback. This perception was reflected in significantly higher median scores for the factor perceived usefulness and support. These findings are consistent with those of Chan [11], who demonstrated that AI-enhanced writing facilitates deeper engagement and critical reflection in nursing education, as well as those of Zhou, Zhang, and Chan [2], who reported that AI helps students manage time and improve productivity. This is also consistent with the recent study by Duran et al. [21], which shows that students in different educational systems are optimistic about the time-saving benefits of AI despite ongoing ethical concerns. Our regression analysis also confirmed that frequency and duration of AI use, as well as self-rated ability and confidence, significantly predicted perceived usefulness. Recent international studies, such as Yusuf et al. [17], reaffirm the importance of structured institutional policies to support meaningful integration of AI in education.
The second factor, integration into the study routine, was also rated positively, especially by students who had habitually used AI for tasks such as summarising, brainstorming and proofreading. These findings are consistent with previous studies showing that routine integration of AI promotes independent learning and deepens understanding of content [2,4]. However, students who reported lower levels of integration often cited a lack of formal instruction and institutional guidance, mirroring Summers et al. [15], who identified digital readiness and infrastructure as barriers to equitable AI adoption.
Ethics and responsibility emerged as a highly rated factor with the strongest effect size among all factors. Students recognised the importance of ethical use of AI. Many expressed concerns about academic integrity and the legitimacy of using AI in examinations. These concerns reflect broader debates in the literature. Chan and Hu [36] and Sullivan, Hall, and Morrison [34] pointed out that uncritical use of AI can undermine critical thinking and jeopardise pedagogical integrity. The call for formal training in the ethical use of AI is well supported by previous recommendations to integrate AI skills and digital ethics into nursing curricula [39,44]. Furthermore, Fine Licht [25] warns that simplistic ethical concepts can obscure the pedagogical complexity of integrating generative AI and calls for more nuanced training approaches, a realisation reflected in our students’ feedback.
The fourth factor, concerns and doubts about AI, focussed on students’ scepticism about the accuracy, reliability and transparency of AI-generated content. This finding is consistent with the findings of Zhou, Zhang, and Chan [2], who found that students often doubt the veracity of AI results and fear being misled by “hallucinated” information. Crompton and Burke [37] are similarly cautious, urging educators to promote critical engagement and fact checking, especially in health-related fields where misinformation can have serious consequences. Similarly, Lünich et al. [24] found that students’ AI risk perceptions are shaped not only by technical knowledge but also by perceived autonomy and institutional trust.
Although the privacy and data protection factor had the lowest score, it still showed a significant rejection of negative perceptions, indicating trust mixed with caution. Students were concerned about the misuse of their personal and academic data, especially when using freely available AI tools. These concerns were also expressed by Summers et al. [15] and Idroes et al. [30], who emphasise the need for transparent data governance in educational technologies. Institutional responsibility for establishing clear guidelines for data use is crucial, especially in health disciplines where confidentiality is fundamental.
This study also found that male students perceived AI as more useful and expressed fewer concerns about its ethical implications and privacy risks than their female peers. This pattern is consistent with broader research findings showing gender differences in digital confidence and technology use, particularly in STEM and health education. Zhou, Zhang, and Chan [2] observed similar results in their mixed-methods study, finding that male students often reported higher levels of confidence and familiarity with generative AI tools and were more likely to experiment with advanced features. These differences could be due to underlying differences in self-efficacy, digital experience or perceived control over the technology, all of which influence attitudes towards new educational tools [15,45].
In addition, our results showed that students with higher self-rated AI skills and greater confidence perceived AI as significantly more useful and expressed less doubt about its integration into academic work. This is consistent with previous findings that technological familiarity reduces anxiety and increases perceived benefits, especially when learners feel they can use the tools independently [46]. Familiarity has also been associated with higher engagement in self-directed learning, where learners use AI effectively for personalised support and timely feedback [47]. In this context, the findings emphasise the importance of promoting digital literacy and providing hands-on experiences with AI, particularly for students who may initially feel less confident in using these tools.
One of the main strengths of this study lies in the comprehensive validation and application of the SAAIHE scale. The psychometric properties of the scale were robust, with high internal consistency and a five-factor structure confirmed by exploratory and confirmatory factor analyses. The sample size was sufficiently large and diverse and included students from different health-related degree programmes and different years of study, which increases the generalisability of the results in similar educational contexts. Another strength is the integration of both descriptive and inferential analyses, including gender comparisons, regression modelling and non-parametric tests. This multi-faceted approach enabled a deeper exploration of how experience, confidence and demographic factors influence students’ attitudes towards AI use. However, beyond its methodological strengths, this study also offers valuable insights into how generative AI can be meaningfully and sustainably integrated into health science education to promote both academic success and ethical development.
Given the dynamic and rapidly evolving nature of generative AI tools, it is important that future research goes beyond static assessments. Longitudinal follow-up studies could provide a more nuanced understanding of how students’ perceptions, competencies and usage behaviours evolve over time, particularly as institutional policies and AI capabilities change. In addition, mixed methods that combine survey data with qualitative interviews or focus groups would allow for deeper exploration of the contextual and affective dimensions of AI adoption. Such designs can capture changing attitudes and uncover the educational, ethical and social complexities of AI integration that are not fully accessible through cross-sectional self-report data alone.
Importantly, these findings contribute to international conversations about educational theory and practice by providing insights from a Central European context that is underrepresented in AI and health education research. Our emphasis on the interplay between ethical literacy, gendered digital confidence and sustainability further enhances the educational relevance of our study.
In line with the principles of sustainable education and the broader Sustainable Development Goals, in particular SDG 4 (Quality Education), SDG 5 (Gender Equality) and SDG 10 (Reduced Inequalities), the responsible use of AI must prioritise equity, inclusivity and ethical awareness. The findings of this study emphasise the need for targeted pedagogical strategies that bridge gender gaps in digital confidence and promote critical AI skills among all students. Institutions play a key role in ensuring that AI tools are not only accessible but also used in ways that support ethical thinking, academic integrity and diverse learning needs. By embedding ethical and equitable AI practices in the curriculum, higher education institutions can foster resilient and inclusive learning ecosystems that respond to technological change and reflect global development priorities.
4.1. Study Limitations
However, the study is not without limitations. Although the cross-sectional design captures perceptions at a specific point in time, it limits causal conclusions and prevents the observation of changes in attitudes over time [33]. The use of self-report leads to potential biases, including social desirability and over- or underestimation of AI skills. In particular, the possibility of social desirability bias may have led some respondents to report a more positive attitude or greater competence than they actually possessed, especially in a context where the use of AI is increasingly valorised. Furthermore, students may display digital overconfidence and overestimate their skills due to their superficial familiarity with generative AI tools, which could distort perceived skill levels. The non-random convenience sampling strategy could also lead to selection bias, especially if students who are more digitally literate were more likely to participate. In addition, the study included only students from a single university. Although the healthcare disciplines represented are diverse, the results may not fully reflect the perspectives of other cultural or institutional settings where the adoption of AI is at different stages. A limitation of this study is also the use of the same sample for both the exploratory and confirmatory factor analyses. Although a pilot EFA was conducted on a subsample, future studies should validate the structure using a separate, independent sample. Finally, the rapid development of generative AI means that students’ experiences and perceptions may change within a relatively short period of time.
To mitigate these limitations in future research, we recommend the inclusion of triangulated data sources, such as behavioural or observational measures (e.g., usage tracking, scenario-based assessments), to supplement self-reported responses. Longitudinal studies or experimental designs may also help to better capture how students’ attitudes and behaviours evolve over time and in response to specific interventions.
4.2. Implications
The results of this study have several practical implications for curriculum development, institutional policy and faculty training in health science education. First, the strong correlation between familiarity with AI, skills and perceived usefulness suggests that early exposure to AI and structured AI skills training are essential. Institutions should consider integrating AI skill-building modules into degree programmes to help students develop confidence and competence in the critical and ethical use of these tools. Second, the data emphasise the need for formal guidance on the ethical use of AI, particularly with regard to academic integrity, data privacy and responsible sourcing. Health sciences faculty and educators should work together to develop clear guidelines and teaching guides that clarify appropriate and inappropriate uses of AI in assessments, group work and clinical simulations.
As male students and students with greater confidence perceived less risk, educators need to utilise inclusive teaching strategies that support all students, especially those who feel less confident or digitally prepared. Targeted workshops, peer mentoring and supportive AI assignments can help close gaps in digital confidence and promote equity. Given the expressed caution around misinformation and data privacy, institutions should promote critical digital literacy and enable students to recognise AI-generated inaccuracies and understand the implications of data sharing when using public platforms. Working with IT and legal departments to provide briefings or tutorials on the data practices of AI tools can improve transparency and trust.