A Systematic Review of Generative AI for Teaching and Learning Practice

The use of generative artificial intelligence (GenAI) in academia is a subjective and hotly debated topic. Currently, there are no agreed guidelines towards the usage of GenAI systems in higher education (HE) and, thus, it is still unclear how to make effective use of the technology for teaching and learning practice. This paper provides an overview of the current state of research on GenAI for teaching and learning in HE. To this end, this study conducted a systematic review of relevant studies indexed by Scopus, using the preferred reporting items for systematic reviews and meta-analyses (PRISMA) guidelines. The search criteria revealed a total of 625 research papers, of which 355 met the final inclusion criteria. The findings from the review showed the current state and the future trends in documents, citations, document sources/authors, keywords, and co-authorship. The research gaps identified suggest that while some authors have looked at understanding the detection of AI-generated text, it may be beneficial to understand how GenAI can be incorporated into supporting the educational curriculum for assessments, teaching, and learning delivery. Furthermore, there is a need for additional interdisciplinary, multidimensional studies in HE through collaboration. This will strengthen the awareness and understanding of students, tutors, and other stakeholders, which will be instrumental in formulating guidelines, frameworks, and policies for GenAI usage.


Introduction
Generative artificial intelligence (GenAI) tools have taken the world by storm, most notably ChatGPT and, more recently, Gemini [1]. This advancement in technology has raised concerns in various sectors, specifically on the assumption that the technology will replace people's jobs. One sector perceived to be particularly affected is higher education [2]. The higher education (HE) sector contributes to every nation's economy, offering a wealth of benefits to society. HE plays a key role in enhancing social mobility, bolstering social capital, fostering political stability, reducing crime rates, promoting social unity, spurring innovation, and cultivating trust and tolerance [3]. The development of such technology can be traced back to the advent of large language models (LLMs) in 2018, when BERT was released [4]. Since then, several LLMs have been released, including GPT [5]. GenAI tools rely on these LLMs to perform the tasks they are developed for. For example, ChatGPT relies on the GPT series to perform its tasks. LLMs have a large number of parameters and are trained on large volumes of data, including text and images [5][6][7]. By processing a huge amount of data, LLMs learn the statistical relationships, patterns, and structure within datasets, which enables them to predict or generate relevant and meaningful content in response to user requests. Thus, they are capable of performing various complex tasks [6][7][8][9].

Although concerns including hallucinations [10], bias [11], ethical and privacy concerns [12,13], accidental plagiarism [14], and academic integrity [15][16][17][18][19][20] have been raised regarding GenAI tools, such tools have also been praised for their potential benefits in relation to HE. For example, Daun et al. [2] demonstrated the use of GenAI for teaching and learning within the context of software engineering education. They showed that GenAI tools like ChatGPT can be used to find literature, answer student questions, support code implementation, and generate exercises. Kurtz et al. [21] synthesised the literature and found that GenAI offers opportunities to enhance students' learning experiences by facilitating learning environments tailored to students' educational needs. Their findings further suggest that the potential use of GenAI for student performance prediction offers an opportunity for early interventions, potentially reducing student churn and dropout rates. Atlas [22] reported the current applications of GenAI in HE as automated essay scoring; personalised tutoring; research assistance; language translation; helping professors to create their syllabus, quizzes, and exams; generating reports; and email and chatbot assistance. Pesovski et al. [23] added that GenAI provides an opportunity for affordable and sustainable personalised learning. Other notable benefits of GenAI in HE are creative writing and brainstorming [24], support for personalised tutoring [13], support for programming code development [25], essay grading [26], and the design of science units and rubrics [27].

In terms of user perceptions, Rajabi et al. [28] investigated both students' and teachers' perspectives on the integration of GenAI for teaching and learning within a post-secondary school environment using a qualitative dataset. The participants recognised the tool as an advanced search engine and emphasised that students are likely to use GenAI tools, irrespective of whether the tools are incorporated into HE courses or not. Their results showed that students and teachers have mixed perceptions about ChatGPT's usage in a post-secondary school setting. The findings by Lozano and Blanco-Fontao [29] showed that students have a positive perception of the utilisation of GenAI tools in HE. Most importantly, their findings suggest that ChatGPT is not perceived as a threat that would cause the deterioration of the educational system. Moreover, Sánchez-Ruiz et al. [30] surveyed 110 students to gather students' perspectives on the impact and usage of ChatGPT. Their results showed that students have a positive opinion of ChatGPT. However, concerns remain about students gaining problem-solving and creative skills.
Given the challenges and promise that GenAI offers to the HE sector, it is worth highlighting and synthesising the literature to understand the potential use, impact, and ethical issues posed by AI tools in the context of teaching and learning. To this end, this paper aims to conduct a systematic literature review on the use of GenAI tools in HE, provide an overview of the current state of research on GenAI for teaching and learning, and offer insights into future research directions. To do so, this study formulates two research questions (RQs), as follows:

• RQ1. What is the evolutionary productivity in the field in terms of the most influential journals, most cited articles, and authors, including geographical distribution of authorship?
• RQ2. What are the main trends and core themes emerging from the extant literature?
To the best of the researchers' knowledge, only a limited number of studies have conducted a systematic review of the literature on GenAI in education. The authors in [31] conducted a tertiary systematic review of AI tools in education. Sullivan et al. [32] used a systematic search to review English language newspapers and online news sources about how ChatGPT is disrupting HE across selected countries. Furthermore, Bahroun et al. [33] adopted the preferred reporting items for systematic reviews and meta-analyses (PRISMA) framework for the selection of literature on GenAI use in education. These authors reviewed a total of 207 research papers using bibliometric and content analysis to explore GenAI's transformative impact in specific educational domains, including medical education and engineering education. However, this study differs as it centres on a systematic review specifically on the use of GenAI for teaching and learning practice in HE. In addition, this paper adopts a topic modelling (TM) approach to distil information from the literature and report core themes. Relevant analysis of the collected data is performed, and the main contributions of this paper are as follows:

• This review provides a comprehensive overview of the current state of research on GenAI for teaching and learning in HE, which helps researchers to identify the evolutionary progression (most influential journals, articles, authors, including geographical distribution of authorship), prevailing topics, and research directions within the field;
• This review synthesises the findings to generate insights into a holistic perspective on the potential, effectiveness, and limitations of GenAI use for teaching and learning in HE;
• This review identifies research gaps that require further investigation, guiding future research work.

Methodology
A systematic literature review on the use of GenAI in HE was conducted by adopting the PRISMA [34] guidelines. Scholarly articles (conference proceedings and journal papers) published over the last 7 years were reviewed and analysed. As shown in Figure 1, the study made use of key search terms related to GenAI, teaching and learning, and HE. The keywords, as shown in Table 1, were used to extract metadata from the relevant research papers (documents) in the Scopus database. The Scopus database was used due to its document volume, reliability, the accuracy of the information, and its advantage of using rigorous original metadata to associate people, published theories, and institutions [33,35]. The following subsections discuss the steps taken to achieve the data collection and analysis.

The search strategy adopted keywords such as "generative artificial intelligence", "assessment", "higher education", "HE", and "teaching and learning". Using the search expression in Table 1, the initial results generated 625 papers from the Scopus database. Next, the research papers were narrowed down to a period of 7 years (2017 to 2023). A period of 7 years was chosen because the concept of GenAI is relatively new. Although it has been argued that the wider the year range, the better the information that can be obtained from document convergence [36], our review of the results showed that studies before 2017 were not relevant to the context of the study; thus, 2017 was selected as the starting point. Two of the authors independently reviewed the results against the inclusion and exclusion criteria and, in the case of disagreement, a third author helped to decide whether the paper met the criteria for inclusion. Using the inclusion and exclusion criteria presented in Table 2, a total of 355 papers were selected for analysis.

To ensure the study's reliability, Cohen's kappa inter-rater reliability assessment was conducted. The authors assessed the quality of the studies independently to ensure that the extracted papers were relevant. This was done by two researchers (BO and KZ). This process started by randomly selecting 20 documents from the data used for the analysis. These documents were then screened using the titles and abstracts. Having agreed on the criteria for inclusion or exclusion based on the titles and abstracts, the coding decisions of the two researchers (rater BO and rater KZ) were presented and assessed to determine the inter-rater reliability using Cohen's kappa (κ) value. In cases of disagreement, a third author helped to decide the outcome. Cohen's kappa coefficient quantifies the degree of consistency among the raters, that is, the extent to which their coding decisions agree, based on the number of codes in the coding scheme and the value obtained [31]. For example, a kappa value of 0.40 to 0.60 is fair, 0.61 to 0.75 is good, while a value above 0.75 is excellent. After the necessary computation, a kappa value of 0.659 was obtained. That is, the inter-rater reliability value is good for the coding of the inclusion and exclusion criteria, and there is consistency among the documents used for the analysis. This helped minimise the risk of bias and improved the data quality.
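The inter-rater check described above can be illustrated with a minimal sketch of Cohen's kappa for two raters. The include/exclude decisions below are hypothetical and are not the study's data; the function name `cohens_kappa` is likewise an assumption introduced for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical screening decisions for 20 documents (illustration only).
rater_1 = ["include"] * 12 + ["exclude"] * 8
rater_2 = ["include"] * 10 + ["exclude"] * 8 + ["include"] * 2
kappa = cohens_kappa(rater_1, rater_2)
print(round(kappa, 3))
```

With these made-up decisions the raters agree on 16 of 20 documents; kappa is lower than the raw 80% agreement because it discounts the agreement expected by chance.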

Bibliometric Approach
This study employed bibliometric indicators, such as the number of publications, number of citations, top cited documents (sources and authors), co-authorship (geographical distribution), and term co-occurrence (keywords, title, and abstract). This paper used suitable software, such as Python, Power BI, and VOSviewer, to present the findings, where appropriate.

Topic Modelling Approach
This study used a TM technique to distil the current state of research on GenAI in HE. TM is becoming popular for literature review analysis [37][38][39][40] because the approach provides an automated and efficient way of uncovering hidden themes [6]. In this study, the researchers fitted a latent Dirichlet allocation (LDA) model to a well-refined corpus to uncover latent themes from the research documents (abstracts). LDA, proposed by Blei et al. [41], is a generative probabilistic model for topic extraction. The topic model captures the important intra-word/document statistical structure via a mixing distribution. LDA assumes words in each document are related and, thus, topic assignment strongly relies on local co-occurrence. Documents are represented as probability distributions over latent topics, while topics are represented as probability distributions over words.

For the evaluation of our LDA topics, this paper employed coherence scores, perplexity, and human interpretation. The coherence score indicates topic quality by comparing the semantic similarity between words in a topic. It measures how well the topics align with human judgment and is closely correlated with human interpretation [6,42,43]. The higher the coherence score, the better the model: a high score indicates that the grouped words are sensible, relevant, and consistent, whilst a low score means the topic is vague, noisy, or irrelevant [42]. This paper produced both a coherence score (cv) and a coherence score (Umass) to strengthen the model evaluation process. The coherence score (cv) calculates the probability of co-occurrence for word pairs generated using a sliding window and, thus, measures the mean cosine similarity between the word's feature vector and the topic's feature vector [43]. The coherence score (Umass), in contrast, measures the word pair relationship based on document co-occurrence; for each of the K topics, words are ordered (in descending order) based on the probability of a word for a given topic [43,44]. The perplexity indicates how well the model describes a document and is computed from the inverse likelihood of unseen data (the exponentiated negative average log-likelihood). The lower the perplexity, the better the model. These metrics are considered appropriate for the performance evaluation of topic models [6].
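To make the two document-level metrics concrete, the following is a minimal pure-Python sketch of the UMass coherence (summed over ordered word pairs, with add-one smoothing) and of perplexity. It is an illustration under stated assumptions, not the pipeline used in this study: the toy abstracts, the function names, and the choice to sum rather than average over word pairs are all assumptions, and library implementations may normalise differently.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence of one topic from document co-occurrence counts.

    top_words must be ordered from most to least probable for the topic.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(all(w in doc for w in words) for doc in doc_sets)

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            denom = doc_freq(top_words[l])
            if denom == 0:
                raise ValueError(f"{top_words[l]!r} does not occur in the corpus")
            # +1 smoothing avoids log(0) when two words never co-occur.
            score += math.log((doc_freq(top_words[m], top_words[l]) + 1) / denom)
    return score


def perplexity(total_log_likelihood, n_tokens):
    """Perplexity = exp(-average per-token log-likelihood) of held-out text."""
    return math.exp(-total_log_likelihood / n_tokens)


# Toy tokenised abstracts (hypothetical, for illustration only).
docs = [
    ["chatgpt", "assessment", "integrity"],
    ["chatgpt", "assessment", "feedback"],
    ["image", "network", "detection"],
    ["chatgpt", "integrity", "ethics"],
]
coherent = umass_coherence(["chatgpt", "assessment", "integrity"], docs)
mixed = umass_coherence(["chatgpt", "image", "network"], docs)
print(coherent > mixed)
```

In this toy corpus the topic whose words appear together in the same abstracts scores higher than the topic that mixes unrelated words, which is the behaviour the coherence score is meant to capture.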

Results and Discussion
The results of the analysis are presented in two subsections. The first (Section 3.1) provides insights into the bibliometric indicators, whilst the second (Section 3.2) presents the results from the TM.

Bibliometric Analysis Results
This section provides the results from the bibliometric analysis, as follows.

Documents by Publication Type
Figure 2 shows that 72% of the extracted documents are journal articles and 28% are conference proceedings. This implies that research on AI-related fields is evolving, and additional studies are needed.

Publications per Year
From 2017 to 2023, there was a gradual increase in the number of papers published as both journal articles and conference papers, with a significant jump in 2023, as shown in Figure 3. The year-on-year increase shows a growing interest in the area. The significant jump in the number of publications from 38 to 273 in 2023 can be explained by the launch of ChatGPT. In November 2022, OpenAI (the creator of the GPT series of LLMs) released ChatGPT to the public and, within two months of its release, it was estimated to have reached 100 million monthly active users [45]. This rapid adoption led academics to explore its impact on various aspects of teaching and learning.

[Figure 2. Documents by publication type: journal articles 72%, conference proceedings 28%.]

It is interesting to note that there is currently an equal split between education- and technology-focused journals in the ranked list. We posit that the adoption and impact of GenAI technologies in education for teaching, learning, and assessments will continue to grow.

Citations per First Authors
The analysis identified Yogesh K. Dwivedi as the first author with the highest number of citations at 291, followed by Jurgen Rudolph with 234 citations, as shown in Figure 5. As most of these citations have only been acquired since 2023, this shows that GenAI in education is a trending topic amongst education researchers, technology experts, and industry practitioners.

Publications/Citations per Year
The total number of documents used in this study is 355, with a total of 2923 citations, as shown in Table 3. The chart in Figure 6 shows the distribution of these documents and the corresponding citations per year. From the chart, we observed that 2023 accounts for the highest number of documents at 273 (76.9% of all documents). These 273 documents have 2078 citations (71% of the total citations). Between 2018 and 2021, there was a steady progression of documents with citations received. However, in 2022, though the number of documents produced increased, the increase in the total number of citations was not commensurate. Nevertheless, the year 2023 ushered in an extremely large number of documents with corresponding citations. This is not unusual, considering that GenAI, including ChatGPT, became popular in early 2023. Out of the 273 documents in the year 2023, Table 4 further depicts the 10 publications with the highest number of citations.

This study produced a co-authorship visualisation map to understand co-authorship by countries. Figure 7 shows the geographical distribution according to co-authorship. For the analysis, the co-authorship country of origin was taken into consideration. Furthermore, a country was considered only if it had at least three papers, which resulted in 35 countries being included in the analysis. As shown in Figure 7, the analysis evidenced that the most relevant countries in terms of authorship relationship, based on the number of papers (n) or citations, are the United States (n = 124, citations = 1598, total link strength = 88), Australia (n = 38, citations = 767, total link strength = 34), China (n = 36, citations = 249, total link strength = 32), and the United Kingdom (n = 32, citations = 462, total link strength = 28). The dominance of the United States in terms of the number of publications in this field is not surprising, as this is consistent with the findings of previous studies [55][56][57][58][59][60]. Overall, the volume of documents produced in the area of GenAI in HE sectors needs improvement, especially in the global south.
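The country-level co-authorship counts reported above (number of papers, a minimum-paper threshold, and total link strength) can be sketched as follows. This is only an illustration under assumptions: the paper records are invented, the function name is hypothetical, and full counting is assumed (each co-authored paper adds 1 to the link between two countries), which may differ from the counting and normalisation options actually selected in VOSviewer.

```python
from collections import Counter
from itertools import combinations


def coauthorship_by_country(papers, min_papers=3):
    """Return {country: (paper count, total link strength)} for kept countries."""
    # Count each country once per paper, however many authors it contributes.
    paper_counts = Counter()
    for countries in papers:
        paper_counts.update(set(countries))
    kept = {c for c, n in paper_counts.items() if n >= min_papers}

    # A link between two kept countries gains 1 for every paper they share.
    link_strength = Counter()
    for countries in papers:
        present = sorted(set(countries) & kept)
        for a, b in combinations(present, 2):
            link_strength[(a, b)] += 1

    # Total link strength of a country = sum of the strengths of its links.
    total = Counter()
    for (a, b), weight in link_strength.items():
        total[a] += weight
        total[b] += weight
    return {c: (paper_counts[c], total[c]) for c in sorted(kept)}


# Hypothetical author-affiliation countries per paper (illustration only).
papers = [
    ["United States", "Australia"],
    ["United States"],
    ["United States", "China"],
    ["Australia", "China", "United States"],
    ["Australia"],
    ["China"],
    ["Nigeria", "United States"],
]
summary = coauthorship_by_country(papers, min_papers=3)
print(summary)
```

In this toy input, the country with a single paper falls below the threshold and is dropped, mirroring the rule that a country is considered only if it has at least three papers.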

Co-Occurrence
Figure 8 shows an overlay visualisation network map for the co-occurring keywords in the different years of publication. Firstly, this study used authors' keywords to investigate co-occurrence. The analysis produced five clusters and, as shown in Figure 8, artificial intelligence (n = 141) and ChatGPT (n = 126) are the highest co-occurring keywords. Specifically, in 2022, notable co-occurring keywords are grouped into clusters 2, 3, and 5, namely generative adversarial network, AI, chatbot, deep learning, machine learning, GPT-3, natural language processing, and language model. These authors' keywords indicate publications related to the application of AI/machine learning to education. In 2023 (in yellow), the co-occurring keywords (grouped into clusters 1 and 4) are academic integrity, assessment, chatbots, ChatGPT, education, ethics, generative pre-trained transformer, GPT-4, higher education, medical education, OpenAI, and prompt engineering. This is not surprising given the emergence of ChatGPT in November 2022, after which this area attracted a lot of attention in terms of research and debate, especially in the medical education domain.
To gain an in-depth understanding of keyword co-occurrence, this study performed title and abstract keyword co-occurrence analysis, as shown in Figure 9. The co-occurrence analysis (title and abstract) results yielded six clusters. The keywords grouped into clusters 3 (collaboration) and 6 (English, LLMs) are not informative. Cluster 2 consists of keywords such as algorithm, CNN, machine learning, and NLP, which suggest papers on algorithms and the underlying technologies that are being discussed extensively for the development of GenAI for intelligent educational technologies. This is crucial for integrating GenAI with virtual reality [61]. Cluster 4 is made up of keywords such as accuracy, answer, examination, incorrect answer, medical student, medical education, performance, and prompt, which are indicative of research conducted to examine the performance accuracy of GenAI systems on assessment/examination papers [62,63]. Cluster 5 consists of keywords such as experience, response, source, survey, and participants, which indicate survey research conducted so far that could be used to work towards an understanding of the perspectives of students/teachers on the use of GenAI [64]. Most notable is Cluster 1, which consists of keywords such as academia, academic integrity, critical thinking, and ethical consideration, which indicate publications related to the usage/implications of GenAI systems, linked to developing a greater understanding around academic integrity, the development of assessments, and better pedagogical practices in response to these emerging tools [10]. Furthermore, we deduce from Figure 9 that there is a movement in the themes from 2022 to 2023 (in yellow) towards incorrect answers, examination, medical students, medical education, LLMs, potential impact, higher education, and role. This indicates topic trends and, therefore, emerging research themes.
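The year-based reading of the overlay maps can be sketched as a small computation: for each keyword, count its occurrences and average the publication years of the documents that use it. This is a hedged illustration that assumes the overlay colour reflects the mean publication year per keyword; the records and the function name `keyword_overlay` are hypothetical.

```python
from collections import defaultdict


def keyword_overlay(records):
    """Return {keyword: (occurrences, mean publication year)}."""
    years = defaultdict(list)
    for year, keywords in records:
        # Normalise case and count each keyword once per document.
        for keyword in {k.strip().lower() for k in keywords}:
            years[keyword].append(year)
    return {k: (len(v), sum(v) / len(v)) for k, v in years.items()}


# Hypothetical (year, author keywords) records, for illustration only.
records = [
    (2022, ["Machine learning", "chatbot"]),
    (2022, ["machine learning", "GPT-3"]),
    (2023, ["ChatGPT", "academic integrity"]),
    (2023, ["ChatGPT", "chatbot"]),
]
overlay = keyword_overlay(records)
for keyword, (count, mean_year) in sorted(overlay.items()):
    print(keyword, count, mean_year)
```

A keyword whose mean year sits closer to 2023 would appear towards the yellow end of such a map, which is how a shift in themes between years can be read off.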
In summary, we used bibliometric indicators to quantitatively assess the publication patterns, research progress, and impact of academic literature. Using VOSviewer, we performed co-occurrence analysis, which was achieved using clustering. Clustering is the grouping of objects according to their similar attributes [65]. The authors of [66] developed VOSviewer in 2010 [67] and demonstrated its use to perform co-occurrence analysis of research publications using clustering. They adopted citation relationships as the similarity attribute to perform the cluster analysis. However, they acknowledged that the approach is limited when the period of analysis is relatively short. This is important to note for two reasons: first, clustering analysis groups themes that are similar and does not explicitly uncover latent themes; second, GenAI use in the HE context is relatively new. Thus, our study extends the co-occurrence analysis with TM to uncover latent themes. This agrees with the study by D'ascenzo et al. [68]. In the subsequent sections, we apply the TM technique, namely LDA, to complement the bibliometric analysis results by distilling the current and future research pathways on GenAI for teaching and learning practice.

Topic Modelling Results
This section presents the LDA results. Beforehand, the researchers identified the optimal number of topics using a coherence score. To achieve this, a plot of the coherence scores (cv) against the number of topics (5-100) was produced, as shown in Figure A1 (in Appendix A). The plot shows that the model achieved optimality with a coherence score of 0.36 for 10 topics. Furthermore, we evaluated the model using coherence scores (Umass) and perplexity scores, which are reported in Table A1 (in Appendix A). Lastly, we used human interpretation, and researchers ascertained that the model performed best when the number of topics was set to 10. Therefore, Table 5 presents the topics into which the research documents were classified. Each of the topics contains 15 terms, which helped in profiling the 10 topics identified. Based on the terms and human interpretations, the topics were labelled. Furthermore, we produced a frequency plot, as shown in Figure 10, to understand the distribution of the research papers across the topics.
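Selecting the number of topics from a coherence plot amounts to taking the candidate with the highest score. The sketch below illustrates that step only; the score values are hypothetical apart from the 0.36 at 10 topics reported in the text, and the tie-breaking rule (prefer fewer topics) is an assumption rather than the study's procedure.

```python
def best_topic_count(coherence_by_k):
    """Return the number of topics with the highest coherence score."""
    if not coherence_by_k:
        raise ValueError("no coherence scores supplied")
    # Highest score wins; ties are broken in favour of the smaller model.
    return min(coherence_by_k, key=lambda k: (-coherence_by_k[k], k))


# Hypothetical cv scores; only the 0.36 at 10 topics comes from the text.
scores = {5: 0.31, 10: 0.36, 15: 0.34, 20: 0.33, 25: 0.30}
chosen = best_topic_count(scores)
print(chosen)
```

In practice the numeric optimum is a starting point: as described above, the chosen value was also checked against Umass coherence, perplexity, and human interpretation.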
[Table 5 fragment, topic terms: image, network, method, application, study, generation, performance, data, result, accuracy, detection, approach, text, field, generate.]

• Topic 1: Implications of GenAI
The research documents grouped into topic 1 discussed several use cases of GenAI tools. For example, these research papers investigated the accuracy of content generated by GenAI, specifically for essay writing in health and computing disciplines. Furthermore, these papers investigated the effects of utilising GenAI in creating digital artifacts on students' understanding of AI literacy and their perception of social and ethical compliance. More specifically, a few papers emphasised the ethical implications, such as cheating. This topic made up 6.48% of the entire number of documents retrieved.

• Topic 2: GenAI for education and research
The research papers grouped into topic 2 discussed the applications of GenAI to education and research in HE. These papers were specific to disciplines such as nursing, clinical science, ophthalmology, and radiation oncology. A few studies used GenAI systems to provide academic reviews of scientific papers. In addition, a few studies discussed policies and regulations on the adoption of AI tools. Specifically, we observed a study that proposed an AI ecological education policy framework for integrating GenAI tools into HE. Furthermore, these papers highlighted the potential benefits and drawbacks of integrating technology into education, providing insight into both the opportunities and challenges it presents. This topic made up 11.27% of the entire number of documents retrieved.

• Topic 3: Support system (60 research papers)
Topic 3 made up the highest proportion (16.91%) of research documents. The research papers in this group focused on the use of GenAI tools as a support system. For example, GenAI serves as a support system for education delivery, specifically for collaborative learning, exercise generation (to form a question bank), contract drafting, and administrative support. Other examples include the use of GenAI as a customer service (chatbot) system to support admissions to university. Furthermore, a few studies examined how GenAI systems may affect assessments across several disciplines, including medical and engineering education.

• Topic 4: Bias and inclusion (26 research papers)
The research documents categorised into this topic discussed how GenAI tools can be incorporated into education curricula, including teaching with GenAI across all levels of education. In addition, it was observed that the literature investigated teachers' and students' perspectives on inclusion. Furthermore, a few studies highlighted some setbacks related to GenAI tools. More specifically, bias and gender inequality were key themes discussed as a result of responses generated by GenAI tools. This topic made up 7.32% of the entire number of documents retrieved.

• Topic 5: Intelligent tutoring system (42 research papers)
Topic 5 research documents discussed the use of GenAI tools specific to teaching and learning practice. The research papers highlighted that GenAI tools can serve as a learning technology to support student learning outcomes. For example, students can engage the system to develop case studies, including solutions, on a particular subject and, thus, can critically reflect on the case studies. Further examples include using GenAI tools to provide a personalised learning experience, engagement, real-time interactivity, and feedback. These research papers emphasised the diverse applications, implications, and perspectives surrounding the integration of GenAI technologies into education, ranging from student modelling and feedback systems to language education research and beyond. This topic made up 11.84% of the entire number of documents retrieved and ranks third in terms of the highest proportion of documents.

• Topic 6: Machine learning/AI applications (25 research papers)
The research documents categorised in topic 6 presented the use of machine learning algorithms to develop AI applications. The papers investigated both the practical use and the challenges associated with GenAI technologies. For example, a study investigated how social work researchers can use such tools. Furthermore, another paper investigated the code generation performance of a system. This topic made up 7.05% of the entire number of documents retrieved.

• Topic 7: Performance evaluation on exam questions (30 research papers)
Research papers in this group examined the performance of GenAI tools on several examinations, specifically for medical education. Notable medical examinations studied were the plastic surgery graduate medical education examination, orthopaedic in-training examination questions, board-based questions on the Congress of Neurological Surgeons (CNS) self-assessment neurosurgery (SANS) exam, the American Board of Orthopedic Surgery examination, a fertility-related clinical prompt, the French language version of the European Board of Ophthalmology examination, Section 1 in the fellowship of the Royal College of Surgeons (trauma and orthopaedics) examination, the Peruvian national medical licensing examination, the Japanese medical licensing examination, the Japanese national examination for pharmacists, and the European Board of Ophthalmology (EBO) examination. Furthermore, a few studies experimented with GenAI systems as assistants to help in diagnosing and providing potential treatment suggestions for glaucoma and arthrosis. The papers also investigated whether there was agreement between GenAI and humans in terms of diagnosis and treatment suggestions. In addition, a few studies experimented with GenAI systems for teaching practice in the field of radiology. In total, the research documents in topic 7 amount to 8.46% of the entire number of documents retrieved.

• Topic 8: GenAI for writing (28 research papers)
Topic 8 consists of research documents that focus on opportunities for GenAI tools, specifically for academic and scientific writing. Some of the contexts discussed concern medical writing and drafting learning objectives. This topic made up 7.89% of the entire number of documents retrieved.

• Topic 9: Ethical and regulatory considerations (37 research papers)
The research documents in topic 9 discussed the challenges posed by GenAI tools in practice. The papers described the existing use of such tools in terms of timely responses/feedback and chatbots. However, they also discussed the issues and challenges such tools can bring with regard to student learning. A few studies explored the ethical implications of using GenAI for teaching and learning and discussed pedagogical approaches to effectively integrate such tools, while ensuring ethical use and promoting meaningful learning outcomes. More specifically, the challenges discussed were academic integrity, hallucinations, plagiarism, misleading information, critical thinking, ethical issues, and data privacy. Many studies agreed that there is a pressing need to examine the ethical implications and establish appropriate regulations. The studies discussed the critical issues related to patient care, privacy, and professional ethics within the context of medical and healthcare education, making this topic relevant and important for discussion. This topic made up 10.42% of the entire number of documents retrieved.

• Topic 10: Deep learning/AI models (44 research papers)
The research documents categorised in topic 10 discussed the use of deep learning algorithms to develop AI applications. A popular example was the use of AI models for predicting academic performance. This topic made up 12.39% of the entire number of documents retrieved, making it the second most discussed topic.
Overall, our TM results indicate that topics such as the implications of GenAI, GenAI for education and research, support systems, bias and inclusion, intelligent tutoring systems, machine learning/AI applications, performance evaluation on exam questions, GenAI for writing, ethical and regulatory considerations, and deep learning/AI models are the core themes of the research papers analysed. Our literature synthesis showed that a considerable number of studies reported the benefits of integrating GenAI tools to support personalised learning experiences, provide feedback, enhance student engagement, create learning activities, and support student learning outcomes [26]. The TM findings indicate that there are several proposed GenAI education policy frameworks for integrating these systems into HE. However, bias, gender inequality, misleading information, limitations to critical thinking, data privacy, plagiarism, and ethical and academic integrity are growing concerns that are well documented in the literature [10].

Conclusions
This paper aims to provide an overview of the current state and progress in research on GenAI for teaching and learning in HE, through a systematic literature review. For this purpose, we used bibliometric indicators and TM to synthesise the literature. In response to RQ1, our findings show that more journal articles (72%) were published than conference papers (28%) in this area. The results identified "Yogesh K. Dwivedi" as the author with the highest number of citations and the article "So what if ChatGPT wrote it? Multidisciplinary perspectives on opportunities, challenges, and implications of generative conversational AI" as the most cited research paper. Following the emergence of GenAI tools, specifically the release of ChatGPT in November 2022 [5], our results evidenced the exponential growth in publications (in 2023) in the area of GenAI in HE. Moreover, the analysis showed that the Journal of Applied Learning and Teaching (JALT), the International Journal of Information Management (IJIM), and the New England Journal of Medicine (NEJM) were the most cited journals, with the United States of America, Australia, China, and the United Kingdom being the leading countries in terms of authorship. The latter is consistent with the findings in previous studies [55][56][57][58][59][60].
Beyond the current status of scholarly work on GenAI in learning and teaching in HE, which emerged from the types of publication (conference/journal), the most cited authors/sources, and co-authorship by country, the findings also revealed the progression in the choice of keywords, based on the authors' keywords, as well as titles and abstracts. This sheds some light on the trend in the use and adoption of keywords and, hence, the direction of research pre- and post-GenAI (i.e., before 2022, when it was launched, and afterwards).
In response to RQ2, the authors' keyword analysis showed that academic integrity, assessment, chatbots, ChatGPT, education, ethics, generative pre-trained transformer, GPT-4, higher education, medical education, OpenAI, and prompt engineering are the topic trends. Similarly, the emerging themes identified in the title and abstract keyword analysis are incorrect answers, examination, medical student, medical education, LLMs, potential impact, role, and higher education. Furthermore, the TM results showed that the core themes are the implications of GenAI, GenAI for education and research, support systems, bias and inclusion, intelligent tutoring systems, machine learning/AI applications, performance evaluation on exam questions, GenAI for writing, ethical and regulatory considerations, and deep learning/AI models.

Implications, Limitations, and Recommendations for Future Work
Theoretically, we evidenced that TM is a suitable technique to complement keyword co-occurrence analysis, because it provides an automated and efficient means of distilling latent themes. Furthermore, our approach demonstrates an alternative method to content/thematic analysis when reviewing the literature. In practice, our results help HE, its stakeholders, and the research community to understand the current state of GenAI use. For example, the key topics identified, such as intelligent tutoring systems, bias, and ethical considerations, are critical areas of focus. The substantial literature on GenAI in medical education indicates its potential use across various disciplines. Academics and students must understand GenAI's limitations to leverage its strengths effectively.
This paper reviews only English-language literature, so important research published in other languages may have been left out. As a result, our study might have missed non-English topic trends in GenAI research.
To enhance inclusivity, it is crucial to expand representation across journals, incorporating non-English publications. Longitudinal studies are needed to monitor evolving GenAI research trends continuously. Further exploration of the ethical implications and mitigation strategies is necessary to ensure responsible usage. Stakeholder engagement, particularly feedback from students and educators, is essential for refining the development of GenAI tools and aligning them with actual needs. Additionally, investigating the integration of GenAI modules into educational curricula and assessments, while considering the ethical aspects, is vital for future research to ensure balanced knowledge and ethical usage. Overall, our results indicate that there is a significant amount of literature on the use of GenAI for medical education and, at the moment, this is disproportionate to other disciplines in HE. Therefore, the following recommendations are made:

•
Further to the keyword, title, and abstract analyses, studies that examine the performance of GenAI tools in medical and healthcare disciplines abound; similar studies are needed across other disciplines in HE. With such multidisciplinary and interdisciplinary studies, informed decisions on agreed guidelines towards the usage of GenAI systems in HE will emerge and, hence, the debate on GenAI will be well situated;

•
This study revealed the countries with the largest number of publications, with few or no publications from developing countries. We, therefore, recommend that future research in the area of GenAI be carried out through collaboration, especially in the Global South;

•
Academics need to understand the issues surrounding GenAI and develop strategies that will minimise its weaknesses but enhance its opportunities. Students also need to be aware of GenAI's limitations and shortcomings in terms of its unethical use and the implications for critical and analytical thinking, as well as the impairment of other soft skills. With this in mind, both tutors' and students' inputs will need to be successfully incorporated into GenAI tools for pedagogical practice;

•
The development of LLM-based chatbots is growing. More recently, Gemini has been released, which is said to outperform ChatGPT-4 in most NLP tasks. This is yet to be ascertained in the HE domain. Thus, we recommend an experimental comparison of these GenAI tools for teaching, learning, and assessment in terms of pedagogical practice;

•
To successfully incorporate GenAI tools into teaching and learning practice, there is a need for users' input and perspectives with an interdisciplinary scope. Thus, there is a need for research synthesis from students' and academic tutors' perspectives to formulate the use of GenAI tools for teaching and learning pedagogical practice;

•
Plagiarism detector systems like Turnitin have integrated AI content detectors into their systems. However, the performance of such systems is not yet known. There is a need to examine the performance of Turnitin (and similar systems) to understand the extent to which these systems can distinguish AI-generated from human-written texts across several disciplines in HE;

•
There is a need to update the curriculum in education [69]. However, this requires a proper understanding of the potential impact of GenAI tools on the current curriculum. At this stage, it is not yet known whether including modules like an "Introduction to GenAI" in the curriculum will provide a balance between knowledge, usage, and ethics;

•
To conclude, future research should focus on interdisciplinary studies to develop guidelines for GenAI usage in HE. Experimental comparisons of advanced GenAI tools like Gemini, and evaluations of the performance of AI content detectors in plagiarism systems, should be explored. Comparative studies should be conducted to accurately assess the effectiveness of GenAI tools in educational settings. Updating the curriculum and assessments to include GenAI topics, while assessing their impact on education, will be crucial for balanced knowledge and ethical usage.

Figure 1. A systematic literature search using the PRISMA framework.

Figure 3. Publications per year.

3.1.3. Citation per Source Title
Figure 4 presents the top ten most cited sources. The most cited source is the Journal of Applied Learning and Teaching with 301 citations, and this is very closely followed by the International Journal of Information Management with 291 citations. The tenth most cited source is the IEEE Journal of Biomedical and Health Informatics, with 71 citations. It is interesting to note that there is currently an equal split between education- and technology-focused journals in the ranked list. We posit that the adoption and impact of GenAI technologies in education for teaching, learning, and assessments will continue to grow.
the following topics, as detailed below. • Topic 1: Implications of GenAI (23 research papers)

Table 2. Inclusion and exclusion criteria.

Table 3. Total citations per year.
Figure 6. Publications and citations per year.