Teaching and Learning during the COVID-19 Pandemic: A Topic Modeling Study

Abstract: The coronavirus disease 2019 (COVID-19) pandemic caused significant disruption to teaching and learning activities at all levels. Faculty, students, institutions, and parents have had to rapidly adapt and adopt measures to make the best use of available resources, tools and teaching strategies. While many online teaching pedagogies had been explored, theoretically and practically, to a limited extent, the scale at which these were deployed was unprecedented. This has led a large number of researchers to share challenges, solutions and knowledge gleaned during this period. The main aim of this work was to thematically model the literature related to teaching and learning during, and about, COVID-19. Abstracts and metadata of literature were extracted from Scopus, and topic modeling was used to identify the key research themes. The research encompassed diverse scientific disciplines, including social sciences, computer science, and life sciences, as well as learnings in support systems, including libraries, information technology, and mental health. The following six key themes were identified: (i) the impact of COVID-19 on higher education institutions, and challenges faced by these institutions; (ii) the use of various tools and teaching strategies employed by these institutions; (iii) the teaching and learning experience of schools and school teachers; (iv) the impact of COVID-19 on the training of healthcare workers; (v) the learnings about COVID-19, and treatment strategies from patients; and (vi) the mental health of students as a result of COVID-19 and e-learning. Regardless of the key themes, what stood out were the inequities in education as a result of the digital divide. This has had a huge impact not only in middle- and low-income nations, but also in several parts of the developed world. Several important lessons have been learned, which, no doubt, will be actively incorporated into teaching and learning practices and teacher training. Nonetheless, the full effect of these unprecedented educational adaptations on basic education, expert training, and mental health of all stakeholders is yet to be fully fathomed.


Introduction
The coronavirus disease 2019 (COVID-19) pandemic has been a major disruptor of all aspects of modern life. Educational systems at all levels had to be suspended or moved online due to lockdown and social distancing rules, which aimed at limiting the spread of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) that causes COVID-19. This has significantly affected communities with limited technological infrastructure, access to high-speed internet and smart device penetration [1,2]. Despite this, educational institutions, from pre-school to higher education, in many parts of the world, had to adapt and rapidly adopt online and distance learning models that were enabled by information and communications technology (ICT). While the fundamentals and effectiveness of many of the underlying models (e.g., online learning, distance learning, distributed learning, blended learning, and mobile learning), tools and platforms have been studied to some extent in the past [3], the scale and scope of this adoption has been unprecedented. Educators had, and continue, to go to extraordinary lengths to ensure that quality and content can be delivered effectively and that learning outcomes are met. This has brought forth unique opportunities to study these models, their effectiveness and adoption in a range of disciplines, from social sciences to life science, and healthcare to business management. Apart from this, learnings could also be gleaned from the perspective of the student in a broad range of aspects, including the achievement of learning outcomes and the mental health of learners. Since the beginning of the pandemic, apart from media coverage, blogs and social media posts, researchers and educators have attempted to consolidate their experiences in a large body of academic peer-reviewed publications.
Several in-depth and focused reviews have also attempted to cover specific aspects that are related to teaching and learning during this period [4][5][6]. However, the breadth of this experience has not been clearly catalogued and classified. This work is an attempt at qualitatively bringing together peer-reviewed literature in the broad field of teaching and learning during the COVID-19 period, irrespective of the discipline and topics, to identify key themes in the published literature. For this, a topic modeling approach was adopted that analyzed the abstracts of teaching and learning publications during the COVID-19 period, covering 2020 and 2021, to categorize and extract key themes. COVID-19 has significantly impacted education and educational systems. To a certain extent, many of these approaches and themes are now very likely to gain widespread adoption beyond the COVID-19 period.
Topic modeling is a field of natural language processing that aims to extract themes by text mining a set of documents [7]. It finds application in a number of different fields, including image processing [8], bioinformatics [9,10], and social networks [11]. The primary aim of this study was to identify key research themes in the literature that are related to teaching and learning during, and about, COVID-19, using topic modeling, and not to comprehensively review the themes nor the representative work cited under each of the identified themes.

Methods
To identify research articles published during the COVID-19 pandemic that are related to teaching and learning, the Scopus (https://www.scopus.com; Accessed on 14 May 2021) database was used. Abstracts and metadata of research articles were retrieved from Scopus for the search string "teaching COVID-19" based on a search performed on 14 May 2021. COVID-19-related educational publications first appeared in 2020. Therefore, the literature included in this study covered articles that appeared in 2020 and 2021, up to 14 May 2021. The dataset was curated to remove textual errors and missing data. Retracted papers were excluded.
For the collection of abstracts obtained, topic extraction was performed using a workflow created in KNIME 4.3.2. The document preprocessing stage consisted of several steps, as shown in Figure 1. First, records without abstracts were removed, as were section titles commonly found in abstracts (e.g., aims, objectives, purpose, conclusion). Next, each document was processed using the bundled text processing nodes of KNIME, which removed punctuation marks, numbers, words with fewer than 3 characters, stop words and markup tags. The Stanford part-of-speech (POS) tagger and Stanford lemmatizer, which are part of the Stanford Core Natural Language Processing library [12], were used to identify the POS of words in the document and to obtain the lemma of these words. Finally, low-frequency terms, occurring in less than 2% of the documents, and high-frequency terms, with an inverse document frequency of less than 0.6, were excluded from further processing.
The elbow method was used to identify the optimal number of topic clusters in the documents. In cluster analysis, a measure that describes the observed variation in a dataset is plotted against the number of clusters, and the number of clusters at which a significant change in this measure is observed is often defined as the elbow. In this instance, principal component analysis was performed using the matrix of documents and keywords, and subsequently topic clusters of 1 to 25 were evaluated to compute the sum of squared errors [13]. The optimal number of clusters in the dataset was arrived at using the "elbow" observed in the plot of topic clusters versus sum of squared errors, where there was the sharpest fall in the sum of squared errors.
Latent Dirichlet Allocation (LDA) is an approach widely used for assigning documents to topics in topic modeling [14]. It is based on the principle that every document is made of words and that higher-level topics are also made of grouped words. LDA attempts to compute a score of the probability that a document could belong to any one of the identified topics. Topics were extracted from the documents using the LDA method as implemented in KNIME. The top 30 terms identified in each topic were used to generate a word cloud, a visualization of the terms in a cluster in which the size of a term is proportional to the frequency of occurrence of that word in the documents for that topic.
Article metadata, including subject area, author-supplied keywords and author countries, were extracted and aggregated. Data were plotted in R 3.6.3 [15] using the data visualization packages ggplot2 3.3.3 [16], wordcloud2 0.2.1 and maps 3.3.0.
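The elbow selection described in this section can be illustrated with a minimal Python sketch. It is not the KNIME workflow used in the study: it assumes a plain k-means clustering as the source of the sum of squared errors, omits the principal component analysis step, and picks the elbow as the cluster count with the largest second difference of the error curve, whereas the study read the elbow off a plot. The function names `kmeans_sse` and `elbow_index` are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_sse(points, k, iterations=50, seed=0):
    """Run a basic k-means and return the within-cluster sum of squared errors.

    points: list of equal-length tuples of floats; k must not exceed len(points).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                mean = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


def elbow_index(sse_by_k):
    """Return the cluster count (1-based) where the SSE curve bends most sharply.

    sse_by_k[0] is the SSE for k = 1, sse_by_k[1] for k = 2, and so on.
    Assumption: the elbow is the k with the largest second difference of the
    curve. Returns None when fewer than three values are supplied.
    """
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(sse_by_k) - 1):
        drop_before = sse_by_k[i - 1] - sse_by_k[i]
        drop_after = sse_by_k[i] - sse_by_k[i + 1]
        bend = drop_before - drop_after
        if bend > best_bend:
            best_k, best_bend = i + 1, bend
    return best_k
```

In use, one would compute `kmeans_sse(points, k)` for each k from 1 to 25 on the document vectors and pass the resulting list to `elbow_index`. Because k-means depends on its initial centroids, repeating the run with several seeds and keeping the lowest error for each k is a common precaution.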

Analysis of Literature Metadata
The search for research articles based on the string "teaching COVID-19" in Scopus produced 3461 documents. The distribution of these articles by Scopus subject area indicated that the largest number of articles were published in social science journals, followed by medical and computer science journals (Figure 2A). It is evident that specialized science and engineering journals also published articles about teaching and learning during the COVID-19 period. Geographically, these articles originated from a number of different countries. The top 15 author affiliation countries are displayed in Figure 2B. In line with general publication trends, the United States was on top of the list, featuring 17% of all authors. Countries from North America and Western Europe featured prominently in this list. Additionally, several Asian countries, including India (5.0%), China (4.4%), Indonesia (3.0%), Malaysia (2.0%) and Saudi Arabia (1.8%), also featured in this list. Representation from South America (Brazil: 2.0%) and Africa (South Africa: 1.6%) was limited.
The top 50 keywords provided by the authors in their articles are shown in Figure 3. Clearly, the primary keywords used by the authors relate to online or distance teaching/learning, curriculum, and pedagogy. However, specific fields, such as medical education, also feature strongly. Notably, anxiety, stress, and mental health have also been addressed in the literature.
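A keyword ranking of this kind can be reproduced with a short aggregation step. The Python sketch below is illustrative only and is not the R code used for Figure 3. It assumes each record carries its author keywords in a single semicolon-separated string, and that keywords should be compared case-insensitively. The function name `top_keywords` is hypothetical.

```python
from collections import Counter


def top_keywords(keyword_fields, n=50, separator=";"):
    """Count author keywords across records and return the n most frequent.

    keyword_fields: one string per article holding that article's author
    keywords (assumed to be separator-delimited); empty or missing fields
    are skipped.
    """
    counts = Counter()
    for field in keyword_fields:
        if not field:
            continue
        # Normalise case and whitespace so "COVID-19" and " covid-19 " match;
        # the set stops a keyword repeated within one article counting twice.
        keywords = {kw.strip().lower() for kw in field.split(separator)}
        keywords.discard("")
        counts.update(keywords)
    return counts.most_common(n)
```

Counting each keyword once per article means the totals reflect the number of articles using a keyword, which is one reasonable reading of "top keywords". Counting raw occurrences instead would only require replacing the set with a list.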

Identification of Topics and Themes
To identify the key themes in the teaching and learning literature that was published during, and about, COVID-19, topic modeling was undertaken using the abstracts of the 3461 articles. Prior to this, the optimal number of topics to extract was determined based on the elbow method (Supplementary Figure S1). An optimal number of 6 topics was arrived at, based on the steepest fall in the sum of squared errors in the plot. Subsequently, the LDA method was used to assign a score to each document for the 6 topics, and then finally to assign it to one of these. The top 30 terms identified in each of the 6 topics are provided in Table 1. The distribution of documents, based on their classification into one of the 6 topics, is provided in Figure 4A. Word clouds that were generated based on the weight of the top 30 words (Table 1) in each topic are depicted in Figure 4B-G.
The largest proportion of research publications (28.2%; Figure 4A) dealt with the challenges faced by higher education institutions (HEIs), primarily universities, as is evident from their prominent presence in the word cloud (Figure 4B). This featured research covering institutions from far-flung nations, including the United Arab Emirates, Brunei, Mexico, Afghanistan, Saudi Arabia, Peru, Argentina, India, Nigeria, Ethiopia, Bangladesh, and Turkey. A large fraction of the research discussed the situation at institutions in Australasian and African countries, as indicated by Figure 5A.
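Shares such as the 28.2% quoted above follow from assigning every document to a single topic and counting the documents per topic. The Python sketch below shows one way this could be done. It assumes that each document is assigned to the topic with its highest LDA score, which the text does not state explicitly, and the function names `assign_topics` and `topic_shares` are hypothetical rather than part of the KNIME workflow.

```python
from collections import Counter


def assign_topics(doc_topic_scores):
    """Assign each document to the topic with its highest score.

    doc_topic_scores: one list of per-topic scores per document.
    Returns 0-based topic indices; ties go to the lowest index.
    Assumption: highest score wins, as the assignment rule is not given.
    """
    return [max(range(len(scores)), key=scores.__getitem__) for scores in doc_topic_scores]


def topic_shares(assignments, n_topics):
    """Return the percentage of documents assigned to each topic."""
    counts = Counter(assignments)
    total = len(assignments)
    return [100.0 * counts.get(topic, 0) / total for topic in range(n_topics)]
```

For example, four documents scored over three topics as `[[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.5, 0.4, 0.1], [0.2, 0.2, 0.6]]` are assigned to topics 0, 1, 0 and 2, so the first topic holds half of the documents.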
At the onset of the pandemic, institutions and educators had to abruptly switch to online teaching and learning activities. In the absence of precedence and guidelines, this was a major undertaking. Early on, several articles provided assistance and guidelines on how faculty and institutions could effectively tackle this [17,18].
One of the striking conclusions was the effect of the digital divide, and how that severely hampered the delivery of even the basic form of education in several parts of middle- and low-income countries of the world [1,2]. This unprecedented situation has now brought forth significantly relevant disparities in Education 3.0, which is underpinned by virtual learning and access to the Internet, as well as the proposed Education 4.0, assisted by the fourth industrial revolution [19].
Academic libraries have often been the sanctuary of higher education students. A lot of content has now been digitized and made accessible electronically in many parts of the world, through agreements entered into by libraries with academic publishers. However, once again, the impact of access to these resources in middle- and lower-income countries has been extremely limited. Several articles featured how librarians coped with this situation, from countries ranging from Nigeria [20] to the future-ready library at the University of Wollongong [21]. Indeed, this is an opportunity to revisit how academic and scientific literature could be effectively and equitably made available.
The prolonged period of online and distance learning also saw research on what "post-coronial" higher education would look like [22]. Information security has always been a key concern involving digital communication, and this has been significantly accentuated with the tremendous increase in information exchange online, especially in the education sector, during this period [23].

Universities Used Various Tools to Deliver Courses during This Period (Topic 1)
The second major theme related to how universities and faculty adapted and delivered course content during this period [24]. This is emphasized by the prominent presence of the word "course" in the word cloud for this topic (Figure 4C). Most of the online and live sessions were provided through collaborative tools, including Microsoft Teams, Google Classroom, Zoom, and Blackboard Collaborate. Apart from virtual instruction sessions, faculty developed, and encouraged students to produce, videos that were relevant to theoretical topics and laboratory activities [25]. In some instances, these videos were also supplemented with augmented reality, to provide an immersive experience to the students [26].
While theoretical material could potentially be delivered effectively using virtual sessions, delivering laboratory sessions remotely is quite challenging. Practical sessions are hands-on by definition and complement the theoretical material. Faculty dealt with this in a number of ways, including live demonstrations, recorded videos of sessions, simulations, and animations [27]. 3D printing was also reported to have been used for specific installations [28]. In some instances where fieldwork is crucial, for example ecology, students were encouraged to perform independent fieldwork and contribute via online science platforms [29]. Interestingly, this work reported that the student experience was on par with how this would be conducted normally. In several reports, faculty shared their experience with designing and delivering lab-based courses in a number of disciplines, including general chemistry, biochemistry, organic chemistry, thermal engineering, physiology, molecular biology, marine invertebrates, and histopathology. Interestingly, chemistry-based lab courses feature prominently in several reports, which is also evident in the word cloud for this topic (Figure 4C).
Hand-in-hand with delivering content, the assessment and evaluation of learning outcomes also raised several difficulties. Designing and developing effective online evaluation mechanisms became an urgent priority. Gamification has been deployed as a potential strategy to engage students and, at times, to evaluate them [30]. There are contradictory reports on how students performed during this period when compared to regular assessments [31,32]. Reference to specific countries is low in this topic, as indicated by the word cloud in Figure 5B, with a few reports referring to experiences from Asia and Europe.

Teaching and Learning Experience of Schools and School Teachers (Topic 6)
This set of reports relates to the experience of teachers in a school setting. This is emphasized by the keywords "teacher" and "school", which dominate the word cloud in Figure 4D and are the largest among the word clouds of all the topics. A number of reports highlighted the challenges, steep learning curve and, thus, the tremendous effort that school teachers had to put in to prepare themselves to deliver courses at a distance, using modern ICT [33][34][35]. Traditional teacher training may not necessarily cover these, and many schools were ill-equipped or ill-prepared to deal with the situation [36]. However, the vast majority of reports share how these were overcome, and recommend the incorporation of such tools and practices in future teacher training programs [37,38].
Once again, aligning with the challenge faced by HEIs, the digital divide and disparities in the access to education have been brought forth in this area as well [39]. Upon reflection, it appears that the integration of ICT in developed countries has not reached levels as envisioned by UNESCO and OECD [40].
Remote schooling has had an impact on the mental health of all stakeholders, including teachers, parents, and students [41]. This is particularly demanding for younger students and children with special needs [42]. Parental engagement was reported to be key in effectively imparting remote learning at the school level [36].
Apart from challenges that were faced by general schooling, a number of reports discussed the impact on the teaching of non-native foreign languages, including English, Malay, Chinese, Portuguese, and Russian [43][44][45][46]. Surprisingly, the literature under this theme is dominated by case studies from several European countries, and a few isolated reports from Asian countries. Thus, the true impact on school education in middle- and low-income countries is yet to be fathomed and reported.

How COVID-19 Impacted Medical and Nursing Training (Topic 5)
Healthcare is a vital pillar of society. Well-trained healthcare professionals are, therefore, a fundamental necessity. This theme encompassed the literature covering the impact of COVID-19 on medical, pharmacy and nursing training. As evident from the word cloud (Figure 4E), this theme is dominated by reports that are related to the training of these professionals. Much of the literature was based on the responses to surveys and questionnaires.
Residency is a key part of medical training and face-to-face access to patients is absolutely essential for this. Much of this was significantly curtailed during this period [47]. Related reports from a number of specialties, including obstetrics and gynecology, urology, ophthalmology, pediatrics, gastroenterology, dermatology, cardiology, dentistry, radiology, anesthetics, and emergency medicine, have been documented [48][49][50][51] from nearly all populated continents of the world (Figure 5D). Surgical residents have also been severely impacted by the cancellation of non-essential procedures during this period [52]. Training in pharmacy, palliative care, family medicine, and public health medicine has also been affected [53][54][55]. The real impact of the training and assessments, or the absence thereof, on the residents of this period is yet to be fully comprehended [56][57][58]. Mental healthcare has also been hit in middle- and low-income countries, where it may not be a priority even in normal times [59].
Many residency programs have had to adapt by including virtual teaching rounds, virtual readouts, videos of surgeries, simulators, and online case presentations [49,60]. Nursing staff, by virtue of their role, require close contact with patients. Their training, and access to patients, have also been restricted during this period [61].
This period also saw a strong engagement of telehealth or telemedicine tools, and strategies for remote monitoring and diagnosis in medical education [62][63][64].

Learnings about COVID-19 from Patients and Studies on Treatments (Topic 4)
The key theme of this topic was a deeper understanding of COVID-19 from patients, especially those admitted to hospitals. This is indicated by the prominent presence of "patient" in the word cloud (Figure 4F). Though, strictly speaking, this is not related to education in the traditional sense, these reports do educate practitioners and residents about COVID-19 and SARS-CoV-2, and enable them to make informed decisions. A number of these reports covered complications, risk factors, biomarkers, and links to mortality among various demographics and individuals with underlying medical conditions [65][66][67][68]. Additionally, the effectiveness of various drug and supporting treatments in patients, including antivirals, antibiotics, antimalarials, and steroids, has also been reported [69][70][71][72].
Another set of studies covered the transmission dynamics of the virus among healthcare workers who are at the forefront of this pandemic [73][74][75][76]. Training healthcare workers in the effective use of personal protective equipment (PPE) to protect themselves and to prevent transmission has also been covered in the literature [77][78][79].
Reports on this theme provide specific case studies from China, several European countries, as well as a few Asian and African countries (Figure 5E).

How COVID-19 and E-Learning Impacted Anxiety and Stress Level of Students (Topic 3)
No doubt, COVID-19 has resulted in a significant spike in the anxiety and stress levels of most people. This is also very evident in students. Reports on this theme cover literature on the effect of online learning and assessments on the stress levels of students, and, in general, their mental health. Several studies, predominantly using case studies featuring Asian nations (Figure 5F), attempted to evaluate this. Indeed, this was one of the few themes that was dominated by researchers from institutions in Asia (Supplementary Figure S2F). Much of this, however, was gleaned from questionnaires and surveys, as evident from these words featuring prominently in the word cloud (Figure 4G).
The impact of COVID-19 on mental health has been much talked about. Several studies, from a range of disciplines, including medical, nursing, and hospitality, and other HEIs, from a number of countries, have attempted to assess this in students [80][81][82][83][84][85]. Student anxiety and stress were triggered both by technical limitations and by concerns about assessments and the achievement of learning goals [80,82,86,87,88,89]. Surveys also evaluated student satisfaction with online course delivery. Not surprisingly, medical students were less satisfied with online modes of course delivery for clinical training, exams, and assessments in several countries [90][91][92]. Despite this, many believe that such teaching pedagogies will have a much larger role to play in the post-coronavirus period [93]. However, in general, the surveys produced a mixed bag of results. While some surveys indicated that students had a positive attitude towards remote learning, others indicated that students did not prefer online teaching to face-to-face teaching during this period [82,94,95].
A switch to online learning has significantly increased student screen time. Research has documented several eye-related problems among students during this period [96].

Conclusions and Future Perspectives
The effect of the COVID-19 pandemic has been unprecedented. Teaching and learning activities have had to be significantly altered in light of the change in circumstances. While the groundwork for the pedagogical strategies had been laid well before the pandemic, the pandemic forced the accelerated adoption of these models on a massive scale. Researchers have also used this opportunity to share their learnings from a range of disciplines. Topic modeling effectively classified the teaching and learning literature into the following six main themes: the impact and challenges imposed by COVID-19 on universities, tools and strategies employed by HEIs to overcome these, experiences of schools and school teachers, impact on medical and nursing training, learnings about COVID-19 from patients, and, finally, how this led to anxiety and stress in students. While there are several success stories, there are also several important messages for the post-coronavirus period. Firstly, never before has the digital divide been laid bare as it has been during this period. As we talk about Education 3.0 and Education 4.0, even basic and reliable internet access has been found badly wanting in many parts of the world, including developed economies. Secondly, in the past, teachers and schools have not been exposed to many of the required pedagogies, and many have had to navigate a steep learning curve to be able to provide basic lessons. The incorporation of these must, no doubt, be part of future teacher training programs, and woven into the institutional fabric. Thirdly, while a lot of the literature covered the impact on, and strategies used by, HEIs around the world, the literature on the impact on schools and the strategies they adopted is largely limited to the developed economies of the world. Much more work needs to be conducted to fathom the true impact on schools in middle- and low-income countries, where an overwhelming majority of school-going children reside.
COVID-19 has had a profound effect on healthcare and healthcare workers, including their health and training. The true effect of this period on the education of students at all levels is yet to be fully realized. The effect on the mental health and well-being of students can also not be fully fathomed yet. Many of the existing assessments of mental health in this area are cross-sectional; thorough, well-thought-out longitudinal studies could perhaps shed light on this in the future. This is particularly relevant since there is a dearth of studies on the effect on the social, mental and educational development of preschoolers and kindergartners during this period. Teachers and faculty have also had to cope with a lot of change during this period. However, the effect on stress and potential burnout needs to be closely monitored and assessed, to formulate effective policies while adopting these pedagogies.
The lessons learned during, and about, COVID-19 will continue to be published, and many of these will eventually be assimilated into mainstream teaching pedagogies once things return to the new normal.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/educsci11070347/s1, Figure S1: Plot of number of clusters versus sum of squared errors obtained from principal component analysis, Figure S2: Distribution of the countries of authors of articles in each of the identified topics.
Funding: This research received no external funding.