Analyzing the Effect of COVID-19 on Education by Processing Users’ Sentiments

COVID-19 infection has been a major topic of discussion on social media platforms since its pandemic outbreak in the year 2020. From daily activities to direct health consequences, COVID-19 has undeniably affected lives significantly. In this paper, we specifically analyze the effect of COVID-19 on education by examining social media statements made via Twitter. We first propose a lexicon related to education. Then, based on the proposed dictionary, we automatically extract the education-related tweets and also the educational parameters of learning and assessment. Afterwards, by analyzing the content of the tweets, we determine the location of each tweet. Then the sentiments of the tweets are analyzed and examined to extract the frequency trends of positive and negative tweets for the whole world, and especially for countries with a significant share of COVID-19 cases. According to the analysis of the trends, individuals were globally concerned about education after the COVID-19 outbreak. By comparing the years 2020 and 2021, we discovered that due to the sudden shift from traditional to electronic education, people were significantly more concerned about education within the first year of the pandemic. However, these concerns decreased in 2021. The proposed methodology was evaluated using quantitative performance metrics, such as the F1-score, precision, and recall.


Introduction
COVID-19 was first identified in late 2019, and it soon spread worldwide. Due to its rapid prevalence, the virus has become one of the most often discussed issues in the subsequent years. It has had a significant impact on several facets of people's lives, including health [1], tourism [2], education [3], and the economy [4].
In an attempt to limit the spread of COVID-19, people worldwide were quarantined at home. These quarantines and safety protocols were followed by the declaration of COVID-19 as a global pandemic by the World Health Organization (WHO) [5]. Establishments such as restaurants, schools, and other learning facilities were closed. As a result, teaching and learning at various levels of education transitioned from the traditional face-to-face method to online-based education [3]. Devices connected to the Internet, such as laptops, mobile phones, and tablets, are a major part of online learning or E-learning [3]. The closure of schools and universities, along with the change of the education system from traditional to online education, heightened people's concerns about this issue.
Twitter, as one of the most popular social media platforms, is a good resource for exploring the public's opinions and extracting information [6]. People actively shared their concerns about every topic, and especially education, on Twitter. The analysis of these tweets can contribute to understanding the feedback and response of people to the shift caused by COVID-19 on education. One can read and manually analyze tweets to investigate people's experiences about changes in the form of education, assessment, and virtual exams. However, nowadays, artificial intelligence and, more specifically, natural language processing techniques can be used to analyze text automatically. The first step towards this approach is to conduct text mining in order to segregate the tweets related to education, particularly to learning and assessment parameters. Traditionally, keywords were used to extract tweets; however, this method produces many errors in identifying related tweets.
This study reports an examination and assessment of the worldwide response to the effect of COVID-19 on education by processing user tweets. Using the Oxford Dictionary, we propose a lexicon of education-related words in order to identify educational tweets from a huge dataset of COVID-19-related tweets. Subsequently, by proposing lexicon-based methods, we extract the educational tweets, as well as the tweets related to the educational parameters of learning and assessment, from the dataset of tweets related to COVID-19. To identify the location of tweets we use a content analysis method [4,7]. In this method, a geographic database of place names is used to determine the location of the tweets. There are various methods, such as geotagging, for determining the location of tweets. However, in some cases, these methods may specify the location of the tweets incorrectly.
In order to get the opinions and thoughts of Twitter users about COVID-19 and education, we perform sentiment analysis for related tweets. In order to analyze the sentiments of the collected tweets, we use the language-based model of RoBERTa [8] and classify the tweets into two categories: positive and negative tweets. Then we analyze the frequency trends of all tweets, positive tweets and negative tweets, for the whole world, as well as for ten chosen countries. We also analyze the frequency trends of the tweets related to each of the two educational parameters of learning and assessment for the whole world separately. In the last step, we analyze the frequency trends of educational tweets for 2020 and 2021 (the period from 15 August to 15 September). In summary, the contribution of this research includes the following:
- The identification of educational tweets by proposing a lexicon-based method.
- The identification of the tweets related to the educational parameters of learning and assessment by proposing lexicon-based methods.
- The extraction and analysis of sentiment trends related to education.
The next sections of the article are organized as follows; in Section 2, the relevant works are reviewed; Section 3 introduces the proposed method; Section 4 details the analyses and findings; and lastly, Section 5 presents the concluding remarks.

Related Work
In the era of COVID-19, several studies [9] with different objectives have analyzed related tweets. Sentiment analysis is one of the most important processes that can be performed on tweets related to COVID-19 to extract people's thoughts and opinions regarding the issue. Several studies have focused on analyzing the sentiments of tweets related to COVID-19, including the classification of tweet sentiments into ten categories (positive, negative, anger, anticipation, disgust, fear, happiness, sadness, surprise, and trust) [10] or three categories (positive, negative, and neutral) [11][12][13]. Similarly, another study analyzed tweets related to Omicron SARS-CoV-2 and categorized the sentiment of the tweets into five categories ("Great", "Good", "Neutral", "Bad", and "Horrible") [14]. The top 10 topics in the English and Portuguese tweets of the United States and Brazil have also been identified and analyzed [15]. In previous studies, we analyzed users' sentiments from different countries in the first three months of the outbreak [6] and also investigated the impact of the pandemic on the economy [4].
In order to investigate the effects of COVID-19 on education through the processing of users' tweets, several studies have collected and analyzed related tweets. In order to analyze the Australian people's views on home study during the COVID-19 epidemic, 10,421 tweets were collected over three weeks, and their sentiments were classified into six categories (positive, negative, sense of humor, teacher appreciation, government/politician feedback, and compliments) [16]. Similarly, Indonesians' tweets about online learning were collected in October 2020, and their sentiments were analyzed in three categories (positive, negative, and neutral) [17]. Another study [18] collected and examined 17,155 related tweets to analyze sentiments and topics in tweets related to online education during COVID-19. Table 1 summarizes previous works. Previous studies have processed a small number of tweets over a short period of time. Approximating the users' response to the impact of COVID-19 on education requires longer periods of time and an analysis of a much larger number of tweets. For this reason, this article analyzes 15 million tweets posted in 2020 and 2021, since a larger sample is expected to yield a more representative result. Moreover, sentiment analysis is separately performed for ten countries to enable the comparison of users' attitudes across different countries. Finally, unlike previous studies, which were based on geotagging, we use a dictionary-based method to tag the location of tweets so that the location of tweets can be inferred more accurately.

Data Extraction and Analysis
A three-step method is used to analyze the effect of COVID-19 on education through the processing of users' tweets (Figure 1). In the first step, we pre-process a large dataset of tweets related to COVID-19 [19] and then determine their location. In the second step, we thematically extract the tweets using a lexicon-based method. In the third step, we analyze the sentiments of the tweets and classify them into two categories of positive and negative tweets.

Data Collection
Several studies, such as Refs. [19][20][21], collected datasets of tweets related to COVID-19 at different times. The dataset provided in Ref. [20] is old and does not include tweets from 2021. In this research, a dataset was prepared with the help of weekly and daily sampling from a comprehensive dataset containing tweets about COVID-19 [19]. The sampled dataset contains more than 15 million tweets related to COVID-19 for the time periods of March to June 2020 (the beginning of the COVID-19 pandemic), 15 August to 15 September 2020 (the beginning of the 2020 school year), and 15 August to 15 September 2021 (the beginning of the 2021 academic year). Because the dataset was sampled over different time periods, it makes it possible to observe how people's attitudes changed over time during the pandemic.

The lexicon-based method of our previous work [22] is used to determine the location of tweets. The method uses the GeoNames geographic database (containing over 25,000,000 different place names and geographic information) to create a list of place names for the countries with the highest cases of COVID-19. This list contains complete and detailed information about the names of the states, provinces, and cities of each of the countries in question. The compiled list contains more than 7000 place names that are used to determine the location of each tweet. The list is used as a Gazetteer list in a GATE pipeline [23]. Each tweet that mentions the name of a country's state, city, etc., is given the country's tag as its location. For example, the USA tag is assigned as the location for the following tweet: "Over 85% of corona cases in Michigan are located in Detroit and surrounding areas". Similarly, if any of the words in Figure 2 are mentioned in the text of a tweet, Pakistan is considered as the location of that tweet.
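The gazetteer lookup described above can be illustrated with a minimal Python sketch. This is not the GATE pipeline used in the study: the tiny `GAZETTEER` dictionary, the `tag_location` helper, and the rule of returning the country of the first matched place name are all assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical miniature gazetteer: place name -> country tag.
# The list described in the text has more than 7000 entries built from GeoNames.
GAZETTEER = {
    "michigan": "USA",
    "detroit": "USA",
    "new york": "USA",
    "lahore": "Pakistan",
    "karachi": "Pakistan",
    "pakistan": "Pakistan",
}

# One alternation pattern; longer names come first so multi-word names are preferred.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(name) for name in sorted(GAZETTEER, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def tag_location(tweet: str) -> Optional[str]:
    """Return the country tag of the first gazetteer name found in the tweet, or None."""
    match = _PATTERN.search(tweet)
    return GAZETTEER[match.group(1).lower()] if match else None


example = "Over 85% of corona cases in Michigan are located in Detroit and surrounding areas"
print(tag_location(example))
```

A tweet without any known place name yields `None`, which corresponds to the case where no location can be assigned from the text.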

Thematic Extraction of Tweets
To extract tweets related to education, we propose a lexicon-based method by creating a glossary of education-related words. For this purpose, we start from the full vocabulary of the Dictionary of Education from Oxford [24], which contains 1100 words. We apply the proposed lexicon to the existing database and calculate the precision and recall of the retrieved data. Then, by inspecting the output data, we remove the misleading words from the lexicon. After refining the lexicon, we run the method and calculate the precision and recall again. We repeat this process until acceptable precision and recall are achieved. In the end, the vocabulary of this lexicon was reduced to 134 words, which are presented in Figure 3. After compiling the dictionary, tweets related to education are extracted using the proposed lexicon-based method.
Learning and assessment have always been two important factors in education [22,25]. The importance of these two subjects, as well as the ever-growing concern about them, has become more relevant following the COVID-19 pandemic and the change in teaching and learning methodologies. Likewise, we use a lexicon-based method to extract tweets related to each of these educational factors. In doing so, from the words of the education lexicon, we select the words related to these two parameters and create a dictionary for each of them. The dictionaries related to the two parameters of learning and assessment consist of 12 and 13 words, respectively. In this way, we have a lexicon of related words for each parameter, the vocabulary of which can be seen in Figure 4. Using the compiled dictionaries, the educational tweets related to each of the two parameters can be identified.
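As a rough illustration of lexicon-based extraction, the following Python sketch filters tweets with an education lexicon and then flags the learning and assessment parameters. The word lists here are short hypothetical excerpts, not the 134-, 12-, and 13-word lexicons of Figures 3 and 4, and exact token matching is an assumed simplification.

```python
import re

# Hypothetical excerpts; the parameter lexicons are subsets of the education lexicon.
EDUCATION_LEXICON = {
    "school", "university", "teacher", "student",
    "learning", "curriculum", "exam", "assessment",
}
LEARNING_LEXICON = {"learning", "curriculum"}
ASSESSMENT_LEXICON = {"exam", "assessment"}


def tokens(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def classify(tweet):
    """Return None for non-educational tweets, else flags for the two parameters."""
    words = tokens(tweet)
    if not words & EDUCATION_LEXICON:
        return None
    return {
        "learning": bool(words & LEARNING_LEXICON),
        "assessment": bool(words & ASSESSMENT_LEXICON),
    }


print(classify("Will the final exam be online this year?"))
print(classify("Stay home and wash your hands"))
```

Because matching is on exact tokens, inflected forms such as plurals would need to be listed in the lexicon or handled by a stemmer, which this sketch omits.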

Sentiment Analysis
Sentiment Analysis (SA) or Opinion Mining (OM) is defined to be the "computational study of people's opinions, attitudes, and emotions toward an entity. The entity can represent individuals, events, or topics" [26]. We perform the sentiment analysis of the tweets using the sentiment classification model of RoBERTa [3]. This model is one of the recent models of sentiment analysis and improves on BERT. The reason for choosing it for sentiment analysis is its excellent performance, high speed, and accuracy [27].
Three labeled datasets, the Stanford Sentiment Treebank (67,300 samples) [28], SemEval 2015 Task 10 (6800 samples) [28], and SemEval 2015 Task 11 (3500 samples) [29], have been used to train the sentiment classification model. Since three datasets are used to train the model, a separate classifier must be considered for each. The input and output of this model are the text of a tweet (string) and the token representations h_t (t = 1, ..., T), respectively. Each of the three classifiers receives a vector representation of the tweet, H, as input; H is calculated from the token representations, with w_att considered as the attention weight and h_t as the bias.
The final tag of the tweet in the testing phase is determined by the majority vote of the three classifiers. In short, the RoBERTa model tags the tweets with positive content as one and tags the tweets with negative content as zero. As a result, the sentiments of the tweets are categorized into positive and negative groups.
After identifying the sentiments of the tweets using the language-based sentiment classification model of RoBERTa, we calculate the frequency of positive tweets and negative tweets for the whole world, and separately for each of the countries with a significant number of related tweets.
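The majority vote and the per-country frequency count can be sketched in Python as follows. The classifiers themselves are not reproduced; the sketch assumes each tweet already carries three binary predictions (1 for positive, 0 for negative), and the function names and record layout are hypothetical.

```python
from collections import Counter


def majority_vote(votes):
    """Return 1 if more than half of the binary votes are 1, otherwise 0."""
    return int(sum(votes) * 2 > len(votes))


def sentiment_frequencies(records):
    """Count positive and negative tweets for the whole world and per country.

    records: iterable of (country, [vote_1, vote_2, vote_3]) pairs.
    """
    freq = {}
    for country, votes in records:
        label = "positive" if majority_vote(votes) == 1 else "negative"
        # Every tweet contributes to the global count and to its own country.
        for key in ("world", country):
            freq.setdefault(key, Counter())[label] += 1
    return freq


records = [("USA", [1, 1, 0]), ("USA", [0, 0, 1]), ("India", [0, 0, 0])]
print(sentiment_frequencies(records))
```

With an odd number of binary classifiers a tie cannot occur, so no tie-breaking rule is needed in this setting.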

Evaluation Scheme
In this step, we assess the performance of the proposed method based on the most popular criteria, including accuracy, precision, recall, and F1 score. To describe these criteria, we consider the confusion matrix presented in Table 2. The evaluation criteria (precision, recall, and F1 score) can be calculated according to the confusion matrix as follows:

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}, \qquad \text{F1} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$

where TP, FP, and FN denote the numbers of true positives, false positives, and false negatives in the confusion matrix. In order to calculate the evaluation criteria, we first randomly collect 1000 related tweets from the entire database by sampling. We execute the proposed method on this sampled dataset. We also manually and independently annotate this sampled dataset to serve as a benchmark. Then, using the above equations, the evaluation criteria for different sections of the proposed method are calculated. The results are shown in Table 3.
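For illustration, the standard definitions of precision, recall, and F1 score can be computed from manual annotations and method output with a few lines of Python. The function name and the toy labels below are hypothetical and are not taken from the study's 1000-tweet benchmark.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall, and F1 for binary labels (1 = relevant, 0 = not)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)        # true positives
    fp = sum(1 for g, p in pairs if not g and p)    # false positives
    fn = sum(1 for g, p in pairs if g and not p)    # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = [1, 1, 1, 0, 0]        # manual annotation (toy example)
predicted = [1, 1, 0, 1, 0]   # output of the extraction method (toy example)
print(precision_recall_f1(gold, predicted))
```

The guards against zero denominators return 0.0 when the method retrieves nothing or when the sample contains no relevant tweets, which is one common convention.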

Discussion and Findings
To consider the variety of languages in the tweets extracted from different locations in the world, we used the GATE software to implement the lexicon-based methods. GATE is a software program capable of processing a variety of languages. It is used to develop the software components of natural language processing [30]. Using a GATE pipeline, in which the lexicon of places is the Gazetteer list, a place name is determined for each tweet. We also utilize the GATE software to extract tweets related to education and tweets related to each of the two parameters of learning and assessment.
In what follows, the results and findings of this study are presented.

Sentiment Analysis Trends
This section analyzes the trends in the number of education-related tweets in the first three months of the global outbreak. Figures 5 and 6 demonstrate the frequency of all education tweets, positive tweets, and negative tweets, as well as the official statistics of the COVID-19 cases for the whole world and for the ten chosen countries, respectively. The vertical axis on the left represents the official statistics of the patients, and the vertical axis on the right denotes the frequency of tweets.
What is evident in the figures is that over the chosen time period, as the number of patients increases, so does the number of tweets related to COVID-19 and education. Moreover, most of the related tweets posted have negative content. To clarify the reason for the negative sentiment of the related tweets, we examined the text of the tweets in the weeks with the most negative tweets. As a result, we found that people's main concerns were often about how to graduate, exam cancelation, and school and university closure. Therefore, it can be concluded that following the outbreak of COVID-19, people around the world were concerned about their education or that of their children. The studied time period is the onset of the global COVID-19 pandemic, and the higher number of tweets with negative content is indicative of people's concern and anxiety.
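A frequency trend of this kind can be computed by grouping labeled tweets into weeks. The sketch below is an assumption about how such an aggregation might look; the use of ISO calendar weeks, the `weekly_trend` name, and the toy data are not taken from the study.

```python
from collections import Counter, defaultdict
from datetime import date


def weekly_trend(tweets):
    """Count all, positive, and negative tweets per ISO week.

    tweets: iterable of (date, label) pairs, label being "positive" or "negative".
    Returns a dict keyed by (ISO year, ISO week), sorted chronologically.
    """
    trend = defaultdict(Counter)
    for day, label in tweets:
        iso = day.isocalendar()
        week = (iso[0], iso[1])
        trend[week][label] += 1
        trend[week]["all"] += 1
    return dict(sorted(trend.items()))


sample = [
    (date(2020, 3, 2), "negative"),
    (date(2020, 3, 4), "positive"),
    (date(2020, 3, 9), "negative"),
]
for week, counts in weekly_trend(sample).items():
    print(week, dict(counts))
```

Sorting the weekly counts by their negative frequency would then point to the weeks whose tweets are worth reading manually.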

Analyzing Tweets from the Perspective of Educational Parameters
This subsection analyzes the frequency trend of tweets related to each of the two parameters of learning and assessment. Figure 7 displays the frequency of learning and assessment tweets for the whole world.
During the onset of the COVID-19 pandemic, when education changed from traditional to electronic methods, assessment-related tweets gradually outnumbered learning-related ones. In June especially, the number of tweets posted about assessment increased significantly. Examining the text of related tweets, we find that people's discussion and worry were about how the exams were going to be held.

Comparing the Trend of Tweets in 2020 and 2021
This section compares the frequency of education-related tweets for a one-month period at the beginning of the school year (15 August to 15 September) in 2020 and 2021. Figure 8 shows the frequency of education-related tweets in 2020 and 2021.
Comparing the years 2020 and 2021, it is clear that the frequency of education tweets at the beginning of the 2020 school year was much higher than that of 2021. This reveals that people were much more concerned about education in the first year of the global epidemic of COVID-19 and the sudden shift from traditional to electronic education. In contrast, in 2021, those concerns had largely eased.
Figure 6. Frequency of all, positive, and negative tweets, and official COVID-19 cases for several countries.
Big Data Cogn. Comput. 2023, 7.

Conclusions
In this paper, we investigated the response of individuals to the effect of COVID-19 on education by extracting people's sentiments about education on social media, particularly Twitter. In doing so, we used a three-step approach: data collection, thematic extraction of tweets, and sentiment analysis, and investigated the two parameters of learning and assessment. Next, we calculated the precision, recall, and F1 score to evaluate the extraction of education-related tweets, the extraction of learning-related tweets, and the extraction of assessment-related tweets, and obtained F1-scores of 0.721, 0.827, and 0.77, respectively, which are reasonable considering the numerous features and parameters of text processing.
The results indicate that in the world, and in most of the studied countries, the majority of tweets had negative content, suggesting the intense concern of people for their own education or that of their children during the epidemic. Moreover, the frequency of education tweets in 2020 (between 15 August and 15 September) was consistently greater than the frequency of education tweets in 2021. Indeed, in the second year of the COVID-19 epidemic (the beginning of the 2021 school year), public concern and debate had largely subsided, which could be due to e-learning maturity or satisfaction with it.
One limitation of this work is that only English-language tweets were processed and analyzed. In addition, location extraction might not work for all tweets, especially when there is no location name in the text, or the name is not known to our lexicon. Other limitations include the lack of enough tweets for many countries and the restriction of the analysis to only ten countries.
To more accurately examine the reasons for negative and positive sentiments at different times, we can perform topic modeling on negative and positive tweets. As a result, the negative and positive topics discussed can be identified more comprehensively. As another future direction, the countries in question can be clustered with the help of their sentiment analysis trends by applying the clustering methods for time series data.

Data Availability Statement:
The dataset generated during the current study is available from the corresponding author upon reasonable request.

Conflicts of Interest:
The authors declare no conflict of interest.