Public Opinions about Online Learning during COVID-19: A Sentiment Analysis Approach

Abstract: The aim of this study was to analyze public opinion about online learning during the COVID-19 (Coronavirus Disease 2019) pandemic. A total of 154 articles related to online learning, drawn from online news and blogging websites, were extracted from Google and DuckDuckGo. The articles were extracted over 45 days, starting on 11 March 2020, the day the World Health Organization (WHO) declared COVID-19 a worldwide pandemic. We applied the dictionary-based approach of the lexicon-based method to perform sentiment analysis on the articles extracted through web scraping, and calculated the polarity and subjectivity scores of each article using the TextBlob library. The results showed that over 90% of the articles were positive and the remainder were mildly negative. In general, the blogs were more positive than the newspaper articles, but they were also more opinionated.


Introduction
The unprecedented closure of educational institutions due to COVID-19 resulted in over 90% of students out of school by 30 March 2020 [1]. In the 54 countries of the Commonwealth, an estimated 574 million students were out of school by 15 May 2020 [2]. During this crisis, while several governments responded by using a range of tools (radio, television, printed text, Internet) to continue supporting teaching, online learning emerged as a predominant method of teaching and learning. A global overview of interruption of education due to COVID-19 covering 31 countries showed that such measures taken exacerbated social injustice, inequality, and digital divide [3] and highlighted the need for "a pedagogy of care, affection, and empathy" [4].
Developing high-quality online courses requires skill and resources, in addition to an understanding of pedagogical approaches suitable for the online environment. These were in short supply during the pandemic [5]. Daniel [6] noted that during the crisis it is more important to focus on doing than to strive for perfection or to try to learn pedagogy or technology overnight. Teachers used many innovative approaches to support students during this period. In the Philippines, they used Facebook groups to build solidarity, formulate strategies, and raise donations to support students [7]. In practice, however, there are concerns about fraud and privacy in online examinations, and teaching practical courses online has become a challenge [5].
The growing focus on online learning has resulted in exponential growth in online courses, mostly offered through video conferencing solutions, and this growth has brought challenges. There are concerns about the digital divide in many countries [8], and many people are not enthusiastic about online courses [5] during the pandemic. However, the EdTech industry has treated the crisis as a business opportunity [9]. Many EdTech companies, educational businesses, and technology philanthropists have invested in taking new or existing products to scale [10]. This has accelerated the commercialization of education, creating a "seller's market" [11]. Some researchers also see the use of online learning during COVID-19 as a global EdTech experiment [12,13] that will lead to a better understanding of the efficacy of online learning and the transformative power of digital learning. On the other hand, Zhao [14] recommends reimagining the what, where, and how of learning during COVID-19, not just as a reaction but as proactive thinking to meet the demands of the 4th Industrial Revolution.
Popular media shape policy; in particular, "mainstream media coverage is an important influence on the governmental agenda" [15] (p. 581). Of late, new media (blogs) have emerged as a platform for the expression of public opinion that sets the popular agenda [16], and social media (micro-blogging) play an important role in setting the policy agenda after a catastrophic event [17]. Medhat et al. [18] mentioned that people express their feelings or opinions about a specific topic or product in online news articles, micro-blogs, and blogs, and that these sources of data still need more in-depth analysis. News articles and blogs are abundant sources of information [19] that can be analyzed to evaluate and strengthen online learning. Considering the rapid proliferation of online learning and the large number of items that regularly appear in digital newspapers and blogs, we used sentiment analysis (SA), a widely used technique for studying people's opinions [20], to understand the discourse around online learning. Recognizing the practical urgency and the need to be critical practitioners at the time of the pandemic [21], this study attempted to analyze and understand people's opinions about the use and effectiveness of online learning, as reported in digital media from 11 March 2020 to 26 April 2020. The dataset used in this study is therefore limited, but it presents a unique approach to studying people's opinions about online learning during the COVID-19 pandemic. The following research questions guided this study:
RQ1: Among the extracted articles, were the articles positive, negative, or neutral?
RQ2: Were the articles fact-based or opinion-based?
RQ3: Is there a significant difference in people's sentiments between the news articles and blogs?

Web Scraping
Web scraping is a technique for extracting large amounts of data from websites for use in various scenarios [22]. It is mainly used to extract unstructured data from websites and transform it into structured data. The extracted data are saved to a local file on the computer or to a database, in a spreadsheet-like format, for pre-processing and analytics. The commonly used web scraping techniques are web-scraping software, DOM parsing, computer vision webpage analyzers, HTTP programming, and HTML parsing [23]. In Python, web scraping can be performed with libraries such as Beautifulsoup (https://pypi.org/project/beautifulsoup4/) (accessed on 12 May 2020) and Requests (https://requests.readthedocs.io/en/master/) (accessed on 12 May 2020). This method is cost-effective and efficient: it can save thousands of hours of manual copying and pasting, gives access to clean and well-processed data without much effort, and provides real-time data with just a handful of lines of code [24].
Many researchers have employed web scraping methods in different domains. For example, Haddaway [25] used web-scraping software to search for and build a database of grey literature, such as journals, reports, working papers, government documents, white papers, and evaluations, which is otherwise difficult to extract. Herrmann and Hoyden [26] applied web scraping to extract research papers and articles related to marketing research and marketing science. In another study, Chen et al. [27] employed web scraping to extract users' posts from social media such as Facebook, Weibo, Twitter, and Baidu. They scraped 18,809,276 original posts from 5,613,807 users, which would have been difficult using traditional methods.
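The transformation from unstructured HTML into structured records described in this section can be illustrated with a minimal sketch. It is not the scraper used in this study: to stay runnable without extra packages it uses Python's standard-library `html.parser` as a stand-in for Beautifulsoup, and the HTML snippet, the class name `LinkExtractor`, and the `example.org` URLs are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical search-result markup; real pages differ and must be inspected first.
SAMPLE_HTML = """
<html><body>
  <div class="result"><a href="https://example.org/news/1">Schools move classes online</a></div>
  <div class="result"><a href="https://example.org/blog/2">My first week of remote teaching</a></div>
  <p>Unrelated footer text</p>
</body></html>
"""


class LinkExtractor(HTMLParser):
    """Collects one {'title', 'url'} record per <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._href = None   # href of the <a> element currently open, if any
        self._text = []     # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.rows.append({"title": title, "url": self._href})
            self._href = None


def extract_links(html):
    """Return the structured records found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.rows


rows = extract_links(SAMPLE_HTML)
for row in rows:
    print(row["title"], "|", row["url"])
```

In a real pipeline the HTML string would come from an HTTP request, and the resulting records could be written to a spreadsheet or database for pre-processing.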

Sentiment Analysis
"Sentiment Analysis (SA) or Opinion Mining (OM) is the computational study of people's opinions, attitudes, and emotions toward an entity" [18] (p. 1093). SA can be performed at three levels: (1) document-level, classify an opinion document as expressing a positive or negative opinion or sentiment, (2) sentence-level, classify sentiment expressed in each sentence, and (3) aspect-level, classify the sentiment with respect to the specific aspects of entities [28]. SA can be further categorized based on techniques used: the lexicon-based approach uses sentiment lexicon, which is a collection of known and precompiled sentiment terms; the machine-learning approach uses linguistic features and applies machine learning algorithms [18]. SA has been employed in different domains like product reviews [20], movie reviews [29], hotel reviews [30], news [31,32], consumers review [28], political debates [33], and social media posts (e.g., Facebook, Twitter, Weibo) [34,35]. One of the earliest works on sentimental analysis was conducted by [36], where the researchers performed sentiment analysis of movie reviews. The results showed that machine learning techniques perform better than the manual method. Dave et al. [37] presented an approach to mine the opinion hidden in a given piece of text. Here, the opinions of products were mined from the Web and analyzed using NLP techniques. Opinions were then divided into positive and negative sentiments by the algorithm, while feature opinions and context were taken into consideration. Jiang et al. [38] used SA to assess public opinion about a large infrastructure project, Three Gorges Dam. SA transformed textual data collected from people's posts on social media (Weibo) into emotional dimensions. In a study by Duan et al. [30], SA was employed to analyze user reviews to measure hotel service quality. Chang et al. 
[39] deployed SA to examine the influence of user comments, the number of views, and the number of likes on online video popularity. In another study, Lee et al. [28] applied a lexicon-based approach to explore customers' experience and satisfaction.
Recently, SA has received much attention from education researchers. For example, Tseng et al. [40] used the text sentiment analysis kit SnowNLP to evaluate 20,000 textual opinions obtained from teaching evaluation questionnaires for selecting outstanding teaching faculty. The results revealed that the text sentiment analysis system designed by the researchers could predict 97% positive sentiment and 87% negative sentiment. Hew et al. [41] adopted the SA approach to examine course features (e.g., course structure, course content, course instructor) of 249 MOOCs. They found that course instructor, content, assessment, and schedule significantly predicted student satisfaction, whereas course major, duration, perceived workload, and perceived difficulty played no significant role. In a recent study [42], aspect-based analysis was employed to analyze 105K student reviews extracted from Coursera. The results revealed that the SA approach provided more accurate results than the expensive manual approach. These studies suggest that SA is an efficient technique for assessing people's views and opinions about a topic.

Methodology
We conducted SA in a series of steps. The first step involved the collection of data from different search engines using web scraping. Next, the collected data were refined for applying the models. Finally, the refined data were analyzed through the SA model (see Figure 1).

Dataset and Pre-Processing
In this study, we selected two widely used search engines, Google (https://www.google.com/) (accessed on 11 March 2020) and DuckDuckGo (https://duckduckgo.com/) (accessed on 11 March 2020). For Google, we extracted the required online news articles and blogs using the Advanced Search option offered by the Google search engine, after clearing the cache and removing location settings to limit the search algorithm's impact on the results. We extracted the required dataset for 45 days starting from 11 March 2020, when the World Health Organization (WHO) declared Coronavirus disease a pandemic. We searched Google using the syntax combination: "COVID-19" OR "Coronavirus" OR "pandemic" "("online learning"|"remote learning"|"distance learning"|"blended learning"|"e-learning"|"Technology-enabled learning")" when: 45 days (see Figure 2).

After the search results appeared, the HTML attributes were studied and the web scraping library Beautifulsoup was applied accordingly. We followed a similar approach to search with DuckDuckGo. We selected only the 154 online news articles or blogs that were present in the results of both search engines. The articles were compiled in a dataset. The dataset was in the form of an Excel sheet whose columns were: title of the article, content of the article, URL of the article, and whether the article was online news or a blog. The dataset was then imported as a "Python Data frame."
Figure 1. Steps of the study: web scraping of ideal results; data pre-processing and noise removal; application of the sentiment analysis model.
The next step involved pre-processing to remove unwanted noise from the textual data using the following steps:

Sentiment Analysis
We conducted SA using the dictionary-based approach of the lexicon-based method. To analyze people's opinions, we studied the polarity and subjectivity of the articles. Polarity determines whether a text expresses a positive, negative, or neutral opinion. It ranges from −1 to +1, where −1 is extremely negative sentiment and +1 is extremely positive sentiment. Subjectivity, on the other hand, classifies a text as fact or opinion. It ranges from 0 to +1, where 0 indicates very objective and +1 indicates very subjective (Yaqub et al., 2018). We employed TextBlob (http://textblob.readthedocs.org/en/dev/) (accessed on 12 May 2020), a framework developed by Loria [43] that has been widely used by researchers [44,45] to conduct SA. TextBlob is a Python (2 and 3) library for processing textual data. Micu et al. [46] employed TextBlob to analyze customers' liking, rating, and reviewing of restaurants, and found that TextBlob is an effective sentiment analysis tool. Similarly, Hasan et al. [47] reported that TextBlob is one of the best tools for calculating polarity and subjectivity scores for textual data. In a recent study [48], TextBlob was used to conduct sentiment analysis of the feelings of patients suffering from a rare disease.
In the lexicon-based approach, a sentiment lexicon is constructed from appropriate sentiment words, degree adverbs, and negation words, together with their sentiment intensity and sentiment polarity [49]. A sentiment lexicon can be used to differentiate between objective facts and subjective opinions in a text. Each lexicon entry is given a polarity score and a subjectivity score based on the context, the form of the word, and its position in the sentence (e.g., adverb, adjective). For example, the word "great" in a text is encoded as shown in Table 1. The output for "great" therefore depends on the context in which the word is used. However, if the single word "great" is passed with no context, the model returns the average over all lexicon entries for "great". For negation, the model multiplies the polarity by −0.5 and leaves the subjectivity and the intensity unchanged. If a modifier word such as "very" is added to the word "great", the model adjusts the scores according to the following rules:
• New Polarity = (Initial Polarity of the word 'great') * (intensity of the word 'very')
• New Subjectivity = (Initial Subjectivity of the word 'great') * (intensity of the word 'very')
Following these rules, TextBlob scores the polarity and subjectivity of every identifiable lexicon entry in the provided phrase and ignores words that are not in the lexicon (e.g., proper nouns). It then averages the scores over all lexicon entries found in the provided article. We applied the TextBlob library to our dataset to calculate the polarity and subjectivity of each article.
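The scoring rules described above (intensity multiplication, multiplication by −0.5 for negation, and averaging over recognized words) can be made concrete with a toy scorer. This is a simplified sketch and not TextBlob's implementation: the lexicon entries, the intensity value for "very", the function names, and the choice to clamp scores to the stated ranges are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical toy lexicon: word -> (polarity, subjectivity). Illustrative values only.
LEXICON = {"great": (0.8, 0.75), "poor": (-0.4, 0.6), "useful": (0.3, 0.0)}
# Hypothetical intensity multipliers for modifier words.
INTENSIFIERS = {"very": 1.3}
NEGATIONS = {"not"}
PUNCTUATION = ".,;:!?\"'"


def clamp(value, low, high):
    """Keep a score inside its allowed range (an assumption of this sketch)."""
    return max(low, min(high, value))


def score_text(text):
    """Return (polarity, subjectivity) averaged over the lexicon words in text."""
    scored = []
    intensity, negated = 1.0, False
    for raw in text.lower().split():
        word = raw.strip(PUNCTUATION)
        if word in NEGATIONS:
            negated = True
        elif word in INTENSIFIERS:
            intensity *= INTENSIFIERS[word]
        else:
            if word in LEXICON:
                polarity, subjectivity = LEXICON[word]
                # Modifier rule: both scores are multiplied by the intensity.
                polarity *= intensity
                subjectivity *= intensity
                # Negation rule: polarity is multiplied by -0.5, subjectivity is untouched.
                if negated:
                    polarity *= -0.5
                scored.append((clamp(polarity, -1.0, 1.0), clamp(subjectivity, 0.0, 1.0)))
            # Modifiers only apply to the word that immediately follows them.
            intensity, negated = 1.0, False
    if not scored:
        return 0.0, 0.0  # no lexicon words found (e.g., only proper nouns)
    return mean(p for p, _ in scored), mean(s for _, s in scored)


for phrase in ["great", "very great", "not great", "A great course but poor access"]:
    print(phrase, "->", score_text(phrase))
```

Words outside the toy lexicon contribute nothing to the average, which mirrors the statement that unrecognized words are ignored.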

Research Question 1
Figure 3 shows that the polarity of the articles ranges from around −0.05 to 0.3 (see bar graph above the polarity axis). This implies that the articles written during this period are neutral to somewhat positive. The results showed that around 92.2% of the articles (142 out of 154) are positive, i.e., their polarity scores are greater than 0. A possible reason is that educators and policymakers worldwide are advocating the potential advantages of online learning in this pandemic situation [50,51], which motivates instructors and students to adopt online learning in their formal education. A few articles have polarity scores below 0, but even these are very close to 0. A possible reason could be the digital divide: people in rural places have less access to the Internet and other digital resources required for online learning [52], which might have created anxiety among them. Another reason could be that online learning is new to a large section of the population, and people accustomed to the traditional teaching and learning process are adjusting slowly to the online environment. Overall, people's opinion about online learning is positive, but only mildly so, which suggests doubts about the usefulness and effectiveness of online learning. Therefore, notwithstanding the digital divide, there is much scope for highlighting the benefits of online learning in the media in order to mainstream it.
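The classification rule used for this research question (an article is positive when its polarity score is greater than 0) can be sketched as follows. The polarity scores in the list are hypothetical; only the counts 142 and 154 come from the text.

```python
def polarity_label(score):
    """Label an article by the sign of its polarity score."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def share(labels, target):
    """Percentage of labels equal to target."""
    return 100.0 * sum(1 for label in labels if label == target) / len(labels)


# Hypothetical polarity scores for five articles.
scores = [0.12, 0.30, -0.02, 0.0, 0.08]
labels = [polarity_label(s) for s in scores]
positive_share = share(labels, "positive")

# Arithmetic check of the reported share: 142 positive articles out of 154.
reported_share = round(100.0 * 142 / 154, 1)
print(labels, positive_share, reported_share)
```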

Research Question 2
The subjectivity of the articles ranges from 0.3 to 0.6. From this result, it can be inferred that most of the articles lean towards the factual side, since TextBlob assigns a score of 0 to factual text and +1 to opinionated text. We found that around 85.71% of the articles (132 out of 154) were more factual in nature.
Our dataset contains 82 blogs and 72 newspaper articles. Blogs are generally more opinionated than news articles. Because blogs make up the larger proportion of the dataset, the subjectivity of the dataset could have been shifted away from the strictly factual. Therefore, we calculated the mean value of the subjectivity scores (see Table 2). The results showed that the mean subjectivity score of the articles is around 0.43, which is towards the factual side.
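A minimal sketch of this step is given below: it computes the mean subjectivity overall and per source type, and the share of articles on the factual side. The rows are hypothetical, and the cutoff of 0.5 for "more factual" is an assumption, since the text does not state the threshold used.

```python
from statistics import mean

# Hypothetical rows: (source type, subjectivity score).
rows = [("news", 0.38), ("news", 0.45), ("blog", 0.41), ("blog", 0.52), ("blog", 0.47)]

# Assumed cutoff: scores below 0.5 are treated as "more factual".
FACTUAL_CUTOFF = 0.5


def mean_by_type(records):
    """Mean subjectivity for each source type."""
    groups = {}
    for kind, score in records:
        groups.setdefault(kind, []).append(score)
    return {kind: mean(values) for kind, values in groups.items()}


def factual_share(records, cutoff=FACTUAL_CUTOFF):
    """Percentage of records whose subjectivity is below the cutoff."""
    return 100.0 * sum(1 for _, score in records if score < cutoff) / len(records)


overall_mean = mean(score for _, score in rows)
group_means = mean_by_type(rows)
share = factual_share(rows)

# Arithmetic check of the reported share: 132 factual articles out of 154.
reported_share = round(100.0 * 132 / 154, 2)
print(overall_mean, group_means, share, reported_share)
```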

Research Question 3
The sentiment analysis was conducted separately for the news articles and the blogs. An independent-samples t-test was employed to compare the subjectivity and polarity scores between news articles and blogs. There was a significant difference between the mean subjectivity scores of the news articles (M = 0.42, SD = 0.06) and the blogs (M = 0.44, SD = 0.05); [t (152) = 2.39, p < 0.05] (see Table 3). The calculated effect size (Cohen's d) is 0.42, which lies between the conventional benchmarks for a small (0.2) and a medium (0.5) effect (Cohen, 1988). This result indicates that blogs are more opinionated than news articles. These results are consistent with the findings of Ku et al. (2006), who found that the nature of news articles differs from that of blogs. A possible reason is that the writing style differs between news articles and blogs. Generally, news articles are more fact-based, formal, and lengthier, whereas in blogs people express their personal opinions in an informal and shorter way. We also observed that most of the news articles provided statistics to support their arguments. For example: News article (https://www.nasdaq.com/articles/e-learning-a-rising-trend-amidcoronavirus-crisis%3A-3-stocks-2020-04-09) (accessed on 11 March 2020).
The coronavirus pandemic continues to cast a pall over the global stock market. Per a report by Fitch, the world economy is expected to decline 1.9% in 2020, with the U.S., eurozone and U.K. GDP declining 3.3%
There was a significant difference between the mean polarity scores of the news articles (M = 0.11, SD = 0.06) and the blogs (M = 0.13, SD = 0.05); [t (152) = 2.61, p < 0.05] (see Table 4). The calculated effect size (Cohen's d) is 0.42, which lies between the conventional benchmarks for a small (0.2) and a medium (0.5) effect [53]. Figure 4 shows the polarity pattern for the news articles and blogs. The histogram for blogs is shifted to the right relative to that for the news articles. This result indicates that the opinions expressed in blogs are more positive than those in news articles. In general, we can conclude that people have positive sentiments towards online learning.
Most blogs have negative opinions about the commercialization of online education by the EdTech industry and about telecommunication companies overcharging for data. This is consistent with Williamson and Hogan [10], who highlighted concerns that the commercialization of education will further increase the digital divide. In addition, people expressed concerns about access to digital devices, Internet connectivity, access to learning resources with different types of digital devices, and poor user experience with technology. Some blogs also expressed concern about the adverse effects of online learning on kindergarten students. However, most blogs appreciate that online learning ensures the continuity of education in this pandemic situation by providing flexible access to content. Some blogs mentioned that collaboration among teachers would increase as learning materials are curated and widely used. This pandemic situation has provided an opportunity for educators and policy makers across the world to work together to support teachers by developing innovative teaching and learning methodologies that make learning more interactive and effective.
On the other hand, the news articles highlighted critical issues and challenges such as students' privacy, poor and unstable Wi-Fi connections at home, cybersecurity, and time zone differences. Some news articles mentioned that teachers are anxious and stressed about online learning because they need to leave the comfort zone of the traditional teaching and learning process, and that instructors lack proper training in using technologies for online learning. News articles also expressed concern over the lack of digital resources and Internet issues. For example, one news article mentioned that around 35% of the students in South Africa are not able to participate in digital learning because of a lack of resources, which is widening the equity gap. News articles mentioned that some companies, such as Microsoft and Google, have come forward to help instructors and students by providing free resources, courses, and training to facilitate remote learning. Some universities have adopted online learning not only for continuing formal teaching but also for other academic activities such as virtual international conferences, internships, teacher training, job placement interviews, and hackathons.
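The independent-samples comparisons reported earlier in this section can be approximated from the published summary statistics alone. The sketch below assumes a pooled-variance (Student) t-test, which is consistent with the reported degrees of freedom (152 = 72 + 82 − 2), and the pooled-standard-deviation form of Cohen's d; the function names are illustrative. Because the published means and standard deviations are rounded to two decimals, the values it produces will not exactly match the reported t and d.

```python
from math import sqrt


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (group 2 minus group 1)."""
    return (m2 - m1) / pooled_sd(sd1, n1, sd2, n2)


def student_t(m1, sd1, n1, m2, sd2, n2):
    """Pooled-variance t statistic and its degrees of freedom."""
    standard_error = pooled_sd(sd1, n1, sd2, n2) * sqrt(1 / n1 + 1 / n2)
    return (m2 - m1) / standard_error, n1 + n2 - 2


# Subjectivity summary statistics as reported: news (n = 72) vs. blogs (n = 82).
news = (0.42, 0.06, 72)
blogs = (0.44, 0.05, 82)

t_value, df = student_t(*news, *blogs)
d_value = cohens_d(*news, *blogs)
print(f"t({df}) = {t_value:.2f}, d = {d_value:.2f}")
```

With the raw per-article scores available, the same functions could be fed exact means and standard deviations instead of rounded ones.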
While the COVID-19 pandemic has created a new opportunity for mainstreaming online learning, public opinion will shape its actual adoption in practice by governments and educational institutions. Previous experience with climate change has shown the influence of the media on practices, politics, and public opinion and understanding [54]. Another study indicated that the more negative and the more local the media coverage, the greater its disciplining effect on corporate pollution [55]. Recently, a group of scholars from South Africa highlighted the issues of equity and inequality in the "pivot" to remote teaching and learning [56]. Online media (both newspapers and blogs) will play a critical role not only in identifying the challenges but also in influencing positive changes to assist the adoption of online learning worldwide, particularly in low- and middle-income countries.

Conclusions
In this study, we performed sentiment analysis to investigate public opinion about online learning. We analyzed online news and blogs from the early days of the pandemic. Applying SA together with web scraping is a new attempt to obtain insights into public opinion towards online learning. The study revealed positive but cautious perceptions of online learning in public digital media, with low polarity values. While this is a good starting point for an area that is changing fast, it also calls for more widespread sharing of research about online learning in public media, in order to build strong public opinion and drive public policy in many countries towards adopting online and blended learning as a means of building a resilient system. The overall low subjectivity scores suggest that an evidence-based public policy discourse is possible through digital media, even though the blogs in this study had higher subjectivity than the news items.

Limitations and Recommendations
This study is limited by its small dataset and its reliance on early reports. As the pandemic continues and people become more experienced in using online learning in various innovative and unimagined ways, more critical and negative news may appear, as already highlighted by Czerniewicz et al. [56]. Such deliberations would help to improve the systems deployed and to mainstream online learning. Therefore, in future studies, we plan to validate our approach on larger datasets. In addition, we will apply content analysis to gain further insights into the contents of news and blogs.
This study can be replicated regularly for understanding the overall sentiment about any topic, especially for public policy related to education.