The Conversation around COVID-19 on Twitter—Sentiment Analysis and Topic Modelling to Analyse Tweets Published in English during the First Wave of the Pandemic

The COVID-19 pandemic disrupted societies all over the world. In an interconnected and digital global society, social media was the platform not only to convey information and recommendations but also to discuss the pandemic and its consequences. Focusing on the phase of stabilization during the first wave of the pandemic in Western countries, this work analyses the conversation around it through tweets in English. For that purpose, the authors have studied who the most active and influential accounts were, identified the most frequent words in the sample, conducted topic modelling, and researched the predominant sentiments. It was observed that the conversation followed two main lines: a more political and controversial one, which can be exemplified by the relevant presence of former US President Donald Trump, and a more informational one, mostly concerning recommendations to fight the virus, represented by the World Health Organization. In general, sentiments were predominantly neutral due to the abundance of information.


Introduction
The SARS-CoV-2 pandemic has been the biggest properly global event since the popularization of social media. No other episode in this century has had the social and economic impact, duration, or media coverage of this pandemic, declared as such on 11 March 2020 by the World Health Organization (WHO 2020b). In a globally connected world, in which news travels fast and in which citizens can express their fears and opinions in an almost immediate and unfiltered way on their social media, these platforms have become a very powerful tool for understanding how societies perceived and lived not only through the pandemic but also the measures that were implemented to control it, including strict lockdowns and prohibitions of non-essential activities.
Social media platforms such as Twitter play a key role in communication scenarios, not only as new scenarios for sharing feelings but also as relevant sources of information, both for citizens and for journalists (Hermida 2010;López-Meri 2015). That is why text mining Twitter's unstructured data is becoming a very powerful tool for multiple research fields, including epidemiology. In fact, previous studies have observed the potential of Twitter conversations for the forecasting of flu incidence (Paul et al. 2014) or for dengue surveillance (Gomide et al. 2011). In the field of communication, the most frequent application of text mining on Twitter in relation to epidemics has been the study of public awareness, sentiments, and the conversation around them. This was visible during the 2014 Ebola (Lazard et al. 2015;Guidry et al. 2017) and 2015 Zika outbreaks (Pruss et al. 2019;Miller et al. 2017;Fu et al. 2016).
Given the deadlier and broader condition of the COVID-19 pandemic and the more generalized use of social media as a public agora, many studies of Twitter conversations around the new coronavirus have been published. Some have focused on the outbreak and the initial stages (Prabhakar Kaila and Prasad 2020; Jahanbin and Rahmanian 2020), Journal. Media 2023, 4 468 as well as on the infodemic 1 and the spread of misinformation surrounding the pandemic (Bridgman et al. 2020;Singh et al. 2020). Other common topics have been the use of Twitter during the pandemic by mass media (Yu et al. 2020) or as a tool for political communication (Yunez 2020). Later studies have also studied vaccination attitudes (Yurtsever et al. 2023).
Most of these works have taken national approaches, which our study seeks to overcome by collecting content in English, the most spoken language on the platform (Vicinitas 2018), independently from countries. Following the line of previous works, such as the ones by Mutanga and Abayomi (2022), Wicke and Bolognesi (2020), and Xue et al. (2020), our study offers an exploratory analysis with the capacity to detect transnational topics and characters.
Thus, together with this global dimension, the novelty of the present work is its focus on the months in which the first international wave, which affected most Western countries in March and April 2020, was starting to stabilize and the cases began to decline. So far, studies have focused on the Chinese outbreak (December 2019 to January 2020) or on the spread of the disease into most countries (February and March 2020), whereas others have tried to cover longer periods of time, including the year 2021. In contrast, our research focuses on a specific sample of 70 days between April and June 2020. This is relevant because it is the moment in which public conversation was broadening, introducing topics such as politics or economics besides health and the pandemic, which practically monopolized the conversation during the first weeks after the declaration of the pandemic (Yu et al. 2020; Yunez 2020). Thus, our research provides a broader understanding of the discussion by focussing on a decisive yet under-researched period.

Research Questions
All that being said, the main goal of this article is to understand the discourse around the COVID-19 pandemic published in English on Twitter once the first wave started to stabilize in Western countries. Following the agenda-setting theory (McCombs and Shaw 1972), which has been proven to also apply to social media such as Twitter (Lee and Xu 2018), it is relevant to identify which individuals or institutions are leading a conversation, as they will be in a stronger position to determine the topics in the agenda that will be discussed. That is why the first step will be to study who the most active public actors were, as well as the most influential, based on the public metrics of retweets and mentions.

1. Which public accounts were most active and influential on Twitter's discourse in English around COVID-19 during the stabilization and decline of the first wave of the pandemic in Western countries?
In connection with this agenda approach, one of the most common attempts among studies analysing Twitter conversations around the COVID-19 pandemic has been the detection of topics. This is of great relevance because the topics addressed by citizens and their conversations can influence their posterior behaviours and their attitudes towards health issues, such as vaccines, or the level of trust in the measures implemented by governments to control the spread of the virus (Lim et al. 2020;Gozgor 2022).
The use of topic modelling has been common for studying the conversation around the pandemic on social media. Prabhakar Kaila and Prasad (2020) focused on the first stages of the pandemic and observed that Twitter is considered "one of the most preferred media for information spread during pandemics" (p. 133) and that misinformation did not play a great role at that moment. Yu, Lu, and Muñoz-Justicia (Yu et al. 2020) studied the frames used by Spanish news media on Twitter before, during, and after the lockdown. Mutanga and Abayomi (2022) analysed a variety of topics and the surge in fake news and conspiracy theories around the pandemic and the lockdown in South Africa. Alamoodi et al. (2022), Gourisaria et al. (2022), and Mathayomchan et al. (2022) offer very complete approaches, combining topic modelling with sentiment analysis in Malaysia, India, and Southeast Asian countries, respectively. These studies offer a basis for our work, but their national or regional dimension is replaced by a completely international approach.
Before addressing the latent topics, it is also interesting, as a preliminary observation, to identify the most frequent terms, as they can offer complementary hints for the topic modelling interpretation. Thus, we wonder about the following:

2. What were the most frequent words used in tweets around COVID-19 published in English during the stabilization and decline of the first wave of the pandemic in Western countries?
3. What topics were used in Twitter's discourse in English around the COVID-19 pandemic during the stabilization and decline of the first wave of the pandemic in Western countries?
Finally, in addition to agenda theory, framing theory (Goffman 1974) should also be considered, as it is relevant to determine the attributes used to discuss and frame a topic. This theory justifies the study of sentiments, as they provide the most relevant frame to analyse the pandemic. Previous works that have focused on the predominant sentiments on Twitter about the pandemic are the aforementioned ones by Alamoodi et al. (2022), Gourisaria et al. (2022), and Mathayomchan et al. (2022), as well as one by Lwin et al. (2020), which paid attention to the first stages of the pandemic. With the goal of studying the period in which the conversation around the pandemic was going beyond purely health issues, as well as to offer a more international understanding of the phenomenon, the following research question is posed:

4. Which is the predominant sentiment in tweets around COVID-19 published in English during the stabilization and decline of the first wave of the pandemic in Western countries?

Methods
A completely computational strategy was carried out, both for the search, download, and collection of the sample, as well as for its analysis. This methodological design allowed for the exploration and identification of the sentiments and topics underlying all international English-language tweets collected in the defined period. Each of the techniques developed in this work is detailed below.

Data Collection
The selected sample covers tweets in English including the terms "coronavirus", "covid-19", "covid_19", "covid2019", or "covid19" posted on Twitter from 13 April to 22 June. The selection of these dates seeks to cover the stabilization phase of the first wave of the pandemic in Western countries. More specifically, on 13 April, the UK reached its peak of daily new confirmed COVID-19 deaths (7-day rolling average) for the first wave according to figures from Johns Hopkins University data 2 , and it was on 22 June when the Health and Social Care Secretary of the United Kingdom, Matt Hancock, announced that, the next day, the Prime Minister would set out the next steps to ease the national lockdown 3 . Although the selection of these specific dates is based on the UK's context, other European countries, as well as the United States, were following a similar trend: the peak of daily deaths for the first wave in Europe took place on 10 April, while the one in the USA took place on 24 April, and in both contexts, the curve was reaching its lowest levels at the end of June.
To download the sample, the Python programming language was used to access Twitter data through its REST API, since, at the time, it was not possible to access the academic API 2. This means that it was not possible to access the total number of tweets posted on those dates, so we sampled the tweet history, downloading between 5000 and 10,000 tweets per day within 10 days after their publication. For this, a strategy already designed by the authors in previous works was executed (e.g., Arcila-Calderón et al. 2017;Arcila-Calderón et al. 2022). Specifically, a language filter was used to ensure that all the collected tweets were written in English. All retweets and replies were also filtered out, thus collecting only original tweets. In total, 436,296 tweets in English published during the indicated period were downloaded, along with their aggregated data. This initial sample was used to extract exploratory data about the users and their public metrics, such as the total number of posts, retweets, and mentions in order to answer RQ1. After this, the downloaded dataset was cleaned, rejecting all duplicate or repeated messages, as well as those that did not contain textual information-tweets with only emojis, links, or empty content. The final sample used for the topic modelling included a total of 180,509 tweets (after removing 255,787).
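The cleaning step described above can be sketched roughly as follows. This is a minimal illustration rather than the authors' script: the function name `clean_tweets`, the regular expressions, and the rule that a tweet needs at least one letter or digit outside its links to count as textual are all assumptions made for the example.

```python
import re

URL_RE = re.compile(r"https?://\S+")
# Assumption: a tweet "contains textual information" if at least one
# letter or digit remains once its links are removed.
TEXT_RE = re.compile(r"[A-Za-z0-9]")


def clean_tweets(texts):
    """Drop repeated messages and tweets made only of links, emojis, or whitespace."""
    seen = set()
    kept = []
    for text in texts:
        if not TEXT_RE.search(URL_RE.sub(" ", text)):
            continue  # only links, emojis, or empty content
        # Assumption: duplicates are detected on a lowercased,
        # whitespace-collapsed copy of the full text.
        key = " ".join(text.lower().split())
        if key in seen:
            continue
        seen.add(key)
        kept.append(text)
    return kept
```

Applied to the figures reported above, a filter of this kind would be the step that reduces the 436,296 downloaded tweets to the 180,509 used for modelling; the exact criteria behind that reduction are not specified in the text.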

Word Frequency Distribution
After having collected and cleaned the tweet dataset, the first step was to apply basic techniques of natural language processing (NLP) to obtain the frequency distribution of words. NLP is a branch of computer science that is combined with applied linguistics and seeks to convert a text into a set of structured data that describe its meaning and the topics it transmits (Collobert et al. 2011). For this study, different Python libraries were used, such as Numpy, which adds greater support for vectors and matrices, as well as the Natural Language Toolkit (NLTK), which defines an infrastructure that allows for the development of NLP scripts.
Word frequency was used as a preliminary step for the identification of underlying topics after a filtering process. Knowing the most frequent words in the sample offers valuable exploratory information, useful for a better interpretation of the results of the topic modelling. The first step to correctly perform NLP techniques was the identification of tokens, the basic units (typically words or short sentences) into which a text can be deconstructed for further analysis. For this process of tokenization, the aforementioned NLTK library was used. The next step was the removal of stop-words, which are those very frequent and common words that do not provide relevant information, such as articles or prepositions. In this phase, after running several tests, we removed the words and combinations of words that made up the search terms, as well as some directly related to COVID-19, since, as expected, these were too frequent and could bias the results. Specifically, the removed terms were coronavirus, covid, covid 19, covid19, covid_19, covid-19, and covid'19. Punctuation marks and weblinks were also removed to avoid the repetition of terms and obtain homogeneous and coherent findings. Finally, we were able to look at the most repeated terms and their distribution and decide how many topics it was convenient to obtain. Once these adjustments were made and the stop-words removed, word clouds were generated with the most frequent words in order to better visualize the most and least frequent terms in the analysed messages.
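The pipeline just described (tokenization, removal of links, stop-words, and search terms, then counting) can be illustrated with a short sketch. Note the assumptions: a plain regular expression stands in for the NLTK tokenizer used in the study, `STOP_WORDS` is a tiny illustrative list rather than the one actually applied, and the names `word_frequencies`, `SEARCH_RE`, and `TOKEN_RE` are hypothetical.

```python
import re
from collections import Counter

# Assumption: a very small illustrative stop-word list.
STOP_WORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "for", "on"}

URL_RE = re.compile(r"https?://\S+")
# Matches the removed search terms listed in the text: coronavirus, covid,
# covid 19, covid19, covid_19, covid-19, covid'19.
SEARCH_RE = re.compile(r"\b(?:coronavirus|covid(?:[ _'\-]?19)?)\b")
# Stand-in tokenizer (not NLTK): runs of letters, digits, and inner marks.
TOKEN_RE = re.compile(r"[a-z0-9_'-]+")


def word_frequencies(texts):
    """Count tokens after dropping links, search terms, and stop-words."""
    counts = Counter()
    for text in texts:
        text = URL_RE.sub(" ", text.lower())
        text = SEARCH_RE.sub(" ", text)
        for token in TOKEN_RE.findall(text):
            token = token.strip("'-_")
            if token and token not in STOP_WORDS:
                counts[token] += 1
    return counts
```

Calling `word_frequencies(texts).most_common(50)` would give a ranked list comparable in shape to the 50 most frequent terms discussed in the results, and the same counts could feed a word cloud.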

Topic Modelling
Given the promising results and the growing trend of topic modelling use among text mining techniques (Bogović et al. 2021;Wright et al. 2022), particularly for Twitter (Karami et al. 2020), this method was employed for the detection of topics around which public discourse on Twitter related to the COVID-19 pandemic was built. In this case, the authors also followed a strategy developed in previous works to identify the main underlying topics in a dataset (e.g., Latorre and Amores 2021). Specifically, the Latent Dirichlet Allocation (LDA) algorithm was used, the most common for the identification of topics in a set of documents (Ramage et al. 2009;Grimmer and Stewart 2017). With this technique, topics are detected by automatically identifying patterns in the presence of groups of concurrent words in the documents (Jacobi et al. 2016). In this case, in addition to NLTK, the following Python libraries were used: Pandas, used for data analysis; Gensim, used for the topic modelling; and pyLDAvis, used for displaying inferred topics on maps. After importing all the requested libraries and modules and selecting the sample to model, the next step, once again, was to convert all text to lowercase and remove the punctuation marks, double spaces, and stop-words-a total of 864-to achieve a higher level of coherence in the identified topics. After this cleaning process, internal coherence values were extracted, which allowed us to decide the total number of topics that should be inferred. Similarly, the pyLDAvis library allowed us to print interactive display maps to visually explore the results of the modelling, which also helped to more reliably select the number of latent topics to detect. With all this, it was decided to model a total of 6 topics, as it was the most coherent number according to the visualizations and the number that presented the highest internal coherence (0.387). 
Finally, a manual validation was carried out exploring the tweets in which the different topics were most predominantly present.

Sentiment Analysis
The last stage was the identification of the latent sentiments in the sample. We used SentiStrength, an open-source tool developed by Thelwall et al. (2011) that allows for automatic sentiment analysis from lexicon dictionaries. Specifically, this validated software rates the relevance and presence of negative words (from −1 to −5) and positive words (from +1 to +5) for each text. The sum of these two values indicates the general emotions of the tweet in terms of language (language sentiment). To report global results regarding latent sentiments, the total mean of the coefficients obtained was extracted, as well as the percentages of all tweets with positive (from +5 to +1), negative (from −1 to −5), and neutral (0) sentiments, the last ones usually being purely informative texts.
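The aggregation described above can be sketched as follows, assuming each tweet has already been scored and is available as a (positive, negative) pair on the scales given in the text. The sketch does not call SentiStrength itself; the function name `summarise_sentiment` and the dictionary keys are hypothetical.

```python
def summarise_sentiment(scores):
    """Aggregate per-tweet (positive, negative) pairs.

    positive is expected in 1..5 and negative in -5..-1; their sum is the
    overall language sentiment of the tweet (0 means neutral).
    """
    scores = list(scores)
    if not scores:
        raise ValueError("at least one scored tweet is required")
    n = len(scores)
    totals = [pos + neg for pos, neg in scores]

    def pct(count):
        return 100 * count / n

    return {
        "positive_pct": pct(sum(1 for t in totals if t > 0)),
        "negative_pct": pct(sum(1 for t in totals if t < 0)),
        "neutral_pct": pct(sum(1 for t in totals if t == 0)),
        "mean_positive": sum(pos for pos, _ in scores) / n,
        "mean_negative": sum(neg for _, neg in scores) / n,
        "mean_overall": sum(totals) / n,
    }
```

Because each total is the sum of the two scores, the overall mean always equals the mean positive score plus the mean negative score.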

Most Active and Influential Twitter Users in the Conversation around COVID-19 during the Decline of the First Wave of the Pandemic
The public metrics data extracted from the original dataset with all 436,296 downloaded tweets answered RQ1. Specifically, the number of tweets posted during the period allowed us to identify the most active users talking about COVID-19 in English. The number of retweets and mentions was also observed to identify the most influential users. The number of followers and followed users by each of those accounts, the date of creation, and the declared country were used to determine their nature and whether they were public figures, common users, or potential bots and trolls.
It is worth mentioning that, among the most active users, one stands out with a total of 727 posted tweets, but that account is no longer active, so it might have been removed or blocked by Twitter. Among the 10 most active accounts, 3 had been deleted by the time of the analysis. It should also be noted that three accounts include the word "bot" in their usernames. One of those accounts is no longer active, and another has only four followed users and was created in January 2020, just at the beginning of the COVID-19 pandemic. In addition, among the most active users, none seems to stand out as a public figure, with the possible exception of HO_Wrestling, an alleged news account about wrestling, as well as two accounts from members of the MyNation Foundation, an alleged "non-profit association of Self Help Support for Dowry Law victims". The total number of tweets posted by these most active accounts, together with their public metrics, is shown in Table 1.
Among the most influential accounts in the dataset, profiles of international public actors can be recognized. The first and most influential account is that of Donald Trump, which was later blocked and deleted by Twitter. In total, the posts about COVID-19 published by the former US President accumulated 15,667 retweets and mentions during the analysed period, more than 2.5 times as many as those of the second most influential account. Other relevant political figures or organizations from the USA present among the 10 most influential accounts are the Speaker of the House of Representatives of the US, Nancy Pelosi, and Dr Dena Grayson, as well as the Lincoln Project, a political committee formed in 2019 by several prominent Republicans and former Republicans with the objective of preventing the re-election of Donald Trump (Young 2020).
The second account in terms of influence is that of the World Health Organization, with 5820 retweets and mentions, something that is unsurprising during a health crisis. Other well-known personalities among the 10 most influential accounts were the writer Stephen King and the filmmaker Ava Duvernay. Another relevant account in the list is that of the Nigeria Centre for Disease Control. The number of retweets and mentions that the tweets posted by these accounts had in the original dataset, together with the public metrics of these users, can be seen in Table 2.
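The two rankings used to answer RQ1 can be illustrated with a small sketch: activity as the number of tweets per account, and influence as the sum of retweets and mentions per account. The record layout (dictionaries with `author`, `retweets`, and `mentions` keys) and the function names are assumptions for the example, not the structure of the authors' dataset.

```python
from collections import Counter


def most_active(tweets, top=10):
    """Rank accounts by number of tweets posted in the sample."""
    return Counter(t["author"] for t in tweets).most_common(top)


def most_influential(tweets, top=10):
    """Rank accounts by accumulated retweets plus mentions."""
    totals = Counter()
    for t in tweets:
        totals[t["author"]] += t["retweets"] + t["mentions"]
    return totals.most_common(top)
```

Keeping the two rankings separate reflects the distinction drawn in this section: the most prolific accounts and the most amplified ones are not the same set.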
covid19, covid_19, or similar variations. Consequently, the different frequencies of appearances of the rest of the terms can be better identified. The following 50 were the most frequent terms; Figure 2 shows the frequency of appearance of the first 30 terms.
Secondly, a large number of the most frequent words relate to politics and state issues, such as trump, realdonaldtrump, government, state, china, uk, country, or right. Both the surname and the Twitter handle of the former US President, Donald Trump, appear on this list, which shows how relevant his presence was in the conversation around COVID-19 at the end of the first wave of the pandemic, at least in the English-speaking context. Among these most frequent words, we also find the names of two countries, China and the UK, which shows the prominence of both in these conversations about COVID-19 in English. This is partly to be expected: China was the country in which the virus originated, and, since tweets in English are being analysed, the relevance of the UK is understandable, especially as it was the first of the English-speaking countries to suffer the most serious consequences of the pandemic. Finally, we find a series of words with a more positive and encouraging tone, which seem to refer to the importance of the public and social union, as well as the need to fight and work together to overcome the pandemic. Some of these words are work, workers, positive, social, fight, public, good, and community. Figure 3 shows two word clouds with each of the referred samples, one in which the terms related to COVID-19 are maintained and the other without these terms.

Predominant Topics in the Conversation about COVID-19 during the Decline of the First Wave of the Pandemic
After obtaining the frequency distribution for the analysed tweets, topic modelling was conducted to automatically detect the main underlying topics in the conversation about COVID-19 that took place on Twitter during the analysed period, thus answering RQ3. The level of coherence was measured (the further from zero, the better) to determine an adequate number of topics, comparing several models with ten words for each topic, and we finally decided that the adequate number of topics was six. After removing the stop-words, the topics were detected and validated by exploring examples of tweets for each one:
Topic 1. Information about the pandemic (Figure 4). This topic focuses on cases, infections, incidence rates, deaths, lethality, and virulence. It may be health information offered by international politicians and institutions to control and combat the health and economic crisis or journalistic information on statistics and surveys; there are also advice and recommendations to fight the virus, and caution is requested. The most representative words are the following:
('0.028*"cases" + 0.021*"new" + 0.017*"positive" + 0.015*"pandemic" + 0.010*"deaths" + 0.010*"health" + 0.009*"today" + 0.008*"people" + 0.008*"virus" + 0.008*"mask" + 0.007*"masks" + 0.007*"campaign" + 0.006*"day" + 0.006*"time" + 0.006*"total" + 0.005*"testing" + 0.005*"staffers" + 0.005*"think" + 0.005*"help" + 0.004*"death"')
Here are some examples of tweets on this topic:
• "Released today: a free information book explaining the #coronavirus to children, illustrated by Gruffalo illustrator #AxelScheffler"
• "Today @UNDP has an even greater role to play in shaping responses to #COVID19, I told Administrator @AchimSteiner in our discussion this evening on how best Maldives & @UNDP can partner to control the virus. Also thanked him for his leadership in highlighting challenges #SIDS face https://t.co/qLJlQXamJo"
• "USAID donated two ambulances to the Rizgary Hospital today to support #Erbil Health Directorate's response to #COVID19. The U.S. continues to provide key resources to help save lives, build health institutions and reduce delays in communities receiving critical medical attention. https://t.co/pAPXCqRuhg"
Topic 2. Information on the health and political crisis, specifically in the US, the country that dominates the discourse, with the figure of Trump (RealDonaldTrump) as the main protagonist (Figure 5). These are not only messages launched by government and institutions about the pandemic but also responses to those messages and discussions with a more politics-related approach than a purely health-related one. This topic also includes the responses of citizens to the management of the pandemic; many of the messages are direct criticism of the Trump government and its handling of the pandemic. The main words are the following:
('0.043*"trump" + 0.027*"rally" + 0.026*"people" + 0.015*"realdonaldtrump" + 0.012*"going" + 0.009*"state" + 0.007*"home" + 0.007*"make" + 0.006*"work" + 0.006*"want" + 0.006*"social" + 0.006*"covidiots" + 0.006*"reported" + 0.005*"tulsatrumprally" + 0.005*"lives" + 0.005*"crowd" + 0.005*"stay" + 0.005*"states" + 0.005*"know" + 0.005*"president"')
Some examples of tweets on topic 2 are the following:
• "Trump just threw a mega tantrum, cutting all funding to the World Health Organisation -in the middle of the #Covid19 pandemic! Now this massive public call to save the WHO is going viral! https://t.co/BQCyOkQ74w"
• "Trump delayed action on #Covid_19 so his buddies could sell off certain stocks. See, some of us are mistaken about who he is there to represent. Spoiler alert! It is not the 99%"
• "@realDonaldTrump Bill Gates, what a benevolent and kind person. Thanks for not feeding the starving masses or storing some PPE for the world. Oh thanks for your $100m donation for vaccines you will profit from. I don't give a shit if they jail me but know this #youcanshoveyourvaccine #covid19"
Figure 5. Interactive map of topic 2.
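The weighted word lists reported for each topic appear to follow the string format that Gensim prints for LDA topics (a weight, an asterisk, and a quoted word, joined by plus signs). As a convenience for readers who want to tabulate or plot those weights, the following sketch parses such a string; the names `parse_topic` and `TERM_RE` are hypothetical, and the sketch only reads the printed format without running any model.

```python
import re

# One term looks like 0.028*"cases"; terms are joined by " + ".
TERM_RE = re.compile(r'(\d+\.\d+)\*"([^"]+)"')


def parse_topic(topic_string):
    """Turn a printed topic string into a list of (word, weight) pairs."""
    return [(word, float(weight)) for weight, word in TERM_RE.findall(topic_string)]
```

For instance, the first three terms of Topic 1 would be returned as `("cases", 0.028)`, `("new", 0.021)`, and `("positive", 0.017)`, in the order in which they appear in the string.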

Sentiments in Tweets about COVID-19 during the Decline of the First Wave of the Pandemic
Finally, using SentiStrength, we conducted, first, a sentiment analysis with the total sample and, second, a longitudinal analysis, dividing the original sample into the 10 weeks of data collection. Considering the 180,509 clean tweets, a total of 41,160 messages had positive feelings (22.80% of the total), compared with 61,204 tweets with negative feelings (33.90%) and 78,146 completely neutral ones (43.29%). This is possibly explained by a large amount of merely informational tweets about the health crisis and everything that surrounds it. The mean of positive sentiments in the entire sample was 1.511, while the mean of negative sentiments was −1.744, which provided an overall mean result of −0.233, that is, a slightly negative trend, although close to neutrality.
Longitudinally, no large changes were observed; the mean sentiment was always negative and ranged between −0.160 at the most positive moment, in the sixth analysed week (18 to 24 May), and −0.289 at the most negative moment, in the following week (25 to 31 May). Figure 10 shows the evolution of the average sentiment throughout the period. In answer to RQ4, the predominant sentiment in tweets about COVID-19 in English published at the end of the first wave of the pandemic was generally neutral but with a trend towards negative feelings. This can be associated with the observations made in our study of the most frequent words, as many of them had to do with the health crisis and how to combat it.
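The weekly split behind the longitudinal analysis can be sketched as follows, assuming that weeks are counted in consecutive 7-day blocks from 13 April 2020, which is consistent with the sixth week running from 18 to 24 May. The names `week_number` and `weekly_mean_sentiment` and the (date, score) record layout are hypothetical.

```python
from collections import defaultdict
from datetime import date

# First day of the sample; the year 2020 follows from the study period.
START = date(2020, 4, 13)


def week_number(day, start=START):
    """1-based index of the 7-day block that contains the given day."""
    return (day - start).days // 7 + 1


def weekly_mean_sentiment(records):
    """records: iterable of (date, overall_score) pairs, one per tweet."""
    buckets = defaultdict(list)
    for day, score in records:
        buckets[week_number(day)].append(score)
    return {week: sum(vals) / len(vals) for week, vals in sorted(buckets.items())}
```

Under this assumption, `week_number(date(2020, 5, 18))` is 6 and `week_number(date(2020, 5, 25))` is 7, matching the two weeks singled out above.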

Discussion and Conclusions
This paper analysed the public conversation around COVID-19 that took place on Twitter during the weeks in which the first wave of the pandemic was receding in Western countries and the conversation was broadening beyond purely health topics. A large set of tweets related to the disease and published in English from 13 April to 22 June was downloaded, allowing us to access the conversation that took place mainly in the USA and in the UK, two of the countries with the most registered cases and deaths related to the virus during that first wave. Apart from the manual exploration of the collected public data extracted using Twitter's API, analyses based on computational techniques, such as word frequency distribution, topic modelling, and sentiment analysis, were used to produce more valuable information.
Among the main findings, we can highlight the relevance of Donald Trump as a key actor during this period, not only as one of the most repeated terms and as a central figure in the most important topics but also because his account was the most influential in the sample. In this sense, it should be noted that some months later, and due to the hostile behaviour of the former president spreading false or doubtful information and encouraging violent behaviours, especially in relation to the presidential elections that he lost in November 2020, the platform decided to block and delete his profile in January 2021. This helps us understand how a polarizing figure, who was also accused of not implementing adequate control and prevention measures against the virus, could lead the conversation around the pandemic. In this context, it is also important to keep in mind that 3 of the 10 most active accounts no longer exist and that 3 included the word "bot" in their usernames, which indicates that they might have been bots, although that cannot be confirmed. At any rate, this suggests that conversations on Twitter might have been shaped by instability, lack of reliability, and confrontation.
On the other hand, Twitter was also a space for what we can consider a more useful conversation, given that the second most influential account in the sample was that of the World Health Organization, the main institution in charge of informing people about the pandemic and the one offering the most important recommendations at an international level. In fact, information and recommendations seem to have been essential during this uncertain period, in which Twitter users also sought data and ways to face the pandemic, something that the sentiment analysis, the word frequency analysis, and the topic modelling confirm.
The word frequency analysis revealed that the most frequent words in the sample refer mainly to health-related topics, such as the evolution of the pandemic or the measures taken to fight it, once again indicating that most of the tweets could be informative. Part of the conversation also focuses on US politics, showing that the discussion, originally strictly focused on health elements (Yu et al. 2020; Yunez 2020), was broadening. Nonetheless, it might be surprising that economic issues, which were strongly affected by the measures taken to control the virus, do not seem to be present.
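As an illustration of the kind of word frequency count described above, the following is a minimal Python sketch and not the procedure actually used in this study. The function name `top_terms`, the very short stop-word list, and the example tweets are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "to", "of", "and", "in", "is",
              "for", "on", "at", "by", "with", "your", "rt"}


def top_terms(tweets, n=10):
    """Return the n most frequent tokens across the tweets, ignoring stop words."""
    counts = Counter()
    for text in tweets:
        # Remove URLs, then lowercase; hashtags and mentions keep their word part.
        text = re.sub(r"https?://\S+", " ", text.lower())
        tokens = re.findall(r"[a-z0-9']+", text)
        counts.update(t for t in tokens if t not in STOP_WORDS and len(t) > 1)
    return counts.most_common(n)


# Invented example tweets, not taken from the study's sample.
sample = [
    "Wash your hands and stay home to fight the virus https://example.org",
    "New cases of the virus reported today",
    "Stay home, stay safe #COVID19",
]

print(top_terms(sample, 3))
```

With real data, the resulting ranking depends heavily on the tokenization and stop-word choices, so any comparison between studies should state them explicitly.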
The topic modelling of the sample of tweets confirms the prevalence of informative messages. Of the six main topics identified, three refer to general or specific information messages about the coronavirus, its effects, damage, evolution, and ways to control and combat it. Some of the topics also share a strong component of confrontation, criticizing measures or the behaviour of other people, as well as referring to protests or politically charged messages.
Following the postulates of the agenda-setting theory, it can be observed that the main topics strongly relate to observations about the most relevant figures: polarization around US politics and its handling of the pandemic, and general recommendations to fight the virus, mainly coming from the WHO.
Finally, regarding the latent sentiment of the tweets, it was found that these sentiments are predominantly neutral, possibly even informative, although there is a slight tendency towards negativity, something that is understandable during a hard moment in which citizens were suffering the effects of the pandemic. Furthermore, although the number of cases was declining during the studied weeks, no relevant trends were observed in the evolution of the sentiment of the conversation.
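To make concrete what a "predominantly neutral" distribution means, the sketch below labels each tweet with a toy lexicon and reports the share of each class. It is only an illustration under stated assumptions: the word lists, the function names `label` and `sentiment_shares`, and the example tweets are hypothetical and do not correspond to the tool or lexicon used in this study.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; not the lexicon or tool used in the study.
POSITIVE = {"good", "great", "safe", "recover", "hope", "thanks"}
NEGATIVE = {"bad", "death", "deaths", "fear", "crisis", "worse"}


def label(text):
    """Label a tweet as positive, negative, or neutral by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # no hits, or positive and negative hits cancel out


def sentiment_shares(tweets):
    """Return the fraction of tweets in each sentiment class."""
    classes = ("negative", "neutral", "positive")
    if not tweets:
        return {c: 0.0 for c in classes}
    counts = Counter(label(t) for t in tweets)
    return {c: counts[c] / len(tweets) for c in classes}


# Invented example tweets, not taken from the study's sample.
sample_tweets = [
    "WHO publishes new guidance on masks",
    "Cases reported in 12 states today",
    "Deaths keep rising, this crisis is getting worse",
    "Great news, more patients recover",
    "Testing sites open at 9am",
]

print(sentiment_shares(sample_tweets))
```

Purely informational messages contain few emotionally loaded words, so under this kind of scoring they fall into the neutral class, which is consistent with the link between neutrality and information suggested above.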
As a general conclusion, it can be observed that the conversation around COVID-19 during the weeks in which the first wave of the pandemic was receding in most Western countries had two perspectives: first, a rather informative one, with neutral sentiments, led by public institutions or media sharing information, data, or recommendations about the pandemic; second, a more polarized and political one, with confrontation and complaints about mismanagement by institutions or irresponsible citizen behaviours. The most paradigmatic accounts for each type of conversation are those of the WHO and Donald Trump, respectively.
One relevant aspect to highlight is the possibility that bots participated in these conversations and transmitted information, which suggests that some of that information might have been fake or manipulated, leading to misinformation or polarization. This matches previous studies that have focused on misinformation or other information disorders during the pandemic (Bridgman et al. 2020; Singh et al. 2020), and it points to the important role that the infodemic might have played during this phenomenon.
Another important observation is the predominance of the USA in the public debate. The selected tweets had no country identification, but the conversation clearly focuses on the USA, despite the presence of or allusions to countries such as China, the UK, or Nigeria. Several factors can help explain this predominance: the country's large population, the high penetration of Twitter there, its worldwide influence, the much-debated management of the pandemic by the Trump administration, the then-upcoming presidential election, and the strong impact of the pandemic in the country.
Finally, it is important to highlight the limitations of this work. Although the paper is extensive and a large sample has been collected and analysed from different perspectives, there are still both temporal and methodological limitations. First, not all the tweets published on the selected dates have been used; the sample was a random one, limited to what the Twitter REST API allowed, so it would be advisable to use version 2 of the API to access all the messages in future studies. It would also be advisable for future works to analyse a longer period, including the successive waves of the pandemic, and compare them with the specific moment studied here, thus studying the public debate around the coronavirus longitudinally. This would provide a better understanding of how international public opinion has evolved and how this has affected the institutional decisions and different measures taken to fight the pandemic, as well as how these have impacted conversations on Twitter. Future studies should also collect data from other social media sites and include conversations in more languages, such as Spanish and Italian, spoken in the two Western countries where the virus first arrived, which were two of the countries hardest hit by the pandemic during the first wave. Similarly, given that sentiment analysis and topic modelling by themselves are not entirely adequate for identifying and analysing predominant frames in discourses and the formation and establishment of agendas, it would also be advisable to carry out analyses using other methods, such as network analysis or qualitative techniques, to try to identify ghost accounts that could be participating in these public debates and to delve into the different topics and discourses spread through these platforms and their possible effects on society.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available because access to them is restricted and is possible only through Twitter's API.

Conflicts of Interest:
The authors declare no conflict of interest.