2. Literature Review
Georeferenced Twitter data have been used for the analysis of many different use cases, such as natural disasters [8], refugee crises [9], or epidemiology [10]. In addition to statistical analyses or topic modelling approaches for semantic classification, sentiment analysis methods are also used frequently. Sentiment analysis is a branch of research that analyses people’s feelings, opinions, evaluations and emotions about certain objects or persons and their characteristics from textual data [11]. It aims to assign general sentiment labels to a data set [12]. It is commonly understood as a sub-discipline of natural language processing (NLP) and semantic analysis [13].
In psychology, a distinction is often made between emotion, i.e., a distinct feeling, and sentiment, which can be understood as a mental attitude based on an emotion. However, this distinction is quite blurry, which also holds true for the difference between sentiment and opinion. Sentiment can be defined as a combination of domain-dependent type, valence (positive, neutral, and negative) and intensity. It is important to note that the object to which such a sentiment refers can also vary greatly. Sentiment analysis can thus be performed on different levels of granularity, i.e., referring to entire texts, sentences or entities, such as singular words [11]. As with other machine learning approaches, there are both supervised (e.g., simple decision trees) and unsupervised methods [14]. In recent years, deep learning-based methods have also been developed. Here, convolutional neural networks (CNNs) [15], recurrent neural networks (RNNs) [16] and deep belief networks (DBNs) have been applied for sentiment analysis tasks [17]. There are some general challenges, such as the handling of sarcasm, ambiguous word meanings, the absence of sentiment words, or the ignorance of pronouns [11,12].
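As a minimal illustration of the lexicon-based end of this methodological spectrum, the following Python sketch assigns coarse document-level polarity labels using the VADER lexicon (also employed in one of the studies discussed below). The ±0.05 threshold on the compound score is a common convention, and the example texts are invented for illustration only.

```python
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

def polarity_label(text: str, threshold: float = 0.05) -> str:
    """Map VADER's compound score to a coarse document-level polarity label."""
    compound = analyzer.polarity_scores(text)["compound"]
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

# Invented example texts, purely for illustration
print(polarity_label("This update is terrible, I hate it."))    # expected: negative
print(polarity_label("Great news, I love the new features!"))   # expected: positive
```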
There have been many studies on a plethora of topics that use sentiment analysis methods on social media data, mainly from Twitter. For example, [18] utilise Twitter data to analyse the sentiments surrounding the 2012 London Olympics; they consider different timestamps and user groups and employ the Hu and Liu lexicon of positive and negative words. In a study on the perception of urban parks, [19] use a partitioning around medoids clustering algorithm to combine sentiment scores and emotion detection results. Another study on the subject employs a graph-based semi-supervised learning approach [20]. In a study on Brexit, [21] use ensemble learning to identify pro- and anti-Brexit sentiments. With the help of Twitter data and the VADER (Valence Aware Dictionary and Sentiment Reasoner) lexicon, [22] examine the online reception of the murder of the Slovak journalist Ján Kuciak. There has also been research on economic phenomena using Twitter data. Valle-Cruz et al. [23] analyse the impact of influential Twitter accounts on the behaviour of financial indices, finding that markets reacted 0 to 10 days after critical information was posted on Twitter during the COVID-19 pandemic. Similarly, [24] create social sentiment indexes for large publicly traded companies from Twitter data to describe the effects of negative information on stock prices. Various deep learning methods have also been applied for sentiment analyses. For instance, [25] use a bidirectional long short-term memory (BiLSTM) network to classify sentiments regarding COVID-19, obtaining very high accuracy and sensitivity. The COVID-19 pandemic in general has been a popular research object for social media-based sentiment analysis. Using the AFINN lexicon, [26] analyse opinions towards different COVID-19 vaccines, identifying that the sentiment towards some vaccines turned negative over time. Using a logistic regression classifier, [27] split Tweets on COVID-19 vaccinations in Mexico into positive and negative. A similar study in Indonesia is performed by [28], who employ a naive Bayes classifier.
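To indicate what the simpler supervised approaches referenced above (e.g., logistic regression or naive Bayes classifiers) can look like in practice, the following sketch trains a TF-IDF-based logistic regression on a toy set of invented, manually labelled texts. It is not the setup of any of the cited studies; a real analysis would require a much larger annotated corpus.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data (invented); real studies use thousands of labelled Tweets.
texts = ["love the new features", "great decision", "worst change ever",
         "this platform is dying", "happy with the update", "terrible leadership"]
labels = ["positive", "positive", "negative", "negative", "positive", "negative"]

# TF-IDF features (unigrams and bigrams) feeding a logistic regression classifier
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)

print(clf.predict(["terrible decision, the platform is dying"]))  # likely: ['negative']
```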
To our knowledge, this is the first scientific paper that deals with the reception and perception of the 2022 Twitter acquisition. Since it is therefore still unclear how the acquisition of Twitter by Elon Musk was received on the platform itself, there is a distinct research gap. The wide range of content in studies that use emotions and opinions from Twitter, outlined in this section, highlights the importance of a clear understanding of Twitter data characteristics. Filling the aforementioned research gap is therefore of great significance for future research with Twitter data, as the acquisition might result in a change of contents, user structure and attitudes. The study presented in this paper provides an initial indication of how Twitter might develop in the future.
5. Discussion
In the following section, we discuss and interpret our results. Furthermore, we critically evaluate our methodological approach.
5.1. Discussion of Results
Our results can be understood as a first indication of how the discourse on the 2022 Twitter acquisition was perceived by users. We identified a general increase in Tweets about our two study objects, i.e., Twitter and Elon Musk, after 27 October 2022. We also found that, after this date, there were consistently more negative than positive Tweets. The dates of 18 and 20 November stand out in particular, as they saw an especially high number of negative Tweets. We assumed that this was mainly related to the high number of employees dismissed by Elon Musk around this time. However, other reasons are also conceivable, such as the reinstatement of the accounts of controversial figures like Donald Trump.
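A minimal sketch of how such daily counts per sentiment class can be derived is given below; the column names created_at and sentiment and the toy data are assumptions made purely for illustration.

```python
import pandas as pd

# Toy stand-in for the real data set; the column names are assumptions, too.
tweets = pd.DataFrame({
    "created_at": ["2022-11-18 09:00", "2022-11-18 12:30", "2022-11-20 08:15"],
    "sentiment":  ["negative", "negative", "positive"],
})

# Aggregate Tweets per day and sentiment class
tweets["day"] = pd.to_datetime(tweets["created_at"]).dt.floor("D")
daily = (tweets.groupby(["day", "sentiment"]).size()
               .unstack(fill_value=0)
               .reindex(columns=["positive", "neutral", "negative"], fill_value=0))

# Days on which negative Tweets exceeded positive ones by the largest margin
daily["net"] = daily["positive"] - daily["negative"]
print(daily.sort_values("net").head())
```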
As we focused on georeferenced Tweets, the spatial perspective was of high importance for our analysis. By performing spatial hot spot analyses, we tried to identify geographic patterns of sentiments on the Twitter acquisition. In general, a higher number of counties was classified as hot spots in the second timeframe. This was consistent with the overall number of Tweets, which increased after the acquisition, and could be understood as the discourse about Twitter gaining momentum across the entire US. However, we could not identify a clear spatial clustering of this development, i.e., there were no large regions changing from being hot spots of positive to negative sentiments or vice versa. Nevertheless, we found that there were more hot spots of negative sentiments on the East Coast after the Twitter acquisition, at least in the New England region. In particular, the emergence of a strong hot spot in Maine should be mentioned here, which might, however, be biased due to the spatial neighbourhood implementation and the low number of neighbouring observations. At the same time, there were also some regions where negative sentiment hot spots before the acquisition turned into regions with generally low sentiment Z scores. Many regions across the US also remained quite similar in both timeframes. This also applied to some counties that were only hot spots of positive sentiments. An interesting case was the West Coast, particularly the state of California, where the headquarters of Twitter are located. Before the acquisition, its counties were mostly classified as medium hot spots of negative sentiments. After the acquisition, the medium class predominated for both sentiments. This could be understood as an indication of a polarisation of the discourse in this region.
The highly positive Z scores were not too surprising, as the Getis-Ord Gi* method used for the spatial hot spot analysis is closely related to Moran’s I. Nonetheless, the increase in Moran’s I values might be understood as a strengthening of the clustering of sentiments. Moreover, these values provide statistical evidence that negative sentiments were more spatially clustered than positive ones.
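For reference, a hot spot analysis of this kind can be sketched with the PySAL ecosystem as follows. The GeoDataFrame counties and its column neg_share (share of negative Tweets per county in one timeframe) are assumptions made for this sketch; k = 8 and the Getis-Ord Gi* statistic correspond to the setup described above.

```python
import geopandas as gpd
from libpysal.weights import KNN
from esda.getisord import G_Local
from esda.moran import Moran

# "counties" is an assumed GeoDataFrame with one row per US county and a
# column "neg_share" holding the share of negative Tweets in a timeframe.
centroids = counties.copy()
centroids["geometry"] = centroids.geometry.centroid

# k-nearest-neighbour spatial weights (k = 8), as used in the hot spot analysis
w = KNN.from_dataframe(centroids, k=8)

# Local Getis-Ord Gi* statistic per county
gi_star = G_Local(counties["neg_share"].values, w, star=True)
counties["gi_z"] = gi_star.Zs     # Z scores: strongly positive values indicate hot spots
counties["gi_p"] = gi_star.p_sim  # pseudo p-values from conditional permutations

# Global Moran's I for the same variable and weights
mi = Moran(counties["neg_share"].values, w)
print(mi.I, mi.p_sim)             # global clustering strength and its significance
```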
In addition, we examined whether there was a connection between our results and the US political landscape, i.e., the election results of Democrats and Republicans, as political orientation is often reflected in social variables in the US. However, we could not find any clear evidence for such a connection. A closer look, including at the textual level, might still be interesting for future work.
5.2. Limitations
Our analysis also had some limitations that frequently occur in research with Twitter data. The official Twitter API does not allow access to the entire data set, i.e., the millions of Tweets sent daily, but only to a representative sample. It is not entirely clear how this sample is drawn, which means that a bias in our results cannot be ruled out. However, since numerous studies have already been able to make significant statements with such a representative sample, we assumed that this also applied in our case.
Another problem specific to georeferenced Tweets is their geometry. In some cases, this is provided by Twitter as an exact point. However, the majority of Tweets only receive a bounding box based on a location specified by the user. Of course, this creates considerable inaccuracy, as this location can also refer to large regions such as entire states or even countries. We tried to address this inaccuracy by only keeping Tweets whose geometry was completely within the respective county. This, however, resulted in some unwanted data loss. In the future, the data quality could be increased by additionally geocoding location mentions in the Tweet text [34]. We also found 78 Tweets whose place was specified as “Twitter HQ”, which is why they were georeferenced in San Francisco. However, we considered it quite unlikely that these Tweets were actually posted from there; rather, the users presumably only gave this location as a reference to the content of their Tweet.
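The within-county filtering described above can be sketched with GeoPandas as follows; the GeoDataFrames tweets_gdf and counties, as well as the GEOID column (typical of US Census county files), are assumptions made for this illustration.

```python
import geopandas as gpd

# "tweets_gdf" (Tweet points or bounding boxes) and "counties" (county
# polygons) are assumed inputs; they must share a coordinate reference system.
tweets_gdf = tweets_gdf.to_crs(counties.crs)

# Keep only Tweets whose geometry lies completely within a single county
# (GeoPandas >= 0.10 uses the "predicate" keyword for spatial joins).
tweets_in_county = gpd.sjoin(
    tweets_gdf, counties[["GEOID", "geometry"]],
    how="inner", predicate="within"
)
```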
The exclusive use of keyword-based filtering can also be cited as a valid criticism of our approach. Tweets that do not contain a keyword but nevertheless semantically relate to the topic under investigation cannot be taken into account. However, we assumed that Tweets relating to the Twitter acquisition were very likely to contain either “twitter” or “elon musk”, which is why we considered a keyword-based approach sufficient for our study. We also designed the queried keyword variants in such a way that they appeared as infrequently as possible within other terms. For example, using only the first name “elon” without a leading or trailing space would also have picked up Tweets with the words “belong” or “elongate”, i.e., presumably unrelated Tweets.
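One simple way to implement such keyword filtering is shown below. Note that this sketch uses regular-expression word boundaries rather than the leading/trailing spaces described above; it achieves the same effect of excluding terms such as “belong” or “elongate”, but it is an illustration, not the exact query we used.

```python
import re

# Word boundaries (\b) prevent "elon" from matching inside unrelated words
# such as "belong" or "elongate"; matching is case-insensitive.
KEYWORDS = re.compile(r"\b(twitter|elon\s+musk)\b", re.IGNORECASE)

def matches_keywords(text: str) -> bool:
    """Return True if the text contains one of the study keywords."""
    return bool(KEYWORDS.search(text))

print(matches_keywords("these layoffs belong in a museum"))  # False
print(matches_keywords("Elon Musk just bought Twitter"))     # True
```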
Furthermore, our choice of timeframe could be criticised. A longer study period, particularly one extending beyond November 2022, would definitely be very interesting. We chose the date on which we started our analysis as the end date of our timeframe. It would also have been possible to split the entire timeframe into smaller units than simply before and after the acquisition. However, since any such split (e.g., into weeks) would have been quite arbitrary and more difficult to justify, we discarded this idea.
Our paper focused solely on a polarity-based sentiment analysis. Therefore, we were only able to account for positive, neutral and negative sentiments. As stated before, we also only performed a document-level analysis, which resulted in information loss, e.g., when multiple sentiments cancelled each other out. As shown in the literature review, many more complex methods of sentiment analysis exist. With these, it would, for example, be possible to assign concrete emotions (e.g., anger and joy) to Tweets. With aspect-based methods, this can even be applied to individual topics within the Tweet text. Consequently, such a more complex analysis could be performed in a next step.
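As an indication of what such an extension could look like, the following sketch uses a publicly available Twitter-tuned transformer for emotion classification. The concrete model identifier is an assumption on our part, and any comparable emotion classifier could be substituted.

```python
from transformers import pipeline

# The model identifier below is an assumption (a Twitter-tuned RoBERTa from
# the TweetEval family); any comparable emotion classifier could be used.
emotion = pipeline("text-classification",
                   model="cardiffnlp/twitter-roberta-base-emotion")

print(emotion("They fired half the staff overnight, unbelievable."))
# e.g. [{'label': 'anger', 'score': ...}] -- exact output depends on the model
```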
Another point of criticism concerns our choice of representation of the spatial neighbourhood in the hot spot analysis, i.e., the KNN spatial weights matrix (k = 8). Especially in regions where the number of neighbouring counties was small, this could have created a bias. However, this also applies to other methods of spatial neighbourhood delimitation and falls under the modifiable areal unit problem (MAUP) [35]. One way of reducing this bias would be the use of a regular grid [8], which would, however, be more difficult to interpret visually.
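A regular grid as mentioned above could, for instance, be constructed as follows; the cell size, the equal-area projection EPSG:5070 and the counties GeoDataFrame are illustrative assumptions, not the approach of the cited study.

```python
import numpy as np
import geopandas as gpd
from shapely.geometry import box

def regular_grid(bounds, cell_size, crs):
    """Build a square grid of cells covering the given bounds (in CRS units)."""
    minx, miny, maxx, maxy = bounds
    cells = [box(x, y, x + cell_size, y + cell_size)
             for x in np.arange(minx, maxx, cell_size)
             for y in np.arange(miny, maxy, cell_size)]
    return gpd.GeoDataFrame(geometry=cells, crs=crs)

# Illustrative use: a 100 km grid over the contiguous US in an equal-area CRS,
# assuming "counties" is a GeoDataFrame of county polygons.
# grid = regular_grid(counties.to_crs("EPSG:5070").total_bounds, 100_000, "EPSG:5070")
```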
Author Contributions
Conceptualization, S.S., C.Z. and D.A.; methodology, S.S. and C.Z.; software, S.S., C.Z. and D.A.; validation, C.Z.; formal analysis, S.S. and C.Z.; investigation, S.S. and C.Z.; resources, B.R.; writing—original draft, S.S. and C.Z.; writing—review & editing, S.S. and B.R.; visualization, S.S. and C.Z.; supervision, B.R.; project administration, B.R.; funding acquisition, B.R. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Austrian Research Promotion Agency (FFG) through the project MUSIG (Grant Number 886355) and the Austrian Science Fund through the project “Spatio-temporal Epidemiology of Emerging Viruses” (Grant Number I 5117).
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
References
- STATISTA. Number of Monetizable Daily Active Twitter Users (mDAU) Worldwide from 1st Quarter 2017 to 2nd Quarter 2022. 2022. Available online: https://www.statista.com/statistics/970920/monetizable-daily-active-twitter-users-worldwide/ (accessed on 28 December 2022).
- STATISTA. Leading Countries Based on Number of Twitter Users as of January 2022. 2022. Available online: https://www.statista.com/statistics/242606/number-of-active-twitter-users-in-selected-countries/ (accessed on 28 December 2022).
- Wile, R. A Timeline of Elon Musk’s Takeover of Twitter. Available online: https://www.nbcnews.com/business/business-news/twitter-elon-musk-timeline-what-happened-so-far-rcna57532 (accessed on 28 December 2022).
- Conger, K.; Hirsch, L. Elon Musk Completes $44 Billion Deal to Own Twitter. 2022. Available online: https://www.nytimes.com/2022/10/27/technology/elon-musk-twitter-deal-complete.html (accessed on 28 December 2022).
- Zakrzewski, C.; Siddiqui, F.; Menn. Musk’s ’Free Speech’ Agenda Dismantles Safety Work at Twitter, Insiders Say. 2022. Available online: https://www.washingtonpost.com/technology/2022/11/22/elon-musk-twitter-content-moderations/ (accessed on 16 January 2023).
- Mac, R.; Browning, K. Elon Musk Reinstates Trump’s Twitter Account. 2022. Available online: https://www.nytimes.com/2022/11/19/technology/trump-twitter-musk.html (accessed on 28 December 2022).
- Mac, R.; Mullin, B.; Conger, K.; Isaac, M. A Verifiable Mess: Twitter Users Create Havoc by Impersonating Brands. 2022. Available online: https://www.nytimes.com/2022/11/11/technology/twitter-blue-fake-accounts.html (accessed on 28 December 2022).
- Havas, C.; Resch, B. Portability of Semantic and Spatial-Temporal Machine Learning Methods to Analyse Social Media for near-Real-Time Disaster Monitoring. Nat. Hazards 2021, 108, 2939–2969.
- Petutschnig, A.; Havas, C.R.; Resch, B.; Krieger, V.; Ferner, C. Exploratory Spatiotemporal Language Analysis of Geo-Social Network Data for Identifying Movements of Refugees. GI_Forum 2020, 1, 137–152.
- Kogan, N.E.; Clemente, L.; Liautaud, P.; Kaashoek, J.; Link, N.B.; Nguyen, A.T.; Lu, F.S.; Huybers, P.; Resch, B.; Havas, C.; et al. An Early Warning Approach to Monitor COVID-19 Activity with Multiple Digital Traces in near Real Time. Sci. Adv. 2021, 7, eabd6989.
- Liu, B. Sentiment Analysis: Mining Opinions, Sentiments, and Emotions, 2nd ed.; Studies in Natural Language Processing; Cambridge University Press: Cambridge, MA, USA; New York, NY, USA, 2020.
- Birjali, M.; Kasri, M.; Beni-Hssane, A. A Comprehensive Survey on Sentiment Analysis: Approaches, Challenges and Trends. Knowl.-Based Syst. 2021, 226, 107134.
- Yue, L.; Chen, W.; Li, X.; Zuo, W.; Yin, M. A Survey of Sentiment Analysis in Social Media. Knowl. Inf. Syst. 2019, 60, 617–663.
- Yadav, A.; Vishwakarma, D.K. Sentiment Analysis Using Deep Learning Architectures: A Review. Artif. Intell. Rev. 2020, 53, 4335–4385.
- Chachra, A.; Mehndiratta, P.; Gupta, M. Sentiment Analysis of Text Using Deep Convolution Neural Networks. In Proceedings of the 2017 Tenth International Conference on Contemporary Computing (IC3), Noida, India, 10–12 April 2017; pp. 1–6.
- Huang, Q.; Chen, R.; Zheng, X.; Dong, Z. Deep Sentiment Representation Based on CNN and LSTM. In Proceedings of the 2017 International Conference on Green Informatics (ICGI), Fuzhou, China, 15–17 April 2017; pp. 30–33.
- Jin, Y.; Zhang, H.; Du, D. Improving Deep Belief Networks via Delta Rule for Sentiment Classification. In Proceedings of the 2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI), San Jose, CA, USA, 6–8 November 2016; pp. 410–414.
- Kovacs-Györi, A.; Ristea, A.; Havas, C.; Resch, B.; Cabrera-Barona, P. #London2012: Towards Citizen-Contributed Urban Planning through Sentiment Analysis of Twitter Data. Urban Plan. 2018, 3, 75–99.
- Kovacs-Györi, A.; Ristea, A.; Kolcsar, R.; Resch, B.; Crivellari, A.; Blaschke, T. Beyond Spatial Proximity—Classifying Parks and Their Visitors in London Based on Spatiotemporal and Sentiment Analysis of Twitter Data. ISPRS Int. J. Geo-Inf. 2018, 7, 378.
- Roberts, H.; Chapman, L.; Resch, B.; Sadler, J.; Zimmer, S.; Petutschnig, A. Investigating the Emotional Responses of Individuals to Urban Green Space Using Twitter Data: A Critical Comparison of Three Different Methods of Sentiment Analysis. Urban Plan. 2018, 3, 21–33.
- del Gobbo, E.; Fontanella, L.; Fontanella, S.; Sarra, A. Geographies of Twitter Debates. J. Comput. Soc. Sci. 2021, 5, 647–663.
- Kovács, T.; Kovács-Győri, A.; Resch, B. #AllforJan: How Twitter Users in Europe Reacted to the Murder of Ján Kuciak—Revealing Spatiotemporal Patterns through Sentiment Analysis and Topic Modeling. ISPRS Int. J. Geo-Inf. 2021, 10, 585.
- Valle-Cruz, D.; Fernandez-Cortez, V.; López-Chau, A.; Sandoval-Almazán, R. Does Twitter Affect Stock Market Decisions? Financial Sentiment Analysis during Pandemics: A Comparative Study of the H1N1 and the COVID-19 Periods. Cogn. Comput. 2022, 14, 372–387.
- Mendoza-Urdiales, R.A.; Núñez-Mora, J.A.; Santillán-Salgado, R.J.; Valencia-Herrera, H. Twitter Sentiment Analysis and Influence on Stock Performance Using Transfer Entropy and EGARCH Methods. Entropy 2022, 24, 874.
- Kumari, S.; Pushphavathi, T.P. Intelligent Lead-Based Bidirectional Long Short Term Memory for COVID-19 Sentiment Analysis. Soc. Netw. Anal. Min. 2022, 13, 1.
- Marcec, R.; Likic, R. Using Twitter for Sentiment Analysis towards AstraZeneca/Oxford, Pfizer/BioNTech and Moderna COVID-19 Vaccines. Postgrad. Med. J. 2022, 98, 544–550.
- Bernal, C.; Bernal, M.; Noguera, A.; Ponce, H.; Avalos-Gauna, E. Sentiment Analysis on Twitter about COVID-19 Vaccination in Mexico. In Advances in Soft Computing; MICAI 2021; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2021; Volume 13068, pp. 96–107.
- Putri Aprilia, N.; Pratiwi, D.; Barlianto, A. Sentiment Visualization of Covid-19 Vaccine Based on Naive Bayes Analysis. J. Inf. Technol. Comput. Sci. 2021, 6, 195–208.
- Loureiro, D.; Barbieri, F.; Neves, L.; Anke, L.E.; Camacho-Collados, J. TimeLMs: Diachronic Language Models from Twitter. arXiv 2022, arXiv:2202.03829.
- Barbieri, F.; Camacho-Collados, J.; Neves, L.; Espinosa-Anke, L. TweetEval: Unified Benchmark and Comparative Evaluation for Tweet Classification. arXiv 2020, arXiv:2010.12421.
- Rosenthal, S.; Farra, N.; Nakov, P. SemEval-2017 Task 4: Sentiment Analysis in Twitter. arXiv 2019, arXiv:1912.00741.
- Ord, J.K.; Getis, A. Local Spatial Autocorrelation Statistics: Distributional Issues and an Application. Geogr. Anal. 1995, 27, 286–306.
- Steiger, E.; Westerholt, R.; Resch, B.; Zipf, A. Twitter as an Indicator for Whereabouts of People? Correlating Twitter with UK Census Data. Comput. Environ. Urban Syst. 2015, 54, 255–265.
- Serere, H.N.; Resch, B.; Havas, C.R.; Petutschnig, A. Extracting and Geocoding Locations in Social Media Posts: A Comparative Analysis. GI_Forum 2021, 1, 167–173.
- Buzzelli, M. Modifiable Areal Unit Problem. In International Encyclopedia of Human Geography; Elsevier: Amsterdam, The Netherlands, 2020; pp. 169–173.