Article

The Impact of Autonomous Vehicle Accidents on Public Sentiment: A Decadal Analysis of Twitter Discourse Using roBERTa

by Romy Sauvayre 1,2,*, Jessica S. M. Gable 1, Adam Aalah 1, Melvin Fernandes Novo 2, Maxime Dehondt 2 and Cédric Chauvière 2,3
1 Laboratoire de Psychologie Sociale et Cognitive, Université Clermont Auvergne, Centre National de la Recherche Scientifique, 63000 Clermont-Ferrand, France
2 Clermont Auvergne INP, Centre National de la Recherche Scientifique, LAPSCO, Université Clermont Auvergne, 63000 Clermont-Ferrand, France
3 Laboratoire de Mathématiques Blaise Pascal, Université Clermont Auvergne, Centre National de la Recherche Scientifique, 63000 Clermont-Ferrand, France
* Author to whom correspondence should be addressed.
Technologies 2024, 12(12), 270; https://doi.org/10.3390/technologies12120270
Submission received: 30 October 2024 / Revised: 15 December 2024 / Accepted: 19 December 2024 / Published: 23 December 2024
(This article belongs to the Special Issue Advanced Autonomous Systems and Artificial Intelligence Stage)

Abstract

In the field of autonomous vehicle (AV) acceptance and opinion studies, questionnaires are widely used, supplemented by AV experiments and driving simulations. However, few AV studies have investigated social media, and even fewer have analyzed the impact of AV crashes on public opinion, often relying on limited social media datasets. This study addresses this gap by exploring a comprehensive dataset of six million tweets posted over a decade (2012–2021), to which neural networks, sentiment analysis, and knowledge graphs were applied. The results reveal that tweets predominantly convey negative sentiment (40.86%) rather than positive (32.52%) or neutral (26.62%) sentiment. A binary segmentation algorithm was used to distinguish an initial positive sentiment period (January 2012–May 2016) from a subsequent negative period (June 2016–December 2021). The difference in sentiment polarity between the two periods was statistically significant (U = 24,914,037,786; p value < 0.001). The timeline analysis revealed that the negative sentiment period was initiated by fatal accidents involving a Tesla AV driver and a pedestrian hit by an Uber AV, and that this shift was amplified by the mainstream media.

1. Introduction

Social media has become a major communication tool for widely sharing opinions [1] across countries, transforming the Web into a dynamic information exchange arena [2]. For example, platforms such as Twitter (now known as “X” since 23 July 2023) have facilitated unity around important social issues. The #MeToo movement is a well-recognized example [3]. However, this widespread dissemination has raised concerns regarding the infodemic problem of false information sharing [4].
Social media has opened a potentially valuable Pandora’s box for the research community. Using deep learning and natural language processing, millions of tweets can be scrutinized and interpreted. For this particular task, sentiment analysis is widely employed in various research areas, such as research on vaccine hesitancy [5], monkeypox outbreaks [6], Asperger syndrome [7], and climate change [8].
In the field of autonomous vehicle (AV) acceptance and opinion studies, questionnaires are the most widely used methodology; they typically include sociodemographic variables used to compare the level of AV acceptance across sex, income, and social class, for example [9,10,11,12]. Additionally, AV experiments [13,14] and driving simulations [15] have also been utilized. However, only a few AV studies have investigated social media platforms such as Twitter [1], TikTok [16], and YouTube [17]. Even fewer studies have analyzed the impact of AV crashes on public opinion [16,18,19], often relying on limited social media datasets. For example, Jefferson and McDonald [19] analyzed 11,164 tweets over a six-day period, whereas Jing et al. [16] scrutinized 16,635 comments extracted from TikTok and 9879 comments from Sina Weibo, a Chinese social media platform.
This study aims to address this gap by exploring a comprehensive dataset of six million tweets posted over a decade using sentiment and text content analysis tools. This dataset enables a thorough investigation of sentiment to comprehensively assess the impact of technological uncertainty, particularly when human lives are at risk.

2. Methods

The methodological stages of the study are summarized in Figure 1, and the corresponding explanations, from data collection to data analysis, are presented in the following subsections.

2.1. Data Collection

English messages posted on Twitter (tweets) discussing AVs between 1 January 2012 and 31 December 2021 were collected via the Twitter API v2 Academic Research track, using pertinent keywords related to AVs (see Supplementary Material S1). A total of 6,773,504 tweets were extracted and saved in a MongoDB database.
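The collection step can be sketched as follows. This is a minimal illustration, assuming the full-archive search endpoint of the Twitter API v2 and a local MongoDB instance; the bearer token, database names, and query string are placeholders rather than the authors’ actual keyword list (see Supplementary Material S1).

```python
# Hedged sketch of the collection step: Twitter API v2 full-archive search, stored in MongoDB.
# The query and credentials below are placeholders, not the study's actual configuration.
import requests
from pymongo import MongoClient

BEARER_TOKEN = "YOUR_ACADEMIC_RESEARCH_TOKEN"   # hypothetical credential
SEARCH_URL = "https://api.twitter.com/2/tweets/search/all"

params = {
    "query": '("self-driving car" OR "autonomous vehicle") lang:en',  # illustrative only
    "start_time": "2012-01-01T00:00:00Z",
    "end_time": "2021-12-31T23:59:59Z",
    "max_results": 500,
    "tweet.fields": "created_at,public_metrics,conversation_id,author_id",
}
headers = {"Authorization": f"Bearer {BEARER_TOKEN}"}

collection = MongoClient()["av_tweets"]["tweets"]   # local MongoDB instance (hypothetical names)

next_token = None
while True:
    if next_token:
        params["next_token"] = next_token
    response = requests.get(SEARCH_URL, headers=headers, params=params)
    response.raise_for_status()
    payload = response.json()
    if payload.get("data"):
        collection.insert_many(payload["data"])     # persist the current page of tweets
    next_token = payload.get("meta", {}).get("next_token")
    if not next_token:                              # stop when pagination is exhausted
        break
```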

2.2. Data Preprocessing

As the aim of this study is to analyze sentiment evolution on Twitter, the dataset was filtered using a specific emotion and valence dictionary (see Supplementary Material S2), and only tweets containing at least one emotional or valence word were retained. After duplicates, bot accounts, and tweets containing URLs were removed, the remaining 3,379,636 messages were filtered down to a total of 509,069 tweets. A final cleaning step removed special characters (&gt;, #, @, –, and |).
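A minimal preprocessing sketch is shown below, assuming the tweets have been exported from MongoDB into a pandas DataFrame with a 'text' column; 'valence_words' is an illustrative stand-in for the emotion and valence dictionary of Supplementary Material S2, and the input file name is hypothetical.

```python
# Preprocessing sketch: deduplicate, drop URLs and bots (not shown), keep tweets with
# at least one emotional/valence word, and strip special characters.
import re
import pandas as pd

valence_words = {"hope", "fear", "love", "kill", "safe", "crash"}   # illustrative subset of S2

def has_valence_word(text: str) -> bool:
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return bool(tokens & valence_words)

df = pd.read_csv("tweets_raw.csv")                                   # hypothetical export from MongoDB
df = df.drop_duplicates(subset="text")                               # remove duplicate messages
df = df[~df["text"].str.contains(r"https?://", regex=True)]          # drop tweets containing URLs
df = df[df["text"].apply(has_valence_word)]                          # keep tweets with >=1 valence word
df["text"] = df["text"].str.replace(r"&gt;|[#@|–]", " ", regex=True)  # final character cleaning
```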

2.3. Data Labeling

Approximately 1200 tweets were randomly selected from the filtered final dataset and manually assessed by two experienced annotators. Three labels were assigned to obtain a multiclass classification set (see Supplementary Material S3):
(0) “Negative” for tweets containing negative sentiments, such as distrust toward AVs or concerns about job losses;
(1) “Neutral” for tweets conveying neutral or informative content, such as commercial advertisements from car manufacturers, information about AV events, or informative statements without valency or an emotional perspective;
(2) “Positive” for tweets expressing favorable sentiments toward AVs, such as willingness to use them.
Notably, owing to the European General Data Protection Regulation (GDPR) [20], we are unable to provide examples of the tweets extracted from the dataset.

2.4. Artificial Neural Network Labeling

The pretrained “Twitter-roBERTa-base for Sentiment Analysis” model was obtained from Hugging Face [21] and trained on the multiclass manual labels. This model was originally trained on approximately 58 million English tweets without URLs and fine-tuned specifically for sentiment analysis [22]. The manual labels were split into training, validation, and test sets. The model achieved an accuracy of 0.69 and an F1 score of 0.69 on the test set. The 507,873 remaining unlabeled tweets were subsequently analyzed and automatically labeled using the trained RoBERTa model.
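The automatic labeling step might look like the sketch below, which loads the cardiffnlp/twitter-roberta-base-sentiment checkpoint with the Transformers library. The fine-tuning on the ~1200 manually labeled tweets (e.g., with the standard Trainer loop) is omitted here, and the example tweets are invented.

```python
# Hedged sketch of automatic sentiment labeling with the pretrained Twitter-roBERTa model.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "cardiffnlp/twitter-roberta-base-sentiment"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
labels = {0: "negative", 1: "neutral", 2: "positive"}   # class order from the model card

def classify(texts):
    """Return a sentiment label for each tweet in `texts`."""
    encoded = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**encoded).logits
    return [labels[i] for i in logits.argmax(dim=-1).tolist()]

print(classify(["I can't wait to ride in a self-driving car!",
                "Another autopilot crash, this technology scares me."]))
```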

2.5. Data Analysis

Two types of analyses were conducted: (1) sentiment analysis with the automated labels obtained with the neural network model and (2) text mining analysis focusing on the content. To prepare the text, a dictionary of relevant words (see Supplementary Material S4), including misspellings, was compiled to filter the 13.53 million words (including 692,950 unique words). After rectifying misspellings and performing filtering, 3.04 million words (with 7699 unique words) were retained for text analysis and the construction of knowledge graphs. Note that mentions and hashtags were kept in the text as informative data for the purpose of the study. The knowledge graphs were created using Gephi 0.10 software. The two types of analyses were conducted using Jupyter Notebook and several Python libraries (Pandas, NumPy, Pyreadstat, Seaborn, Matplotlib, NLTK, re, Scikit-learn, Wordcloud, SpaCy, Stanza, SciPy, Ruptures, and NetworkX) for the text mining tasks (stemming, lemmatization, counting, vectorization, and text visualization), statistical analysis, text filtering, and graph construction.
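A sketch of the word-level preparation is given below, under stated assumptions: 'relevant_stems' stands in for the relevant-word dictionary of Supplementary Material S4, and stemming is shown with NLTK's PorterStemmer (lemmatization with spaCy or Stanza, also listed above, would be analogous). The example tweets are invented.

```python
# Sketch of text preparation: tokenize, stem, and keep only dictionary words before counting.
import re
from collections import Counter
from nltk.stem import PorterStemmer

relevant_stems = {"technolog", "accid", "traffic", "tesla", "googl", "kill"}  # illustrative subset of S4
stemmer = PorterStemmer()

def prepare(tweet: str) -> list:
    tokens = re.findall(r"[a-z']+", tweet.lower())          # simple word tokenizer
    stems = [stemmer.stem(t) for t in tokens]                # normalize word forms
    return [s for s in stems if s in relevant_stems]         # keep relevant words only

counts = Counter()
for tweet in ["Tesla autopilot kills driver in crash", "Google cars improve traffic"]:
    counts.update(prepare(tweet))
print(counts.most_common())
```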

3. Results

3.1. Tweet Count

The 507,873 analyzed tweets were posted by 299,468 unique users (i.e., tweeters) and contributed to 432,467 unique conversations. The most influential tweet received 45,927 likes and 8286 retweets. Over the studied decade (2012–2021), an average of 4232 tweets were posted per month, with a monthly range of 299 to 17,090 (Figure 2).

3.2. Sentiment Analysis

Negative tweets were the most frequent category over the studied decade (40.86%) (Table 1). Negative tweets are associated with ‘problem’, ‘kill’, ‘death’, ‘crash’, ‘technology’, ‘Tesla’, or ‘Google’. Conversely, positive tweets are associated with ‘hope’, ‘wait’, ‘future’, ‘safe’, ‘love’, ‘great’, and ‘Tesla’.
A sentiment polarity analysis was conducted to evaluate the fluctuations in public sentiment toward AVs over time (Figure 3). The analysis employed a scale ranging from −1 for negative sentiment to +1 for positive sentiment. To identify significant shifts in public opinion, a binary segmentation algorithm was applied to the time-series data using the Ruptures Python library. The binary segmentation algorithm [23] is a multiple change point search method [24] with “an O(n log n) computational [cost], where n is the number of data points” [25]. This algorithm operates by recursively partitioning the data into segments and identifying points where significant changes occur in the statistical properties of the data [24] (see Supplementary Material S5 for further methodological details). This methodological approach facilitated the detection of critical junctures in public sentiment regarding AV technology.
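A minimal sketch of this change-point detection step with the Ruptures library is shown below, assuming the sentiment series has been aggregated into a NumPy array. The synthetic data and the number of breakpoints are illustrative only; the l2 cost model matches the one reported in the caption of Figure 3.

```python
# Hedged sketch of binary segmentation on an aggregated sentiment polarity series.
import numpy as np
import ruptures as rpt

rng = np.random.default_rng(0)
polarity = np.concatenate([rng.normal(0.05, 0.05, 54),     # stand-in for 2012–mid-2016
                           rng.normal(-0.12, 0.05, 66)])   # stand-in for mid-2016–2021

algo = rpt.Binseg(model="l2").fit(polarity)   # l2 cost model, as in Figure 3
breakpoints = algo.predict(n_bkps=3)          # e.g., 3 change points -> 4 subperiods
print(breakpoints)                            # indices of the detected change points
```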
Two periods of sentiment are evident in the data: an initial positive period from 1 January 2012 to 22 June 2016, and a subsequent negative period from 23 June 2016 to 31 December 2021. Each period comprises two subperiods, labeled A and B, characterized by a decreasing average sentiment score (Figure 4).
During the first period, the monthly average sentiment polarity was positive (sentiment score = 0.046; N = 114,140 tweets). However, sentiment shifted toward a more negative trajectory during the second period, with the monthly average sentiment polarity decreasing to −0.121 (N = 393,733 tweets) (see Supplementary Material S6).
A Mann-Whitney U test indicated a statistically significant difference in sentiment polarity between the two periods (U = 24,914,037,786; p value < 0.001), and a Kruskal-Wallis H test yielded the same conclusion for the four sentiment subperiods (H = 4755.07; p value < 0.001) (Figure 4).
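Both tests are available in SciPy; the following sketch assumes per-tweet polarity scores grouped by period, with illustrative toy arrays in place of the real data.

```python
# Minimal sketch of the two non-parametric tests with SciPy (toy data).
from scipy.stats import mannwhitneyu, kruskal

period1 = [1, 0, 1, -1, 1, 0]      # polarity scores, positive period (illustrative)
period2 = [-1, -1, 0, -1, 1, -1]   # polarity scores, negative period (illustrative)

u_stat, p_value = mannwhitneyu(period1, period2, alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.0f}, p = {p_value:.3g}")

# Kruskal-Wallis across the four subperiods (1A, 1B, 2A, 2B)
h_stat, p_value = kruskal([1, 1, 0], [0, 1, -1], [-1, 0, -1], [-1, -1, -1])
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_value:.3g}")
```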

3.3. Text Mining

To better understand the content exchanged and the discussions among tweeters, a text mining analysis of the relevant selected words was conducted. The term frequency-inverse document frequency (TF-IDF) scores of the most important terms in the filtered dataset indicated that different terms predominated in the two periods (Figure 5). ‘Google’ was more likely to be mentioned in the first period than in the second period. Conversely, ‘Tesla’ was cited more often in the second period (see Supplementary Materials S7 and S8). The term ‘kill’ was the highest-ranked negative word among the top 20 terms in the second period.
Some words were common to both periods, such as ‘technology’, ‘traffic’, ‘life’, ‘accident’, ‘Uber’, and ‘Google’. Others were specific to the first period, such as ‘love’, ‘hope’, ‘today’, and ‘interesting’. In contrast, the second period featured new words such as ‘Tesla’, ‘Tesla autopilot’, ‘artificial intelligence’, ‘kill’, and ‘robot’. This TF-IDF analysis identifies the main words used in conversations about AVs and provides insight into how negative terms dominate period 2. ‘Tesla’, prominent in the second period, records a polarity score of −0.18, whereas ‘Google’, prominent in the first period, achieves a positive score of 0.02 (see Supplementary Material S6). Common terms across both periods, such as ‘technology’, appear in positive tweets during the first period (0.07) and in negative tweets during the second period (−0.09).
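A hedged sketch of the per-period TF-IDF ranking with scikit-learn follows; 'period1_tweets' and 'period2_tweets' stand in for the filtered tweet texts of the two sentiment periods, and the example tweets are invented.

```python
# Sketch of per-period TF-IDF term ranking (toy documents).
from sklearn.feature_extraction.text import TfidfVectorizer

period1_tweets = ["google driverless car will save lives",
                  "hope autonomous cars improve traffic"]
period2_tweets = ["tesla autopilot crash kills driver",
                  "uber self driving car kills pedestrian"]

def top_terms(docs, k=20):
    vec = TfidfVectorizer(ngram_range=(1, 2), stop_words="english")
    matrix = vec.fit_transform(docs)
    scores = matrix.sum(axis=0).A1                 # aggregate TF-IDF weight per term
    terms = vec.get_feature_names_out()
    return sorted(zip(terms, scores), key=lambda x: -x[1])[:k]

print(top_terms(period1_tweets))
print(top_terms(period2_tweets))
```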

3.4. Knowledge Graphs

To explore the connections between these words in more depth, knowledge graphs [26] were constructed. A knowledge graph captures semantic associations by using nodes to represent words and edges to denote the relationships between them. Tweets were tokenized into individual words, and edges were established between the words within each tweet. Figure 6 shows that the content varied across the two sentiment periods.
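A minimal sketch of this construction with NetworkX is shown below, assuming tweets have already been tokenized and filtered: words within a tweet become nodes, co-occurring word pairs become weighted edges, and the resulting GEXF file can be opened in Gephi 0.10 for the Force Atlas 2 layout used in Figure 6. The token lists and output file name are illustrative.

```python
# Sketch of knowledge graph construction: nodes are words, edges link words co-occurring in a tweet.
from itertools import combinations
import networkx as nx

tweets_tokens = [["tesla", "autopilot", "crash", "driver"],
                 ["uber", "pedestrian", "kill"],
                 ["tesla", "driver", "kill"]]          # illustrative tokenized tweets

G = nx.Graph()
for tokens in tweets_tokens:
    for w1, w2 in combinations(set(tokens), 2):        # edge between each word pair in one tweet
        weight = G[w1][w2]["weight"] + 1 if G.has_edge(w1, w2) else 1
        G.add_edge(w1, w2, weight=weight)

# Store the weighted degree as a node attribute (usable for node sizing in Gephi).
nx.set_node_attributes(G, dict(G.degree(weight="weight")), "weighted_degree")
nx.write_gexf(G, "knowledge_graph_period2.gexf")       # import this file into Gephi
```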
During the first positive period, the conversations revolved around the Google company, accompanied by optimism stemming from advancements in smart AV technology (turquoise class). AVs were anticipated to enhance road safety, facilitate smooth traffic flow (green class), and reduce accidents caused by human drivers (red class), ultimately saving lives (pink class).
In contrast, the second negative period was characterized by discussions focused on companies such as Tesla and Uber, as well as artificial intelligence embedded in AV technology (turquoise class). However, this optimism was overshadowed by concerns surrounding pedestrian and driver fatalities involving Tesla and Uber AVs (red class). Additionally, apprehensions regarding job losses resulting from the automation of human tasks were highlighted, as indicated in the smaller orange segment of the graph.

3.5. Turning Point Events

The discussion about AVs on Twitter began with a high level of positive sentiment, reflecting hope of preserving lives on the road and enthusiasm for the smart technology development led by Google (Figure 7).
The first notable increase in tweet activity coincided with the approval of AV laws in Nevada and California in 2012, in February and September, respectively. A second wave of tweeting occurred in May 2014, when Google announced its plan to launch 200 AVs on the roads. Following this announcement, the average sentiment polarity decreased by approximately 80.65% (from 0.124 to 0.024). However, the most significant impact occurred after the first fatal accident involving the driver of a Tesla car equipped with an autopilot function in the United States on 7 May 2016. The sentiment polarity then declined by 358.33% (from 0.024 to −0.062) during the first part of the negative period, which spans from 2016 to 2018. Finally, an accident in which a pedestrian was killed by an Uber AV on 18 March 2018 generated a large surge of tweets, marking the onset of the most negative sentiment subperiod of the decade in Twitter users’ opinions about AVs. The sentiment polarity reached −0.147, representing a decrease of 137.10% (from −0.062 to −0.147).
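The reported percentages are consistent with relative changes computed against the absolute value of the earlier polarity score; a minimal check, assuming that convention:

```python
# Worked check of the relative changes reported above.
def relative_change(before: float, after: float) -> float:
    """Percentage change relative to the absolute value of the earlier score."""
    return (after - before) / abs(before) * 100

print(relative_change(0.124, 0.024))    # ≈ -80.65%
print(relative_change(0.024, -0.062))   # ≈ -358.33%
print(relative_change(-0.062, -0.147))  # ≈ -137.10%
```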

4. Discussion

This study explores the sentiment content of tweets related to AVs over the past decade (2012–2021). The findings reveal that tweets predominantly convey negative sentiment (40.86%) rather than positive (32.52%) or neutral sentiment (26.62%), which is consistent with previous research [18,27]. Sentiment polarity analysis was used to differentiate two sentiment periods: an initial positive period from January 2012 to May 2016 and a subsequent negative period from June 2016 to December 2021. The difference in sentiment polarity between the two periods is statistically significant (U = 24,914,037,786; p value < 0.001).
A timeline analysis of events revealed that the negative sentiment subperiods were initiated by fatal accidents involving a Tesla AV driver and a pedestrian hit by an Uber AV. The sentiment analysis revealed that these specific fatal accidents had an impact on the opinions of Twitter users. However, previous AV accidents did not have such an effect on opinion. For example, 11 minor accidents in November 2015, the first Tesla accident in China in January 2016, and a Google car hitting a bus in February 2016 received less attention. When the fatal American accidents were widely covered in the mainstream media, including The Guardian [28,29] and The New York Times [30,31], a wave of negative sentiment was evident in the Twitter dataset.
This study demonstrates that public opinion changes over time. Initially, when AV technology was perceived as a promising life-saving innovation bridging the gap between fiction and reality, sentiment polarity was positive. Conversely, when AVs began sharing the road with humans and were involved in rare accidents, the perceived risk erased hope, concerns about safety arose, and sentiment polarity shifted to negative. While Wicki’s study [32], which investigated the effects of familiarity and fatal accidents on AV acceptance, indicates that the negative impact is transient, our study of social media suggests that the effect appears to be persistent and progressively negative over time.
The main result of this sentiment analysis is that negative publicity spread across mainstream media and social platforms can generate fear and reluctance toward technology, as history has shown [33,34]. As technology is increasingly released to the market before full development, some innovations may be slowed or halted due to malfunctions, particularly if they could cause fatalities.

5. Limitations

While our analysis considers various factors influencing sentiment, the dynamic nature of social media discourse may introduce biases that could impact the interpretation of sentiment trends. Furthermore, the variability in user engagement and the shifting landscape of public discourse surrounding autonomous vehicles may constrain the generalizability of our findings beyond the specific time frames examined.
Given that autonomous vehicle (AV) tweets contain a substantial amount of informative and promotional content, the textual data were filtered using emotion and valency dictionaries. Consequently, the analyzed content is not representative of all discussions, but rather focuses solely on the emotional aspects of the discourse. This limitation constrains the scope and generalizability of the conclusions drawn. Tweet sentiment analysis presents significant challenges due to the brevity of textual data (with a mean of 160 characters per tweet). This constraint may account for the F1 score of 0.69 achieved with the RoBERTa pretrained model. Despite these limitations, the model’s performance is comparable to or surpasses that of other studies in tweet sentiment analysis employing manual labeling and the RoBERTa model. For example, Trivyza [27] reported an F1 score of 0.46, whereas Benítez-Andrades et al. [35] reported an F1 score of 0.74. These comparable results across various studies underscore the persistent challenges in Twitter sentiment analysis, suggesting that the current model’s performance falls within an acceptable range for this specific domain. Nevertheless, the reliability of the conclusions warrants scrutiny. The analysis of a filtered large volume of data (509,069 tweets), coupled with binary segmentation and rigorous statistical analysis, facilitates the identification of reliable patterns and mitigates the impact of potential misclassifications. This approach suggests that false labeling may be considered mere noise in the dataset rather than a significant confounding factor.

6. Conclusions

The present study represents a significant advancement in understanding public acceptance and rejection of autonomous vehicle (AV) technology. Through the analysis of an extensive dataset comprising tweets posted over a decade (2012–2021), the results reveal a predominance of negative sentiments over positive or neutral expressions. However, the timeline analysis, conducted using binary segmentation techniques, indicates that public opinion initially leaned positive, highlighting the anticipated benefits of AV innovation. From 2016 onward, sentiment shifted toward negativity following the Tesla AV accidents reported by mainstream media. The perception of AV technology then transitioned from highly positive to predominantly negative as public concerns increasingly aligned with the technical challenges reported by the media. This trend also illustrates how traditional media can amplify legitimate public concerns and shape online sentiment toward AVs.
To further elucidate the impact of technological failures on public acceptance, additional research will explore the relationship between AV performance and individuals’ perceptions of reliability using qualitative interview data.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/technologies12120270/s1, S1: Twitter API Queries; S2: Emotion and Valency Dictionary; S3: Sample of 1196 Manually Labeled Tweets Sorted By Label Type; S4: Relevant Word Dictionary; S5: Binary Segmentation Methodology; S6: Sentiment Score and Number of Tweets for the Two Defined Periods; S7: Term Frequency-Inverse Document Frequency (TF-IDF) Scores and Polarity Sentiment Scores for the Top 20 Words in the Two Defined Periods; S8: Trends in Sentiment and Tweet Volume for Mentions of Google and Tesla. Ref. [36] is cited in Supplementary Materials.

Author Contributions

R.S.: writing of the main manuscript and Supplementary Material, data collection, data labeling, data analysis, figure preparation, research design, and project management; J.S.M.G.: data labeling, dictionary building, and data analysis; A.A.: model coding and data analysis; M.F.N.: model coding; M.D.: model coding; C.C.: model coding and project management. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Engineering School Polytech Clermont and the Délégation à la Sécurité Routière (n°2201.305.286).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data are provided in the main article and the Supplementary Materials. Owing to the European General Data Protection Regulation (GDPR) [20], the original tweets cannot be shared.

Conflicts of Interest

The authors declare that they have no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Ding, Y.; Korolov, R.; Wallace, W.A.; Wang, X.C. How are sentiments on autonomous vehicles influenced? An analysis using Twitter feeds. Transp. Res. Part C Emerg. Technol. 2021, 131, 103356. [Google Scholar] [CrossRef]
  2. Giachanou, A.; Crestani, F. Like It or Not: A Survey of Twitter Sentiment Analysis Methods. ACM Comput. Surv. 2016, 49, 1–41. [Google Scholar] [CrossRef]
  3. Modrek, S.; Chakalov, B. The #MeToo Movement in the United States: Text Analysis of Early Twitter Conversations. J. Med. Internet Res. 2019, 21, e13837. [Google Scholar] [CrossRef] [PubMed]
  4. Zarocostas, J. How to fight an infodemic. Lancet 2020, 395, 676. [Google Scholar] [CrossRef] [PubMed]
  5. Gable, J.S.M.; Sauvayre, R.; Chauvière, C. Fight Against the Mandatory COVID-19 Immunity Passport on Twitter: Natural Language Processing Study. J. Med. Internet Res. 2023, 25, e49435. [Google Scholar] [CrossRef] [PubMed]
  6. Bengesi, S.; Oladunni, T.; Olusegun, R.; Audu, H. A Machine Learning-Sentiment Analysis on Monkeypox Outbreak: An Extensive Dataset to Show the Polarity of Public Opinion From Twitter Tweets. IEEE Access 2023, 11, 11811–11826. [Google Scholar] [CrossRef]
  7. Gabarron, E.; Dechsling, A.; Skafle, I.; Nordahl-Hansen, A. Discussions of Asperger Syndrome on Social Media: Content and Sentiment Analysis on Twitter. JMIR Form. Res. 2022, 6, e32752. [Google Scholar] [CrossRef]
  8. Mohamad Sham, N.; Mohamed, A. Climate Change Sentiment Analysis Using Lexicon, Machine Learning and Hybrid Approaches. Sustainability 2022, 14, 4723. [Google Scholar] [CrossRef]
  9. Zhang, Q.; Zhang, T.; Ma, L. Human acceptance of autonomous vehicles: Research status and prospects. Int. J. Ind. Ergon. 2023, 95, 103458. [Google Scholar] [CrossRef]
  10. Bala, H.; Anowar, S.; Chng, S.; Cheah, L. Review of studies on public acceptability and acceptance of shared autonomous mobility services: Past, present and future. Transp. Rev. 2023, 43, 970–996. [Google Scholar] [CrossRef]
  11. Golbabaei, F.; Yigitcanlar, T.; Paz, A.; Bunker, J. Individual Predictors of Autonomous Vehicle Public Acceptance and Intention to Use: A Systematic Review of the Literature. J. Open Innov. Technol. Mark. Complex. 2020, 6, 106. [Google Scholar] [CrossRef]
  12. Hegner, S.M.; Beldad, A.D.; Brunswick, G.J. In Automatic We Trust: Investigating the Impact of Trust, Control, Personality Characteristics, and Extrinsic and Intrinsic Motivations on the Acceptance of Autonomous Vehicles. Int. J. Hum. Comput. Interact. 2019, 35, 1769–1780. [Google Scholar] [CrossRef]
  13. Xu, Z.; Zhang, K.; Min, H.; Wang, Z.; Zhao, X.; Liu, P. What drives people to accept automated vehicles? Findings from a field experiment. Transp. Res. Part C Emerg. Technol. 2018, 95, 320–334. [Google Scholar] [CrossRef]
  14. Chaufrein, M.; Forte, C.; Colom, M.; Delage, L.; Ouafi, H.; Saran, R.; Sidane, Y.; Vieira, R.L.; Milanes, V.; Salomon, S. Tornado_Attentes et Acceptabilité Utilisateurs de VAC Expérimentaux. France. 2021. Available online: https://eexposit.perso.univ-pau.fr/tornado/downloads/L8%20Tornado%20Analyse%20d%27acceptabilite%20et%20rapport%20final%20lot%208%20et%20lot%207.pdf (accessed on 9 January 2022).
  15. Zou, X.; O’Hern, S.; Ens, B.; Coxon, S.; Mater, P.; Chow, R.; Neylan, M.; Vu, H.L. On-road virtual reality autonomous vehicle (VRAV) simulator: An empirical study on user experience. Transp. Res. Part C Emerg. Technol. 2021, 126, 103090. [Google Scholar] [CrossRef]
  16. Jing, P.; Wang, B.; Cai, Y.; Wang, B.; Huang, J.; Yang, C.; Jiang, C. What is the public really concerned about the AV crash? Insights from a combined analysis of social media and questionnaire survey. Technol. Forecast. Soc. Change 2023, 189, 122371. [Google Scholar] [CrossRef]
  17. Das, S.; Dutta, A.; Lindheimer, T.; Jalayer, M.; Elgart, Z. YouTube as a Source of Information in Understanding Autonomous Vehicle Consumers: Natural Language Processing Study. Transp. Res. Rec. 2019, 2673, 242–253. [Google Scholar] [CrossRef]
  18. Othman, K. Public attitude towards autonomous vehicles before and after crashes: A detailed analysis based on the demographic characteristics. Cogent Eng. 2023, 10, 2156063. [Google Scholar] [CrossRef]
  19. Jefferson, J.; McDonald, A.D. The autonomous vehicle social network: Analyzing tweets after a recent Tesla autopilot crash. Proc. Human Factors Ergon. Soc. Annu. Meet. 2019, 63, 2071–2075. [Google Scholar] [CrossRef]
  20. GDPR Twitter. Twitter Controller-to-Controller (Outbound) Data Protection Addendum. Available online: https://gdpr.twitter.com/en/controller-to-controller-transfers.html (accessed on 25 April 2022).
  21. Hugging Face. Twitter-roBERTa-Base for Sentiment Analysis. Available online: https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment (accessed on 25 September 2023).
  22. Barbieri, F.; Camacho-Collados, J.; Espinosa Anke, L.; Neves, L. TweetEval: Unified Benchmark and Comparative Evaluation for Tweet Classification. In Findings of the Association for Computational Linguistics: EMNLP 2020; Cohn, T., He, Y., Liu, Y., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 1644–1650. [Google Scholar] [CrossRef]
  23. Scott, A.J.; Knott, M. A Cluster Analysis Method for Grouping Means in the Analysis of Variance. Biometrics 1974, 30, 507–512. [Google Scholar] [CrossRef]
  24. Fryzlewicz, P. Wild Binary Segmentation for Multiple Change-Point Detection. Ann. Stat. 2014, 42, 2243–2281. [Google Scholar] [CrossRef]
  25. Killick, R.; Fearnhead, P.; Eckley, I.A. Optimal Detection of Changepoints with a Linear Computational Cost. J. Am. Stat. Assoc. 2012, 107, 1590–1598. [Google Scholar] [CrossRef]
  26. Wang, Q.; Mao, Z.; Wang, B.; Guo, L. Knowledge Graph Embedding: A Survey of Approaches and Applications. IEEE Trans. Knowl. Data Eng. 2017, 29, 2724–2743. [Google Scholar] [CrossRef]
  27. Trivyza, M.-F. Autonomous Vehicles: Multi-Class Twitter Sentiment Analysis. Ph.D. Dissertation, National Technical University of Athens, Athens, Greece, 2021. Available online: https://dspace.lib.ntua.gr/xmlui/handle/123456789/54013?locale-attribute=en (accessed on 8 December 2023).
  28. Levin, S.; Woolf, N.; The Guardian. Tesla Driver Killed While Using Autopilot was Watching Harry Potter, Witness Says. 2016. Available online: https://www.theguardian.com/technology/2016/jul/01/tesla-driver-killed-autopilot-self-driving-car-harry-potter (accessed on 20 April 2024).
  29. Levin, S.; The Guardian. “Uber Should Be Shut Down”: Friends of Self-Driving Car Crash Victim Seek Justice. 2018. Available online: https://www.theguardian.com/technology/2018/mar/20/uber-self-driving-car-crash-death-arizona-elaine-herzberg (accessed on 20 April 2024).
  30. Vlasic, B.; Boudette, N.E.; The New York Times. Self-Driving Tesla Was Involved in Fatal Crash, U.S. Says. 2016. Available online: https://www.nytimes.com/2016/07/01/business/self-driving-tesla-fatal-crash-investigation.html (accessed on 20 April 2024).
  31. Griggs, T.; Wakabayashi, D.; The New York Times. How a Self-Driving Uber Killed a Pedestrian in Arizona. 2018. Available online: https://www.nytimes.com/interactive/2018/03/20/us/self-driving-uber-pedestrian-killed.html (accessed on 20 April 2024).
  32. Wicki, M. How do familiarity and fatal accidents affect acceptance of self-driving vehicles? Transp. Res. Part F Traffic Psychol. Behav. 2021, 83, 401–423. [Google Scholar] [CrossRef]
  33. Rogers, E.M. Diffusion of Innovations, 5th ed.; Free Press: New York, NY, USA, 2003. [Google Scholar]
  34. Juma, C. Innovation and Its Enemies: Why People Resist New Technologies; Oxford University Press: New York, NY, USA, 2016. [Google Scholar]
  35. Benítez-Andrades, J.A.; Alija-Pérez, J.-M.; Vidal, M.-E.; Pastor-Vargas, R.; García-Ordás, M.T. Traditional Machine Learning Models and Bidirectional Encoder Representations from Transformer (BERT)–Based Automatic Classification of Tweets About Eating Disorders: Algorithm Development and Validation Study. JMIR Med. Inform. 2022, 10, e34492. [Google Scholar] [CrossRef]
  36. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer: New York, NY, USA, 2009. [Google Scholar] [CrossRef]
Figure 1. Flowchart of the methodological stages.
Figure 2. Monthly tweet counts from 1 January 2012 to 31 December 2021 (N = 507,873).
Figure 3. Daily average sentiment polarity score per tweet and monthly moving average from January 2012 to December 2021, with change point dates identified by the binary segmentation algorithm (binseg), with the l2-norm penalty model (model = “l2”).
Figure 4. Average polarity scores and p values as a function of the sentiment period.
Figure 5. Top 20 terms according to the TF-IDF score and sentiment polarity score by period (period 1: positive—2012–2016; period 2: negative—2016–2021).
Figure 6. Knowledge graphs for sentiment periods (period 1: positive—2012–2016; period 2: negative—2016–2021). Visualization in Gephi using the Force Atlas 2 algorithm; the node size corresponds to the degree of weighting.
Figure 7. Daily AV tweet timelines per sentiment period, annotated with events corresponding to the days with the most tweets.
Table 1. Sentiment of analyzed tweets.

Sentiment    Number of Tweets    %
Negative     207,508             40.86%
Positive     165,157             32.52%
Neutral      135,208             26.62%