Public Perception of Autonomous Mobility Using ML-Based Sentiment Analysis Over Social Media Data



Introduction
Autonomous transportation modes have been a popular field of research, as increasing the autonomy of vehicles has a vast impact on everyday life as well as on the way much of current business is conducted. For example, the introduction of autonomous vehicles promises to completely change the way in which mobility and transportation logistics are handled. The introduction of a fully automated and autonomous transportation system could mirror the revolution of the early 20th century, when motorized vehicles allowed large loads to be transported faster and more efficiently [1]. Beyond the technical barriers, this promised revolution needs to overcome one major obstacle before it can become an integral part of modern life: user acceptance.
User acceptance has been a key performance indicator in the development of novel transportation modes; however, it is currently measured mostly with traditional methods such as focus groups and questionnaires. These methods, though effective, can only reach a limited number of people, who are usually geographically close to one another. Moreover, the recruitment approaches in such focus groups present difficulties of their own [2,3]. To this end, an increasingly popular way of measuring public opinion has been the analysis of social media data, which allows the integration of social theories with computational methods, as "Social Media Mining is the process of representing, analyzing, and extracting actionable patterns from social media data" [4]. This social media mining process can then be used in combination with deep machine learning (DML) techniques. DML uses robust statistical models and natural language processing techniques to assess the users' sentiments, identify the principal components of their opinions, and derive their "polarity" towards a studied subject.

Previous Work
As mentioned before, the acceptance of autonomous transportation has been a well-studied topic, with a plethora of previous assessments of user opinions. The research work presented in [5] studies the acceptance of autonomous public transportation modes using a series of interviews, while [6] studies on-demand business models of autonomous mobility. One of the most recent studies [7] also tries to identify gaps and suggest measures and best practices. However, the most objective outcomes of such analyses come from reaching as many people as possible and trying to include individuals with varying backgrounds, locations, and levels of interest in the user group. Social media mining and natural language processing offer an effective set of tools for this purpose and have been used in a large number of applications such as stock market prediction [8], disaster relief [9], various stressful events [10], and autonomous cars [11]. Finally, even in the bounded field of social media natural language processing, there is a plethora of approaches that utilize various models, such as rule-based approaches [12], unsupervised methods [13], and attention models [14].
This work presents a procedure that aims to elicit the main fears and reservations of the general public towards autonomous modes of transportation. To achieve this, a sentiment analysis framework was used, based on the mining of relevant social media data. The strategy of mining social media data was selected as a means of reaching a large number of opinions; it also allows a direct connection between the analyst and the general public, supporting a more objective breakdown of their raw opinions. This paper presents the structure of such a framework, the data capturing process, and the results of the sentiment analysis performed on the captured social media posts. We aim to (i) present the current acceptance levels of autonomous mobility and (ii) determine the main fears that reduce that level of acceptance.
The rest of this work is organized in the following way. Section 2 formulates the problem and presents the mathematical basis of our approach. Section 3 discusses the available data captured from two popular social media platforms, Reddit and Twitter. The results of our analysis are presented in Section 4. This paper ends with our conclusions in Section 5.

Relevant Previous Studies
There are a number of studies that analyze the current status of acceptance of autonomous mobility as well as identify current gaps and propose ways of moving forward.
Reference [5] presents an experiment that assesses user acceptance in the case of autonomous public transportation modes. Even before the execution of the experiment, a number of passengers indicated that they had concerns about the safety of the Autonomous Vehicle (AV) shuttle bus used; however, after participating in the shuttle ride, these issues seemed resolved, and the main remaining safety concern was limited to the way the AV shuttle braked when it identified an obstacle. Moreover, on-demand models and their acceptability were studied in [6]. There, the authors distinguish between acceptability, the view of a product before its use, and acceptance, the view of the product after the user has used it and become familiar with it. The researchers used a mixed-methods approach, complementing acceptance questionnaires with a psychological needs-driven approach using UX cards in an effort to understand the underlying factors that influence acceptability and acceptance. Again, the main issue expressed was safety, and the trend of users accepting the safety of an AV after they used it, even if they expressed concerns beforehand, continued. Among the social media studies mentioned above, [11] uses social media posts to mine opinions towards fatalities involving self-driving cars.
That work focuses on the views of autonomous mobility before and after an accident that claimed the life of a pedestrian, as described in [15]. To assess the changes in views, [11] analyzed the comments under self-driving-related videos on YouTube. The results are summarized in two word clouds, presented in Figure 1a for before the crash and Figure 1b for after.

Problem Formulation
Extracting information about the sentiments behind social media texts can be modelled as a classification problem, where we try to sort all the captured social media posts based on their overall sentiment intensity. Thus, we use an architecture that performs binary classification of a social media post, assigning it to one of two classes. These classes represent positively inclined opinions, i.e., posts that express positive feelings towards autonomous mobility, and negatively inclined opinions, i.e., posts that express negative feelings.
The identification of such sentiment values from social media posts is an arduous and complex task, the performance of which is bounded by the model's properties and the inherent limitations of the captured data. For this reason, we employ a tool based on a machine learning framework that receives the captured text as input and returns the sentiment assessment of the post in the form of a class label. It should be noted that this architecture uses a model specifically trained to classify the sentiment polarity of similar data. In this section, we briefly present the mathematical formulation behind our approach.
Let us denote as $\mathbf{y}(n) = [P_P(n), P_N(n)]^T$ a 2 × 1 vector that contains the probabilities $P_P$ and $P_N$ that the nth post can be classified as expressing positive or negative sentiments (class P and N, respectively). Let us now assume that there is a non-linear function that relates the probabilities $\mathbf{y}(n)$ with some measurable observations $\mathbf{x}(n), \mathbf{x}(n-1), \ldots, \mathbf{x}(n-q)$. In the following notation, we assume that $\mathbf{x}(n)$ are multidimensional tensors of the input data. Assuming a non-linear dependency of the classification output and the previous classification values, we derive a non-linear autoregressive model:
$$\mathbf{y}(n) = g\big(\mathbf{y}(n-1), \ldots, \mathbf{y}(n-q), \mathbf{x}(n), \mathbf{x}(n-1), \ldots, \mathbf{x}(n-q)\big) + \mathbf{e}(n) \tag{1}$$
where $g(\cdot)$ refers to the non-linear relationship and q expresses the order of the model. Vector $\mathbf{e}(n)$ is an independent and identically distributed error. Equation (1) cannot be easily calculated, as $g(\cdot)$ is unknown. The use of machine learning methods can produce an approximation of $g(\cdot)$ in a way that minimizes the error $\mathbf{e}(n)$. A feed forward neural network (FNN) can approximate such a relationship, given sufficient relevant training data. However, this FNN model fails at effectively selecting features of high-dimensional space and complex heterogeneous input data (i.e., social media text). Deep learning architectures, however, have proved especially accurate in extracting features [16][17][18]. For extracting representative features, the Bidirectional Encoder Representations from Transformers (BERT) model [14] has presented state-of-the-art results in a wide variety of natural language processing (NLP) tasks. This type of feature extraction can apply bi-directional training to the transformer, allowing a non-causal analysis. The term non-causal analysis means that the assessment uses both previous and future instances. This means that while we assess the emotional polarity of a word, we use both previous and future words appearing in the posts to optimize the classification. This allows our approach to capture long-range dependencies with the use of a bidirectional long short-term memory (LSTM) network.
LSTM networks are of a similar structure to the bidirectional recurrent regression models, but each node in the hidden layer is replaced by a memory cell instead of a single neuron. The memory cell is the basic unit of an LSTM, and its architecture is presented in Figure 2. It consists of four components: (i) the forget gate, (ii) the input gate, (iii) the internal state, and (iv) the output gate. Each component non-linearly relates the inner product of the input vectors with appropriate weights, estimated via a training phase. The non-linear activation functions adopted for the components are the sigmoid, denoted as σ, and the tanh. The forget gate throws out (forgets) information from the memory cell to model long-range dependence. The input node is the same as a hidden neuron, measuring the contribution of a hidden state to the final classification outcome. The internal gate decides if the respective hidden state is "significant enough" for the classification. Finally, the output gate regulates whether the response of the current memory cell is significant enough to contribute to the next cell.
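To make the role of the memory cell more concrete, the following is a minimal sketch of one forward step of a single-unit LSTM cell in its commonly used textbook formulation. It is not taken from the implementation used in this work: the function name `lstm_cell_step`, the parameter layout, and the key names in `params` are assumptions made purely for illustration, and the standard formulation names the gates slightly differently from the description above.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, the gate activation denoted as sigma in the text."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(weights, values):
    """Inner product of two equally sized lists."""
    return sum(w * v for w, v in zip(weights, values))


def lstm_cell_step(x, h_prev, c_prev, params):
    """One forward step of a single-unit LSTM memory cell (textbook form).

    x       -- list of input features for the current time step
    h_prev  -- previous hidden output (float)
    c_prev  -- previous internal state (float)
    params  -- hypothetical layout: name -> (weights over x + [h_prev], bias)
    """
    z = list(x) + [h_prev]
    f = sigmoid(dot(params["forget_gate"][0], z) + params["forget_gate"][1])
    i = sigmoid(dot(params["input_gate"][0], z) + params["input_gate"][1])
    g = math.tanh(dot(params["input_node"][0], z) + params["input_node"][1])
    o = sigmoid(dot(params["output_gate"][0], z) + params["output_gate"][1])
    c = f * c_prev + i * g   # forget part of the old state, add gated new content
    h = o * math.tanh(c)     # output gate decides how much of the state is exposed
    return h, c
```

A bidirectional network would run one such recurrence from the first word to the last and a second one in the opposite direction, then combine the two hidden outputs for each word.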
After the input layer receives the current data/post, we transform the input data so as to maximize the classification performance. These transformations are executed over the input data and a set of kernels, in order to select appropriate features using a specialized transformer encoder, presented in Figure 3a. BERT makes use of the Transformer, an attention mechanism that learns contextual relations between words (or sub-words) in a text. In its original form, the Transformer includes two separate mechanisms: an encoder that reads the text input and a decoder that produces a prediction for the task. Since BERT's goal is to generate a language model, only the encoder mechanism is necessary. An example of how this encoder works is presented in Figure 3b. In contrast to the sequential models that are often employed, BERT is considered bi-directional, as it analyzes the entirety of a sentence at once. This results in the model learning the context of a word based on all of its surroundings.
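The bi-directional behaviour described above can be illustrated with a minimal sketch of unmasked scaled dot-product self-attention. This is a simplification under stated assumptions: the function name `self_attention` is hypothetical, the token embeddings are toy values, and the learned query/key/value projections and multiple heads of a real Transformer encoder are replaced by identity mappings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Unmasked scaled dot-product self-attention with identity projections.

    embeddings -- list of token vectors (lists of floats of equal length)
    Returns (outputs, weights), where weights[i][j] is how much token i
    attends to token j. No mask is applied, so every token sees both the
    earlier and the later tokens of the post.
    """
    d = len(embeddings[0])
    outputs, weights = [], []
    for query in embeddings:
        scores = [dot_qk / math.sqrt(d)
                  for dot_qk in (sum(a * b for a, b in zip(query, key))
                                 for key in embeddings)]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w_j * value[t] for w_j, value in zip(w, embeddings))
                        for t in range(d)])
    return outputs, weights
```

Because no causal mask is applied, the attention weights of the first token towards the last token are non-zero, which is the property that distinguishes this encoder-style attention from left-to-right sequential models.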
The kernel parameters are estimated in a way that minimizes the performance error on a ground-truth training set. The L feature maps, denoted as $f_1, f_2, \ldots, f_L$, are used as inputs to the final (classification) layer, which is the last component of the architecture and performs the supervised classification. The classification layer consists of r neurons, each implementing a non-linear operation with the sigmoid as the neuron activation function. If we denote as $w_{i,j}$ the weights that connect the i-th feature map with the j-th hidden neuron of the classification layer, then the output of this neuron is $u_j = \sigma(\mathbf{w}_j^T \cdot \mathbf{f})$, where $\mathbf{f}$ is the aggregate feature map concatenating all features and $\mathbf{w}_j$ the aggregate weights for the j-th hidden neuron. Then, the output is given as:
$$y_w(n) = \sigma\big(\mathbf{v}^T \cdot \mathbf{u}(n)\big) \tag{2}$$
where $\mathbf{u}(n)$ includes all outputs over all the r hidden neurons and $\mathbf{v}$ the aggregate weights connecting the r hidden neurons of the classification layer with the output neuron. In Equation (2), $\mathbf{v}^T \cdot \mathbf{u}(n)$ expresses the input of the final output neuron before applying the activation function $\sigma(\cdot)$. In the previous notation, we assume that the classification output consists of one neuron. Extension to multiple neurons is straightforward. Subscript w in Equation (2) denotes the dependence of the classification on the network weights, which are estimated through a learning process. In our configuration, the proposed model consists of two output neurons. The classification output can be modelled in the following way:
$$\hat{c}(n) = \arg\max_{k \in \{P, N\}} P_k(n) \tag{3}$$
Equation (3) means that the nth post is assigned to the class with the maximum probability. It is obvious that in our framework $P_P(n) + P_N(n) = 1$. Beyond the assigned classification label, we use $P_P(n)$ and $P_N(n)$ as an additional qualitative metric.
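The classification step of Equations (2) and (3) can be sketched as follows. This is an illustrative simplification, not the implementation used in this work: the function name `classify` and the toy weights are hypothetical, and we assume a single sigmoid output neuron that yields the positive-class probability, with the negative-class probability obtained as its complement so that the two sum to one.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(features, hidden_weights, output_weights):
    """Classify one post from its aggregate feature vector.

    features       -- aggregate feature vector of the post
    hidden_weights -- one weight vector per hidden neuron of the classification layer
    output_weights -- weights connecting the hidden neurons to the output neuron
    Returns (label, p_positive, p_negative).
    """
    # Hidden neuron outputs: sigmoid of the inner product of weights and features.
    u = [sigmoid(sum(w * f for w, f in zip(w_j, features)))
         for w_j in hidden_weights]
    # Output neuron: sigmoid of the weighted sum of the hidden outputs.
    p_positive = sigmoid(sum(v_j * u_j for v_j, u_j in zip(output_weights, u)))
    p_negative = 1.0 - p_positive  # assumption: complementary probabilities
    # Assign the class with the maximum probability.
    label = "P" if p_positive >= p_negative else "N"
    return label, p_positive, p_negative
```

The two returned probabilities are the values that are later reused as a confidence score alongside the class label.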

Technical Architecture: Technical Workflow
All the aforementioned systems for the data capturing and the classification were implemented in Python using a number of relevant libraries such as PRAW (for Reddit data capturing), searchtweets-python (for data capturing on Twitter), and PyTorch (for the machine learning components). The machine learning systems were trained using the Stanford Sentiment Treebank [19], a widely used dataset for sentiment classification.
The overall architecture of the machine learning system is presented in Figure 4.

Targeted Social Media Platforms
Nowadays, there is a plethora of social media platform types, based on the type of user, type of content, etc. For our analysis, we captured data from two main types of social media: social networks and discussion forums. The parameters that led to this selection were (i) data availability, (ii) the ability to collect data in a way that adheres to all national and EU regulations and laws relevant to data privacy, and (iii) the popularity of the platform, in order to capture as objective a picture as possible of the current sentiments of the general public towards automated mobility. Thus, this analysis was performed over two social media platforms, namely Twitter and Reddit.
Twitter is a microblogging social media service, where users share everyday events and opinions in a concise, size-restricted manner. As of 2018, Twitter had more than 321 million active users and approximately 500 million individual posts per day [20]. Moreover, Twitter offers a comprehensive application programming interface (API) that allows users to query its content repositories through specialized endpoints, facilitating specialized data capturing activities.
Reddit, similar to Twitter, is a social media service that offers discussion forum services to its users. Contrary to Twitter, though, its content is organized in a thematic way, through a large number of individual bulletin-board-type forums, named subreddits. As of July 2019, Reddit was the 13th most popular website globally, with more than 330 million active users [21]. Similar to Twitter, Reddit offers a dedicated API service that allows users to traverse its content. Moreover, due to the thematic nature of its content sorting, this API can be used to capture data only from specific subreddits that are thematically relevant to our endeavors.

Data Capturing Lexicon
There are several terms used to describe how automation and control will be incorporated into transportation modes in the future. Several modifiers are used interchangeably in the press and scientific literature, such as "connected and automated" and "connected and autonomous". Thus, there was a need to create a lexicon of relevant terms to be used in our queries for capturing social media data. The starting point for populating this lexicon was the European Commission's STRIA "Roadmap on Connected and Automated Transport" [22] as well as various academic publications [23][24][25]. These works led us to the construction of a lexicon of keywords that were in turn fed into the Twitter Data Streaming API to receive all relevant posts. As it was critical to capture an up-to-date picture of the public's view on our topics of interest, we limited the captured data to posts from the 1st of May 2015 onwards. In the Reddit case, the lexicon was used to identify relevant subreddits that were queried through the corresponding Reddit API.
The list of terms used to query the aforementioned APIs is presented in Table 1 and includes terms such as "Internet of Vehicles" and "ITS". These terms were selected based on experiments that assessed the number of posts they captured as well as the relevance of the posts to the Drive2theFuture project.

Twitter
During the data capturing process, a total of 8143 tweets were captured and analyzed. These data were in turn filtered to keep the most significant posts. This filtering initially removed all hyperlinks and mentions of other users. Moreover, non-textual tokens of the post, e.g., emojis, were replaced with a relevant keyword. For example, the angry-face emoji was replaced by its keyword, i.e., (angryface). The resulting post was kept only if it complied with some qualitative criteria that made it possible to assess its sentiments. These criteria were (i) the length of the processed text must be at least 3 words and (ii) the resulting text must contain a verb. Essentially, tweets with a length of at least one full sentence were kept.
This initial processing step took place in order to maximize the performance of the sentiment analysis tool by filtering out unnecessary data. After this filtering, a total of 5047 tweets were then fed into the sentiment analysis tool. The overall steps of the workflow can be seen in Algorithm 1.
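The filtering steps described above can be sketched as follows. This is a minimal illustration and not a reproduction of Algorithm 1: the names `clean_post` and `keep_post`, the emoji map, and the small verb list are all hypothetical. In particular, the verb check here is a crude stand-in; a real pipeline would more plausibly rely on a part-of-speech tagger.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")

# Hypothetical emoji-to-keyword map; the text only gives "(angryface)" as an example.
EMOJI_KEYWORDS = {"\U0001F620": "(angryface)"}

# Hypothetical stand-in for part-of-speech tagging: a tiny list of common verbs.
COMMON_VERBS = frozenset({
    "is", "are", "was", "were", "be", "have", "has", "will", "can",
    "want", "need", "love", "hate", "drive", "drives", "trust",
})


def clean_post(text):
    """Remove hyperlinks and user mentions, and replace emojis with keywords."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    for emoji, keyword in EMOJI_KEYWORDS.items():
        text = text.replace(emoji, f" {keyword} ")
    return " ".join(text.split())  # collapse repeated whitespace


def keep_post(cleaned):
    """Keep a post only if it has at least 3 words and contains a verb."""
    words = cleaned.split()
    has_verb = any(w.strip(".,!?;:").lower() in COMMON_VERBS for w in words)
    return len(words) >= 3 and has_verb
```

Only the posts for which `keep_post(clean_post(raw_text))` holds would then be passed on to the sentiment analysis tool.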

Reddit
The lexicon of Section 3.1 was used to identify relevant subreddits. This process identified five relevant subreddits, namely "Autonomous News", "Autonomous Cars", "Transport", "Self-Driving Cars", and "Autonomous Boats". Using the Reddit API, we captured and analyzed all the posts in these subreddits as well as their first level comments (note: Reddit comments are organized in a nested structure, where Level 1 comments are directed to the posts, Level 2 comments are directed to Level 1 comments, etc.).
A total of 576 posts and 228 comments were captured. The same prefiltering was performed on these posts and comments as in the Twitter case. This again resulted in omitting posts and comments that simply announced innovations or new products, or whose text was too short to be considered. After filtering, a total of 495 posts and comments were analyzed. The overall steps of the workflow can be seen in Algorithm 2.
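The restriction to first-level comments can be illustrated with a small sketch over a nested comment structure. The dictionary layout below (`text`, `comments`, `body`, `replies`) is a hypothetical stand-in chosen for illustration; it is not the object model of the Reddit API or of PRAW, and it is not a reproduction of Algorithm 2.

```python
def first_level_comments(post):
    """Return the bodies of Level 1 comments only, ignoring nested replies."""
    return [comment["body"] for comment in post.get("comments", [])]


def collect_texts(posts):
    """Collect each post's text followed by its first-level comments."""
    texts = []
    for post in posts:
        texts.append(post["text"])
        texts.extend(first_level_comments(post))
    return texts


# Toy example: one post with two Level 1 comments and one Level 2 reply.
example_posts = [{
    "text": "AVs are coming",
    "comments": [
        {"body": "I doubt it", "replies": [{"body": "Why?", "replies": []}]},
        {"body": "Great news", "replies": []},
    ],
}]
```

In this toy example, `collect_texts(example_posts)` keeps the post and the two Level 1 comments, while the Level 2 reply "Why?" is left out.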

Overall Scores
In this section, we present the results of the classification task. Each individual tweet, Reddit post, or Reddit comment was fed into the sentiment analysis tool, which returned a classification label (either positive or negative polarity of the post). Beyond the classification outcome, we also used the classification confidence score (i.e., the probability of each post being assigned to a specific class). It is worth mentioning that the confidence score is not normally used in this way, because a high probability of assigning a post to a class reflects the confidence of the machine learning model in that label and not a more strongly positive/negative opinion. However, here we make the assumption that a higher confidence score can be interpreted as a more clearly positive/negative opinion, owing to the character limitations and the conciseness of a social media post. This assumption is not valid when analyzing longer, more elaborate texts (e.g., online articles).

Twitter
The Twitter posts expressed a positive opinion towards autonomous mobility 61.66% of the time, as can be seen in Figure 5. Moreover, using the confidence scores, we can visualize the overall distribution of the scores in a scatter diagram, as seen in Figure 6. This reveals that out of 5047 analyzed posts, 3342 expressed either highly positive or highly negative opinions (i.e., the probability of assigning them to the "positive opinion" class was respectively higher than 90% or lower than 10%). The breakdown of the confidence values of all the captured tweets can be viewed in Figure 7.
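The summary statistics reported above can be derived from the per-post probabilities of the "positive opinion" class along the lines of the following sketch. The function name `summarize_confidence` and the input values in the usage note are hypothetical; the 90% and 10% thresholds are the ones stated in the text.

```python
def summarize_confidence(p_positive, high=0.90, low=0.10):
    """Summarize a list of positive-class probabilities.

    p_positive -- probability of the "positive opinion" class for each post
    high, low  -- thresholds for "highly positive" and "highly negative" posts
    """
    n = len(p_positive)
    positive = sum(1 for p in p_positive if p >= 0.5)
    highly_positive = sum(1 for p in p_positive if p > high)
    highly_negative = sum(1 for p in p_positive if p < low)
    return {
        "share_positive": positive / n,
        "highly_positive": highly_positive,
        "highly_negative": highly_negative,
        "highly_polar": highly_positive + highly_negative,
    }
```

For instance, applying the function to the toy list `[0.95, 0.05, 0.6, 0.4, 0.91]` counts three posts as positive and three as highly polar. For scale, the 3342 highly polar tweets reported above correspond to roughly 66% of the 5047 analyzed tweets.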

Reddit
A similar breakdown of opinions can be seen when analyzing Reddit posts, though in the case of Reddit there is a more positive view of automated mobility. A total of 71.72% of the posts expressed a positive opinion, as seen in Figure 8. The more positive stance expressed in the analysis of Reddit posts can be explained by the way that Reddit organizes its content in a thematic way (in contrast to the user-centric way that Twitter organizes each result). This means that people who participate in the analyzed subreddits choose to do so and have a higher-than-usual interest in autonomous mobility. However, we still identified negative opinions from those users 28.28% of the time. Similar to the Twitter analysis, through the confidence scores we can visualize the overall distribution of the scores in a scatter diagram, as seen in Figure 9. This reveals that out of 576 analyzed posts/comments, 352 expressed either highly positive or highly negative opinions (i.e., the probability of assigning them to the "positive opinion" class was respectively higher than 90% or lower than 10%). The breakdown of the confidence values of all the captured Reddit posts and comments can be viewed in Figure 10.

Discussion of Results
A simple enumeration of the analysis results is not considered sufficient for the purpose of this study. Beyond the statistical breakdown of the general public's opinions (as expressed in social media posts), we need to identify user fears and specific aspects of autonomous mobility that can affect user acceptance. To this end, we proceeded with an analysis of the negative opinions captured, trying to determine the specific facets of autonomous mobility that drive these opinions.
Technophobia is probably one of the main issues. An estimated 30% of the general public is affected by technophobia [26]. Specifically, mentions of cyber-security, robotics, and hacking, combined with safety concerns, were observed in 42.55% of the posts labeled as negative.
Another driving force of negative opinions was employment. "Professional" terms were frequently mentioned in the negative views of the public: a total of 32.41% of negative posts mentioned such fears. This was particularly evident in the "Autonomous Boats" subreddit, where 100% of negative opinions mentioned reductions in personnel.
Finally, a number of other issues were observed, such as (i) fears due to the probable coexistence of autonomous and conventional mobility solutions, i.e., users seem more fearful of a combination of autonomous and conventional traffic, such as the possibility of human error that an autonomous driving program could not anticipate; (ii) issues regarding the insurance of autonomous cars and liability in crashes; and (iii) personal property and the possible extinction of driving as an everyday task/hobby.
To illustrate the principal components of the negative opinions, we created a word cloud that can be viewed in Figure 11.
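Theme shares such as the percentages reported in this section can be computed with a simple keyword match over the posts labeled as negative, as in the sketch below. The keyword lists, the theme names, and the function name `theme_shares` are hypothetical and chosen only for illustration; they are not the terms used in this study.

```python
import re

# Hypothetical keyword lists per theme; the actual terms used are not listed in the text.
THEMES = {
    "technophobia": {"hacking", "hacked", "cyber-security", "cybersecurity",
                     "robot", "robots"},
    "employment": {"job", "jobs", "driver", "drivers", "unemployment", "layoffs"},
}


def theme_shares(negative_posts, themes=THEMES):
    """Return, per theme, the percentage of negative posts mentioning any keyword."""
    shares = {}
    for theme, keywords in themes.items():
        hits = 0
        for post in negative_posts:
            # Lower-case word tokens, keeping internal hyphens (e.g. "cyber-security").
            tokens = set(re.findall(r"[a-z][a-z\-]*", post.lower()))
            if tokens & keywords:
                hits += 1
        shares[theme] = 100.0 * hits / len(negative_posts)
    return shares
```

A post can count towards more than one theme, so the shares of different themes do not need to sum to 100%.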

Future Work
This work can be extended in several directions. Firstly, there is a need to integrate automated translation processes into the data mining workflow to enable the capture and analysis of social media posts in other languages. Moreover, it would be useful to assess the demographic profile of the user who expresses the opinion, thus enabling a correlation of the user's opinion with their sociological profile. However, such an analysis has to be very carefully planned because (i) there is no direct source of information that would enable us to capture such demographic data, and (ii) the analysis should adhere to privacy-preserving laws and regulations. Finally, similar classification schemes that employ more than two classes of opinions could improve the granularity of the model.

Conclusions
The current paper presents a framework for capturing and analyzing social media posts using a sentiment analysis tool to determine the views of the general public towards autonomous mobility. The study contained the detailed description of the sentiment analysis tool and concept, as well as the data gathering that took place during the endeavors of this research activity. We targeted two very popular social media websites, i.e., Twitter and Reddit, and managed to capture a significant number of social media posts relevant to autonomous mobility.
After the data capturing procedure, we used a sentiment analysis system, developed using state-of-the-art machine learning models, to identify the sentiments behind the captured data and determine the level of acceptance for autonomous modes of mobility. Moreover, after the classification, we used the data that expressed "negative" opinions towards autonomous mobility to identify key fears and reservations that drive those negative responses. Such an approach can be very beneficial to policy makers seeking to capture society's perception of autonomous driving and vehicles, as it identifies the general fears of the public and thus helps define targeted policies and activities to overcome these barriers. All the data capturing and analysis efforts followed a privacy-preserving framework that adheres to all current regulations and ethical guidelines.