Sensing Urban Transportation Events from Multi-Channel Social Signals with the Word2vec Fusion Model

Social sensors perceive the real world through social media and online web services, and offer lower cost and larger coverage than traditional physical sensors. In intelligent transportation research, sensing and analyzing such social signals provides a new path to monitor, control and optimize transportation systems. However, current research largely focuses on using single-channel online social signals to extract and sense traffic information. Sensing and exploiting multi-channel social signals could provide a deeper understanding of traffic incidents. In this paper, we utilize cross-platform online data, i.e., Sina Weibo and News, as multi-channel social signals, and propose a word2vec-based event fusion (WBEF) model for sensing, detecting, representing, linking and fusing urban traffic incidents. Thus, each traffic incident can be comprehensively described from multiple aspects, and the whole picture of urban traffic events can be obtained and visualized. The proposed WBEF architecture was trained on about 1.15 million multi-channel online texts from Qingdao (a coastal city in China), and the experiments show that our method surpasses the baseline model, achieving an 88.1% F1 score in urban traffic incident detection. The model also demonstrates its effectiveness in an open scenario test.


Introduction
Intelligent transportation systems (ITS) are highly involved in improving transportation efficiency and services [1,2]. Successful operation of ITS relies on multi-modal big data. To collect traffic data, physical sensors like inductive loops, radars, and cameras, are deployed in real world transportation systems [3,4]. However, such conventional physical sensors are expensive and provide limited coverage of transportation networks.
Today, social media and online web services provide a new path to access traffic information through the internet. Due to the widespread use of smart devices and social networks, people can create and diffuse user-generated content (UGC) anywhere at any time. Therefore, everyone in such a network can be a social sensor that perceives the real world, which makes the collection of social signals in multiple domains possible [5,6]. Compared with traditional physical traffic sensors, social traffic sensors offer the advantages of low cost and large-scale coverage [7]. Social media data have been applied to detect traffic events, explain traffic status, analyze traffic sentiment, etc. [8][9][10].
at both temporal and spatial levels, which showed there are indicative group mobility patterns and behavioral characteristics in urban transportation. Zeng et al. [11] studied online transportation-related topic features on the national holiday of China, from the perspectives of topic evolution analysis, opinion analysis, and geographic analysis, which could potentially help administrative sectors for traffic management.
Traffic event sensing and detection is one of the major tasks in social media-based transportation research and applications, and many scholars have proposed inspiring models. D'Andrea et al. [13] treated traffic event detection from Twitter as a binary classification task, which assigns a traffic or non-traffic class label to each tweet. They compared seven different classification models, such as SVM, NB, C4.5 and KNN, and employed the SVM model in their proposed system since it achieved the best accuracy in their tests. Gu et al. [28] defined five traffic incident categories and extracted traffic incident information on highways and arterial roads from tweet texts. They first used a Semi-Naive-Bayes classifier to separate traffic incident tweets from non-traffic incident tweets, and then trained a Supervised Latent Dirichlet Allocation classifier to identify traffic incident categories. Fu et al. [29] proposed an association rule-based keyword generation scheme to iteratively extract real-time transportation incidents. They also applied the LexRank algorithm to the complete sentence graph to rank the most influential nodes, and the ranked words are regarded as the summarization of traffic incidents. Gutierrez et al. [30] presented a computational framework to detect real-time traffic events in the UK from Twitter, including tweet filtering, event type classification, named entity recognition, geo-location extraction and event tracking. Nguyen et al. [15] built a system named TrafficWatch that leveraged Twitter signals and integrated them with online clustering and classification algorithms for traffic monitoring and event detection. TrafficWatch demonstrated the potential to report traffic incidents earlier than other data sources when deployed in the traffic management center of Australia. Hao et al. [31] mined the correlation between adverse weather topic heat and traffic incidents in social media, and further proposed a traffic situation awareness and alerting model assisted by adverse weather data to provide information on city-level traffic situations.

Topic Modeling
Topic modeling tries to uncover the hidden semantic structures of different types of documents. Topic modeling technologies can obtain traffic topics from online social media and news texts. In recent years, Latent Dirichlet Allocation (LDA) [32] and its extensions have become dominant for topic modeling. Zhai et al. [33] proposed an online topic model that extends LDA to draw topics from a Dirichlet process whose base distribution is over all possible words rather than from a finite Dirichlet distribution. They also developed an online variational inference method to heuristically expand the set of words in the vocabulary. Paisley et al. [34] proposed a nested hierarchical Dirichlet process for hierarchical topic modeling, and developed a stochastic variational inference algorithm for the model. The proposed method was tested on 1.8 million documents from The New York Times and 2.7 million documents from Wikipedia.
LDA and related models are traditionally applied to long text documents like news articles, while there is an increasing need for modeling topics in short text documents like Twitter posts. Quan et al. [35] integrated topic modeling with automatic short text aggregation to alleviate the sparsity problem in short and sparse texts. Their experiments indicate that the proposed scheme can extract more meaningful and interpretable topics than traditional topic models. Ramage et al. [36] developed a partially supervised LDA model which labels the content of Twitter posts with four characteristics regarding substance, style, status and social relationship. The experiments indicated that a weighted combination of the L-LDA model's latent topic features and TF-IDF features achieves satisfactory results for topic ranking and recommendation tasks. Zhao et al. [37] assumed that each tweet is associated with a Twitter topic, and that each user has its own topic distribution. Based on this hypothesis, they proposed a Twitter-LDA generation process, and used Gibbs sampling to perform model inference. Experiments demonstrated that Twitter-LDA can obtain more meaningful topic words than standard LDA models.

Cross-Platform Event Detection
Multi-channel social signals processing benefits from the characteristics of multi-modality and multi-domain, which integrate different kinds of information from different sources to obtain a more comprehensive view ("big picture") of objects compared to a single data stream. Many models with cross-platform data can produce better event prediction and detection results. Hou et al. [38] developed a cross-dependence temporal topic model to extract topics, and studied the mutual influence between news and user-generated content streams. The proposed methods were evaluated on five datasets from Sina, The New York Times and Twitter. Oghina et al. [39] used tweets from Twitter and comments from YouTube to predict IMDb movie ratings, and the best performance model could rate movies close to the observed values. Bao et al. [40] used a co-clustering model to detect emerging topics from The New York Times and Flickr which experimentally achieved effective evaluation results. Daichi et al. [41] applied a time series topic detection model to mix news and twitter streams during the London Olympic games, which detected 34 topics with a precision of 87.5%.
Some cross-platform event detection architectures and systems have been proposed. Qian et al. [42] proposed a generic framework for social event detection, tracking and evolution analysis. They developed specific models for each task. For example, they used a boosted multi-modal supervised LDA model for social event detection, and applied an incremental topic model learning algorithm for analyzing the evolutionary processes of social events. Li et al. [43] presented the Event Knowledge from News and Opinions in Twitter (EKNOT) system which could extract summaries combining an objective description from news and opinions from tweets. They used an entity graph to link entities for an event, also used an opinion graph to get a joint summarization of an event. Wang et al. [44] proposed an event-based multi-aspect reflection mining framework to discover, link and present major events. News and tweets about a major event can complement each other to describe the event.

Methodology
Social signals effectively connect the physical space and cyber space, which provides a new paradigm for traffic situation awareness. The main challenges that need to be faced are how to collect, process, analyze, and fuse several types of signals in social transportation. As mentioned in the related works, social transportation has attracted increasing numbers of researchers, and some frameworks/architectures were proposed for specific social transportation tasks, such as TrafficWatch [15] for traffic incident extraction, STAR-CITY [2] for traffic flow analysis and urban planning, Steds [29] for traffic event summarization, Hao's framework [31] for weather-related traffic incidents perception. However, these architectures only leverage single social signals either from news or social media.
Meanwhile, the frameworks for multi-modal data sensing and fusion are also getting more and more attention, with representative event detection frameworks like EKNOT [43], and the work of Wang et al. [44] and Qian et al. [42], all of which combine objective descriptions from news and opinions from tweets together, linking the event descriptions and reflection from a cross-platform with the entity graphs, finally fused into a joint summary of events. Although these frameworks recognized that fused multi-channel social signals will improve the accuracy of event detection and the diversity of event description, the entities (words representing person, location, organization, etc.) network-based event fusion methods are unable to deeply fuse the event description in a semantic way. Moreover, it is very necessary to integrate the traffic domain knowledge when applying the frameworks above to intelligent transportation systems.
To our knowledge, this is the first attempt to transfer the traffic event detection task from single-channel social signals to multi-channel social signals. The main challenge is how to sense, process, analyze, link and fuse several types of signals in social transportation. To address these problems, it is necessary to integrate natural language processing, information retrieval and machine learning methods with transportation domain knowledge in the proposed architecture.
In this paper, by fully utilizing the objectivity of news media and the immediacy of social media, a word2vec-based event fusion (WBEF) model is proposed for urban transportation event detection. The model extracts topics from multi-channel social signals, then semantically couples and fuses the topics into cross-platform descriptions of urban traffic events. Furthermore, we develop a cross-platform traffic event detection system integrating the above methods for real-world applications.
The system architecture is shown in Figure 1. We choose news articles and Weibo posts as our cross-platform data sources. The multi-channel social signals from cross-platform online data are sensed through a keyword-based social sensor network configured by domain experts. The sensed webpages are filtered and decomposed into structured data, then aggregated into data blocks assigned to city roads. Traffic event topics from news articles are extracted with a news LDA model, and traffic event topics from Weibo posts are extracted with a Weibo specific model named w-LDA model. Furthermore, topic words are transformed into semantic representation with word vectors, and then the cross-platform traffic events can be fused based on the topic distance matrix semantically. The expressions and descriptions of variables in the following algorithms or sub-models are summarized in Appendix A.

Social Sensors Network
Social signals can typically be sensed by two approaches. The first approach is to use the API services provided by a platform, such as Twitter or Weibo, and parse the returned XML or JSON files. The second approach is to deploy a web crawler that periodically monitors keywords, accounts and URL lists. In our system, we take both approaches according to the availability and limits of service providers. By consulting experts in traffic administrative agencies, the traffic keywords are grouped into four categories: traffic event keywords, urban identity keywords, road identity keywords and domain assistant identity keywords. Specifically, traffic event keywords are mainly used to describe three types of traffic events: traffic accidents, traffic jams and traffic suggestions. The identity keywords (Figure 2) describe the corresponding cities, roads, and domain assistants. All of the above traffic keywords are treated as search seeds that are fed into the social sensor network. The News sensor perceives the latest news data for one specific keyword from the News search engine. Correspondingly, the Weibo sensor perceives the latest Weibo data for one specific keyword from the Weibo search engine. The News and Weibo sensors were deployed to continuously monitor the data according to a keyword list, and a link between two sensors was built if both sensors acquired the same News or Weibo article. All the sensors and corresponding links eventually formed a social sensors network, which dynamically adapted to perceiving multi-channel social signals.
The keywords-based social sensors network consists of a crawler, page parser, duplicated URLs filter, etc. [45]. Particularly, to efficiently utilize the network bandwidth and computation resources, the social sensors network has a keywords priority adaptor, which dynamically sorts the keyword query priorities according to the value of importance ranking for each node in the keyword network.
The keywords-based social sensors network is visualized in Figure 3. The nodes denote traffic keywords. The size of each node represents the number of webpages collected for the node's keyword. The edges are constructed by calculating the co-occurrence of keyword pairs in the same document. All the search keywords are aggregated into network clusters; the main clusters include the roads cluster, the traffic incidents cluster and the traffic suggestion keywords cluster. Since the experiment is based on a Chinese corpus, here we only show the social sensors network with Chinese nodes and annotate the important information in English. The social sensors network can also be analyzed with social network analysis measures, such as betweenness, closeness and degree rankings.
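The construction of such a keyword network can be sketched in a few lines. The following is a minimal illustration only, assuming plain substring matching for keywords and weighted degree as the importance ranking; the function names, the toy documents and the road name "Road A" are hypothetical and not taken from the system described here.

```python
from collections import Counter
from itertools import combinations

def build_keyword_network(documents, keywords):
    """Return (node_sizes, edge_weights) of a keyword co-occurrence network."""
    node_sizes = Counter()    # keyword -> number of documents mentioning it
    edge_weights = Counter()  # (kw_a, kw_b) -> number of documents mentioning both
    for text in documents:
        # Assumption: a keyword "occurs" if it is a substring of the document.
        present = sorted(kw for kw in keywords if kw in text)
        node_sizes.update(present)
        edge_weights.update(combinations(present, 2))
    return node_sizes, edge_weights

def degree_ranking(edge_weights):
    """Rank keywords by weighted degree (sum of incident edge weights)."""
    degree = Counter()
    for (a, b), weight in edge_weights.items():
        degree[a] += weight
        degree[b] += weight
    return [kw for kw, _ in degree.most_common()]

# Toy data, made up for illustration.
docs = [
    "traffic jam on Road A after accident",
    "accident reported on Road A",
    "traffic jam near the tunnel",
]
keywords = {"traffic jam", "accident", "Road A", "tunnel"}
sizes, edges = build_keyword_network(docs, keywords)
ranking = degree_ranking(edges)
```

A ranking of this kind could serve as one possible input to a keyword priority adaptor; betweenness or closeness would require a graph library or a shortest-path routine and are omitted here.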

Data Preprocessing
Data preprocessing is a fundamental step for traffic event sensing from cross-platform media because raw data are unstructured and full of noise. The major preprocessing steps include meta-data extraction, noise filtering, word segmentation and data blocks aggregation:
Noise filtering: The URL links, paragraph marks, emoji, etc. in the texts are regarded as noise, which negatively influences the accuracy and efficiency of word segmentation, information processing and model training. To solve this problem, we define regular expressions to represent various noise patterns, and then use them to search for and delete noise in the text. Meanwhile, texts that are either too short or too long are removed.
Meta-data extraction: We deployed DOM wrapper and XPath parsers to extract the title, post time and other meta-data such as authors, review number and repost number from the sensed web page cache. The meta-data are then stored in databases.
Word segmentation: The space is a natural word delimiter in English texts. However, there is no such equivalent in Chinese, so word segmentation is needed for Chinese NLP tasks. The Language Technology Platform (LTP) [46] was deployed to segment words, remove punctuation and stop words, and tag road entities with a customized dictionary. The customized dictionary for LTP includes all the keywords we used for the social sensors.
Data blocks aggregation: The sensed multi-channel social signals are aggregated into data blocks, which are defined as a dataset containing cross-platform online data related to every urban road. With the LTP's entity recognition tools, the road entity of every News article or Weibo post is extracted, then the cross-platform data are aggregated into data blocks corresponding to the road entity. Moreover, when news articles or Weibo posts contain multiple road entities they will be assigned to each corresponding data block separately. Each data block was fed into the event detection and fusion models iteratively.
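The noise filtering and data block aggregation steps above can be sketched as follows. This is an illustrative sketch under stated assumptions: the regular expressions, the character-length thresholds and the function names are hypothetical, and road entities are matched by plain substring search as a stand-in for LTP entity recognition.

```python
import re
from collections import defaultdict

# Hypothetical noise patterns; the exact expressions used in the system are not given.
NOISE_PATTERNS = [
    re.compile(r"https?://\S+"),                          # URL links
    re.compile(r"[\r\n\t]+"),                             # paragraph marks and tabs
    re.compile(r"[\U0001F300-\U0001FAFF\u2600-\u27BF]"),  # common emoji ranges
]

def clean_text(text):
    """Delete noise patterns and collapse the remaining whitespace."""
    for pattern in NOISE_PATTERNS:
        text = pattern.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()

def aggregate_blocks(texts, roads, min_len=5, max_len=2000):
    """Group cleaned texts into per-road data blocks.

    Assumptions: length is measured in characters, and a road entity is
    detected by substring match instead of a trained entity recognizer.
    A text that mentions several roads is added to every matching block.
    """
    blocks = defaultdict(list)
    for raw in texts:
        text = clean_text(raw)
        if not (min_len <= len(text) <= max_len):
            continue  # drop texts that are too short or too long
        for road in roads:
            if road in text:
                blocks[road].append(text)
    return dict(blocks)
```

For example, a post that mentions two roads ends up in both road blocks, which mirrors the rule that multi-road texts are assigned to each corresponding data block separately.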

Word2vec Based Event Fusion Model
News articles usually describe a traffic event with relatively standard language, while Twitter and Weibo may have posts expressing opinions/comments/discussions on the same traffic event [47,48]. Matching news articles and Twitter/Weibo posts on the same transportation topic can give a more comprehensive description of a traffic event. Consequently, we propose to use LDA-based models to extract topics in each respective source channels and link them together with the WBEF model.

Transportation Events Detection from News and Weibo
As described in the literature review section, LDA and its variants have been widely applied for event detection. To group transportation topics and find topic words from News, we use the standard LDA model for event detection. For Weibo, we detect traffic events with a w-LDA model, which is described in detail as follows.
w-LDA model: The w-LDA model is based on the USER scheme, which achieves good performance in Twitter classification [49]. The process is as follows:
a) Combine all training messages generated by the same user into training user profiles;
b) Train the w-LDA model with the training user profiles;
c) Aggregate all testing messages generated by the same user into testing user profiles;
d) Use the trained w-LDA model to infer a topic mixture for each testing user profile.
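Steps a) and c) of the USER scheme amount to concatenating the tokens of all messages written by the same user. A minimal sketch, assuming messages arrive as `(user_id, tokens)` pairs (a hypothetical input format chosen for illustration):

```python
from collections import defaultdict

def build_user_profiles(messages):
    """Concatenate the tokens of all messages written by the same user.

    messages: iterable of (user_id, tokens) pairs, where tokens is a list
    of already segmented words.
    Returns a dict mapping user_id to the aggregated token list.
    """
    profiles = defaultdict(list)
    for user_id, tokens in messages:
        profiles[user_id].extend(tokens)
    return dict(profiles)
```

The same function can be applied to the training and testing messages separately, so that no testing message leaks into a training profile.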
The aggregated user profiles can be viewed as a random mixture distribution over latent topics, where each topic is characterized by a distribution over words. Both distributions are assumed to have a sparse Dirichlet prior. Suppose the corpus consists of $T$ Weibo posts and $U$ users that are aggregated into $P$ user profiles, and each user profile $p$ contains $N_p$ words. The total number of topics is denoted as $K$, and the number of unique words in the vocabulary is denoted as $V$.
There are five latent variables and one observable variable. The latent variable $\alpha$ is a $K$-dimensional vector giving uniform prior weight to all topics in a user profile $p$; the latent variable $\beta$ is a $V$-dimensional vector giving uniform prior weight to all words in a topic $k$; the latent variable $z_i$ is the topic of the $i$-th word in user profile $p$, and the observable variable $w_i$ is the specific word; the latent variable $\phi_z$ is a $V$-dimensional vector representing the word distribution of topic $z$; and the latent variable $\vartheta_p$ is a $K$-dimensional vector representing the topic distribution of user profile $p$. The variables $z_i$ and $w_i$ are drawn from multinomial distributions. The generative process can be seen in Algorithm 1. The w-LDA model focuses on finding the topics of each user profile, and we use collapsed Gibbs sampling for inference, where the goal is to approximate the distribution $P(z_i = j \mid Z_{-i}, w_i, p_i)$:

$$P(z_i = j \mid Z_{-i}, w_i, p_i) \propto \frac{n^{(w_i)}_{-i,j} + \beta}{\sum_{v=1}^{V} n^{(v)}_{-i,j} + V\beta} \cdot \frac{n^{(p_i)}_{-i,j} + \alpha}{\sum_{j'=1}^{K} n^{(p_i)}_{-i,j'} + K\alpha}$$

where $P(z_i = j \mid Z_{-i}, w_i, p_i)$ denotes the probability that word $w_i$ is assigned to topic $j$, $Z_{-i}$ represents the topic assignments of all other words, $w_i$ represents the $i$-th word, and $p_i$ represents the user profile containing the word $w_i$. Then $n^{(w_i)}_{-i,j}$ is the number of times word $w_i$ is assigned to topic $j$ excluding the current word, and $\sum_{v=1}^{V} n^{(v)}_{-i,j}$ is the total number of words assigned to topic $j$. $n^{(p_i)}_{-i,j}$ is the number of times topic $j$ is assigned to words in user profile $p_i$ excluding the current word, and $\sum_{j'=1}^{K} n^{(p_i)}_{-i,j'}$ is the total number of words in user profile $p_i$. $\phi_z$ represents the predictive distribution of words in topic $z$, and $\vartheta_p$ represents the predictive distribution of topics in user profile $p$. For any obtained sample we can estimate $\phi_z^{(j)}$ and $\vartheta_p^{(j)}$ as

$$\phi_z^{(j)} = \frac{n^{(w_i)}_{i,j} + \beta}{\sum_{v=1}^{V} n^{(v)}_{i,j} + V\beta}, \qquad \vartheta_p^{(j)} = \frac{n^{(p_i)}_{i,j} + \alpha}{\sum_{j'=1}^{K} n^{(p_i)}_{i,j'} + K\alpha}$$

where $n^{(w_i)}_{i,j}$ is the number of times that word $w_i$ has been assigned to topic $j$, and $n^{(p_i)}_{i,j}$ is the number of times topic $j$ has been assigned to words in user profile $p_i$.
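To make the sampling step concrete, the following is a compact sketch of collapsed Gibbs sampling for a standard LDA model applied to user profiles. It is not the authors' implementation: the symmetric scalar priors, the default hyperparameter values and the function name are assumptions, and words are assumed to be encoded as integer ids in `range(V)`.

```python
import random

def gibbs_lda(profiles, K, V, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA over user profiles (lists of word ids).

    Returns (phi, theta): phi[j][w] estimates the word distribution of
    topic j, theta[p][j] the topic distribution of profile p.
    """
    rng = random.Random(seed)
    n_wk = [[0] * K for _ in range(V)]   # word-topic counts
    n_pk = [[0] * K for _ in profiles]   # profile-topic counts
    n_k = [0] * K                        # number of tokens per topic
    z = []                               # topic assignment of every token
    for p, words in enumerate(profiles):
        z_p = []
        for w in words:
            k = rng.randrange(K)         # random initial assignment
            z_p.append(k)
            n_wk[w][k] += 1
            n_pk[p][k] += 1
            n_k[k] += 1
        z.append(z_p)

    for _ in range(iters):
        for p, words in enumerate(profiles):
            for i, w in enumerate(words):
                k = z[p][i]
                # Exclude the current token from all counts (the "-i" terms).
                n_wk[w][k] -= 1
                n_pk[p][k] -= 1
                n_k[k] -= 1
                # The profile-side denominator is the same for every topic,
                # so it is dropped from the unnormalized weights.
                weights = [
                    (n_wk[w][j] + beta) / (n_k[j] + V * beta) * (n_pk[p][j] + alpha)
                    for j in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                z[p][i] = k
                n_wk[w][k] += 1
                n_pk[p][k] += 1
                n_k[k] += 1

    phi = [[(n_wk[w][j] + beta) / (n_k[j] + V * beta) for w in range(V)]
           for j in range(K)]
    theta = [[(n_pk[p][j] + alpha) / (len(words) + K * alpha) for j in range(K)]
             for p, words in enumerate(profiles)]
    return phi, theta
```

Each row of `phi` and `theta` sums to one by construction, which is a quick sanity check when adapting the sketch.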

Transportation Events Representation
After detecting topics from News articles and Weibo posts separately, we obtain bags of words as event descriptions. The widely used one-hot vector representation of each topic word cannot be used to calculate the semantic similarity of topic words from different platforms efficiently. Therefore, we trained transportation word embeddings to represent topic words and calculated the semantic similarity between topic pairs. Finally, we linked and fused the topic pairs into event descriptions. Traditionally, in natural language processing each word is represented as a one-hot vector, which is 1 at the position associated with the word and 0 at all other positions. Clearly, the one-hot representation cannot capture any information about the semantic similarity between words. Moreover, the one-hot vector is high-dimensional and sparse. Recently, word embedding has been proposed to represent each word in a continuous vector space and encode many semantic patterns [50,51]. Word embedding was first presented by Bengio et al. [52], and implemented by Mikolov et al. in word2vec [53]. Since then, word2vec has gained popularity in natural language processing [54], question answering systems [55], information retrieval [56], recommender systems [57,58], sentiment analysis [59], etc.
The basic idea of word2vec is to combine a word and its contextual information together, and encode them into a low-dimensional vector. Words with similar contexts in the corpus are located in close proximity to each other in the representation space. Word embedding can be trained either by the Continuous Bag of Words (CBOW) model or the Skip-gram model. Both of them are neural networks which map word(s) to the target variable which is also a word(s), and the learned weights are word embedding representations. Specifically, the CBOW model is learning to predict the word by the context words, in contrast the skip-gram model is learning to predict the context words from the current word. The simplified structure of the two models is shown in Figure 4. Herein, we choose the CBOW model for training transportation word embedding.
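The difference between the two training objectives is easiest to see in the training examples they consume. The sketch below only generates those examples from a token list and does not train a network; the function names and the default window are illustrative assumptions.

```python
def cbow_pairs(tokens, window=2):
    """Return (context_words, target_word) examples, as used by CBOW."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip one-word sentences, which have no context
            pairs.append((context, target))
    return pairs

def skipgram_pairs(tokens, window=2):
    """Return (current_word, context_word) examples, as used by Skip-gram."""
    return [(target, c)
            for context, target in cbow_pairs(tokens, window)
            for c in context]
```

CBOW predicts one target from the whole context list, whereas Skip-gram turns the same window into several single-word predictions.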
However, computing probabilities in the softmax layer is the most resource-consuming phase when training CBOW models, since it requires summing over all words in a large vocabulary. Therefore, we use the negative sampling method [60] to approximate the softmax layer in the CBOW model, which moves the embedding toward the neighbor words and away from the noise words. In this paper, the noise words are sampled from the vocabulary according to the unigram distribution raised to the 3/4 power.
The formulas of the above distances are listed in Table 1. Furthermore, a topic distance matrix can be defined whose entry $(i, j)$ measures the distance between $T_w^{(i)}$ and $T_n^{(j)}$.
• Topic alignment:
Topic alignment aligns the topics detected from different platforms. The topics from different platforms are latent variables, so their labels, such as traffic jam or traffic accident, are unknown. Hence, we need to align the topic clusters detected from the News platform and the Weibo platform into topic pairs, which will support the multi-view descriptions in the fusion step.
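One simple way to realize a topic distance matrix and the alignment step is sketched below. It rests on assumptions that go beyond the text: each topic is summarized by the centroid of its word vectors, cosine distance is used, and every Weibo topic is paired with its nearest News topic. All function names and the toy embeddings are hypothetical.

```python
import math

def centroid(words, embeddings):
    """Mean vector of the topic words that have an embedding.

    Assumes at least one word of the topic is in the embedding table.
    """
    vectors = [embeddings[w] for w in words if w in embeddings]
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

def cosine_distance(u, v):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def topic_distance_matrix(weibo_topics, news_topics, embeddings):
    """Entry [i][j] is the distance between Weibo topic i and News topic j."""
    return [[cosine_distance(centroid(tw, embeddings), centroid(tn, embeddings))
             for tn in news_topics]
            for tw in weibo_topics]

def align_topics(distance):
    """Pair every Weibo topic index i with its closest News topic index j."""
    return [(i, min(range(len(row)), key=row.__getitem__))
            for i, row in enumerate(distance)]
```

Nearest-neighbor pairing can map two Weibo topics to the same News topic; a one-to-one assignment would need an extra constraint that is not sketched here.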

From bottom up, we choose a WE
Then, the topic T . Finally, the index pair (i, j) of aligned topics was returned.

• Events fusion:
After aligning the cross-platform traffic topics for each urban road, we can fuse the aligned topic pairs into a unified event description.
First, the anomalous words were removed. We scan the words list WE will be regard as an anomalous word: Second, the topic words from cross-platform were fused. If the shortest distance falls inside the region of one standard deviation σ * R T , the current word will be replaced with candidate WE (n,j) n for a more objective and formal event description. The detailed algorithm for event fusion is shown in Algorithm 2.

Algorithm 2. Transportation Events Alignment and Fusion
Input: T_w, T_n, WE_weibo, WE_news
Output: AlignedTopicsList, FusionEventMatrix
// Step 1: align the topic clusters in T_w and T_n, return AlignedTopicsList
        NewsAlignedIndex = j
    End for // find the closest topic cluster
    Append (WeiboIndex, NewsAlignedIndex) to AlignedTopicsList
End for // couple the closest topic clusters into pairs
// Step 2: fuse the words in the cross-paired topics, return FusionEventMatrix
For each index pair in AlignedTopicsList
    i = WeiboIndex; j = NewsAlignedIndex
    // calculate the lower and upper boundaries for news words

Data Description
The proposed methodology was applied to detect and fuse urban traffic events in Qingdao (a coastal city of China) from cross-platform social signals. First, we used a social sensors network with 337 traffic keywords to collect data related to Qingdao transportation from News and Weibo.
After obtaining the raw webpages, we removed the News articles whose content length was greater or less than that of 90% of the articles, and also removed the Weibo posts with fewer than 5 words. Meanwhile, considering that there are many social bots and online spammers on Weibo [61][62][63][64], we only retained the authors who published fewer than 10 articles in one day.
Next, we removed the punctuation, paragraph symbols, and noise patterns in the Weibo posts and news articles. Then we segmented texts into words, removed stop words, and tagged the words representing city roads with the LTP toolkit. After consulting the local transportation agency, we annotated 132 main roads in Qingdao, and aggregated the preprocessed texts into road data blocks with the criteria we mentioned in Section 3.
Finally, the multi-channel Qingdao transportation dataset from 1 August 2015 to 4 August 2017 was built. The dataset has about 1.15 million texts in total, including 301,684 News articles and 839,587 Weibo posts. The dataset was divided into a training dataset, a testing dataset and a case study dataset, as shown in Table 2. In the following sections, we use the training dataset to learn the WBEF model parameters, the testing dataset to evaluate the model's performance, and the case study dataset to discuss the practical effect of the proposed model in an open scenario.

WBEF Model Verification
In this section, we first describe the implementation details of WBEF model, show the performance of each sub-model, evaluate the entire performance of the overall WBEF model, and compare the WBEF model with the baseline model.

Transportation Word Embedding
We utilized the CBOW algorithm to train transportation word embedding (200 dimensions). The window size is 5, which indicates the maximum distance between the current and predicted word within a sentence. The number of training epochs over the dataset is 10. The negative sampling algorithm is used to approximate the parameters' gradient, and the number of noise words drawn for current word is 5.
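The noise-word sampling used by negative sampling can be illustrated with the standard library alone. The sketch below builds the unigram distribution raised to the 3/4 power and draws five noise words for a target; the function names and the rule of excluding the target word from the candidates are assumptions made for this example.

```python
import random
from collections import Counter

def noise_distribution(tokens, power=0.75):
    """Unigram counts raised to `power`, normalized to a probability distribution."""
    counts = Counter(tokens)
    weights = {w: c ** power for w, c in counts.items()}
    total = sum(weights.values())
    return {w: wt / total for w, wt in weights.items()}

def sample_noise_words(dist, target, k=5, rng=None):
    """Draw k noise words (with replacement), never returning the target itself."""
    rng = rng or random.Random(0)
    candidates = [w for w in dist if w != target]
    probs = [dist[w] for w in candidates]
    return rng.choices(candidates, weights=probs, k=k)
```

Raising counts to the 3/4 power flattens the distribution: a word that makes up 16 of 18 tokens receives a sampling probability of 0.8 rather than about 0.89, so rare words are drawn as noise somewhat more often.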
To evaluate transportation word embedding, we selected the most representative words in transportation events (i.e., traffic congestion, traffic accident), while searching for the most similar word embedding semantically. Table 3 gives top five most similar words for traffic congestion and traffic accident. The results show that our transportation word vectors contain essential semantic information.

Transportation Event Detection
During the experiments, we tried different K values (K = 1-5) for aggregating topic clusters, and K = 3 gave the most meaningful word clusters for every topic. We used the perplexity value to evaluate the LDA and w-LDA models, respectively. Since there were 132 data blocks representing the multi-channel social signals sensed from the main roads in Qingdao, the LDA and w-LDA models were deployed on every data block for topic generation. Hence, the LDA and w-LDA model parameters were determined by the average perplexity of the topic model over multiple data blocks. As shown in Figure 5, the average perplexity of w-LDA converged to 167 after 347 training iterations, and the average perplexity of LDA converged to 279 after 182 training iterations. Considering model generalization across different roads in Qingdao, we chose the topic model whose perplexity value was closest to the average perplexity. In order to validate the LDA and w-LDA topic models, we selected the three largest data blocks in the testing dataset to extract topics for the corresponding roads. To show the results intuitively, we list the words in the largest cluster of $T_w$ and the words in the aligned cluster of $T_n$ for each road (Table 4).
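For reference, perplexity can be computed from the estimated topic-word and document-topic distributions as the exponential of the negative average log-likelihood per token. The sketch below uses this standard definition, which is assumed rather than quoted from the text; `phi[k][w]` and `theta[d][k]` follow the notation of the topic model section and words are integer ids.

```python
import math

def perplexity(docs, phi, theta):
    """exp(-average log-likelihood per token) under a fitted topic model.

    docs: list of documents, each a list of word ids.
    phi[k][w]: probability of word w in topic k.
    theta[d][k]: probability of topic k in document d.
    """
    log_likelihood = 0.0
    n_tokens = 0
    for d, words in enumerate(docs):
        for w in words:
            p_w = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            log_likelihood += math.log(p_w)
            n_tokens += 1
    return math.exp(-log_likelihood / n_tokens)
```

A model that assigns uniform probability over a vocabulary of V words has perplexity V, so lower values indicate a better fit.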

Distance Metrics for Transportation Events Fusion
We tested different distance metrics for fusing traffic topics, and chose the proper distance metric for semantic event fusion. The top 10 largest road data blocks in the testing dataset were selected, and the topics were extracted from each data block with the LDA and w-LDA models, respectively. The words in the largest topic cluster T  Table 5 show that the most stable distance measure is the cosine distance for word embedding presentation.

Overall Performance
To evaluate the overall performance of the proposed model, we selected the 10 roads in Qingdao with the most frequently occurring traffic events, checked the content of the News and Weibo articles in every road data block, and manually annotated the traffic events in the testing dataset, including traffic accidents, traffic jams and traffic complaints. After consulting a traffic domain expert from the Qingdao Transportation Committee, the annotation results, which consisted of 27 traffic accidents, 47 traffic jams, and 13 traffic complaints in the testing dataset, were finally confirmed.
Moreover, the dataset was built with keywords-based social sensors, so it contains non-traffic News and Weibo articles that also include traffic keywords, such as traffic product releases, traffic safety lectures, and civilized traffic initiatives. Hence, if such events are widespread in social media or news, the model may falsely flag them as traffic events. Therefore, precision, recall and F1 score were adopted to evaluate the model performance. The proposed model was applied to detect and fuse traffic events on the 10 roads during one week. Meanwhile, we selected the "standard LDA + keywords matching" approach as the baseline model. Experimental results are given in Figure 6 and Table 6. Although the precision of our proposed model (91.4%) is lower than that of the baseline model (92.6%), its recall (85.1%) and F1 score (88.1%) are much better than those of the baseline. Overall, the WBEF model surpassed the baseline model. The experimental results show that the baseline model "standard LDA + keywords matching" is unable to effectively process short content in Weibo, and also lacks semantic meaning when the different word styles from cross-platform media are fused, so fewer traffic events are sensed. However, event words that occur in both News articles and Weibo posts imply that the traffic events have been confirmed by both officials and the public, which leads to the higher precision of the baseline model. Compared to the baseline model, the WBEF model groups the short messages into user profiles, then processes and clusters the user-centered context through the w-LDA scheme, so traffic topics in Weibo can be detected more effectively.
Meanwhile, with the WBEF model, the detected cross-platform topics are embedded with semantics, which enables topic similarity calculation and event fusion. As a result, more traffic events can be detected, giving a much higher recall and effectively addressing the missed detections of the baseline model. However, the noisy information in the dataset collected by the keyword-based social sensors also introduced bias in the processing and fusion steps, which led to slightly lower precision.
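The role of embedded topics in similarity calculation and fusion can be illustrated with a minimal sketch. It is not the WBEF implementation: it assumes, purely for illustration, that a topic is represented by the average of its word vectors, that similarity is measured by cosine similarity, and that two topics are fused when the similarity reaches a threshold of 0.8. The function names, the threshold, and the 3-dimensional toy vectors are all hypothetical.

```python
import math


def mean_vector(words, embeddings):
    """Average the embedding vectors of the words that have an embedding."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        raise ValueError("no word in the topic has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_fuse(topic_a, topic_b, embeddings, threshold=0.8):
    """Return (fuse?, similarity) for two topics given as word lists.

    The threshold is an assumed value, not one taken from the paper.
    """
    sim = cosine_similarity(
        mean_vector(topic_a, embeddings), mean_vector(topic_b, embeddings)
    )
    return sim >= threshold, sim


# Toy 3-dimensional embeddings, invented for illustration only.
toy_embeddings = {
    "traffic_jam": [0.9, 0.1, 0.0],
    "congestion": [0.8, 0.2, 0.1],
    "slow_down": [0.7, 0.3, 0.0],
    "concert": [0.0, 0.1, 0.9],
    "singer": [0.1, 0.0, 0.8],
}

news_topic = ["traffic_jam", "slow_down"]   # e.g., a topic found in News
weibo_topic = ["congestion", "slow_down"]   # e.g., a topic found in Weibo
other_topic = ["concert", "singer"]         # an unrelated topic

fuse_same, sim_same = should_fuse(news_topic, weibo_topic, toy_embeddings)
fuse_diff, sim_diff = should_fuse(news_topic, other_topic, toy_embeddings)
print(fuse_same, round(sim_same, 3))
print(fuse_diff, round(sim_diff, 3))
```

In this toy setting the News and Weibo topics share only one exact word, yet their averaged vectors are nearly parallel, so they would be fused. Exact keyword matching, as in the baseline, would rely on the single shared word alone.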
Quantitatively, the F1 score of the WBEF model exceeded that of the baseline model by nearly 17 percentage points, which indicates that the proposed model is substantially better than the baseline model at sensing and detecting traffic events from multi-channel social signals. In the next section, we select one case to qualitatively study and discuss the traffic event detection performance of the WBEF approach in an open scenario.

Case Study in Application Scenario
In this section, we chose the traffic situation in Qingdao on 4 August 2017 as a case, sensed multi-channel social transportation signals, and deployed the WBEF model in the open scenario. As shown in Table 2, the case study dataset contains about one thousand articles. The WBEF model successfully identified 11 traffic events, missed two traffic events and raised one false alarm, achieving 91.6% precision, 84.6% recall and an 87.9% F1 score.
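As a quick check of how these scores follow from the event counts, the sketch below recomputes precision, recall and F1 from the counts stated above (11 correctly detected events, 1 false alarm, 2 missed events). The function name `detection_metrics` is our own and is not part of the WBEF code.

```python
def detection_metrics(true_positives, false_positives, false_negatives):
    """Compute precision, recall and F1 score from raw detection counts."""
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the case study: 11 detected, 1 false alarm, 2 missed.
precision, recall, f1 = detection_metrics(
    true_positives=11, false_positives=1, false_negatives=2
)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

The computed values are roughly 0.917, 0.846 and 0.880, which match the reported 91.6%, 84.6% and 87.9% up to rounding in the last digit.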
All the traffic events were mapped to the city roads, as shown in Figure 7, so that the overall urban transportation situation can be visualized, which intuitively supports traffic management and traffic planning decisions. Meanwhile, to qualitatively investigate the practical usability and effectiveness of the proposed model, we chose the top 3 hottest events (largest clusters) that occurred on different roads, and fed the corresponding road block data into the baseline and WBEF models. Although both models successfully detected the top 3 hottest events, the baseline model generated fewer words carrying less semantic information. In contrast, the WBEF model produced more comprehensible word descriptions of the transportation events. Furthermore, we investigated the event details and the corresponding causal factors. To present the event words clearly, we manually categorized and tagged them as target road words, relevant location words, traffic words, reason words, and other words (shown in Table 7). We also double-checked the News and Weibo data related to each event to gain insight into the event details. The three target roads with detected traffic events are detailed in the following.
Gold Beach Road (金沙滩路): on this road, the event fusion words included traffic words "parking place (停车场)", "traffic jam (交通拥堵)", "slow down (慢行)", "traffic broadcast (交通广播)", social activity words like "beer festival (啤酒节)", "opening ceremony (开幕式)", and celebrity words like "Xiaoming Huang (黄晓明)". After referring to the Weibo and News data associated with this event, we found that the annual Qingdao International Beer Festival was opening on the square of Qingdao Beer City, where the film star Xiaoming Huang and other celebrities performed at the celebration. In addition, the ceremony opened during the evening rush hour (7 p.m.), which caused a heavy traffic jam on Gold Beach Road.
Tong An Road (同安路): with the proposed method, we observed the location word "GuoXin stadium (国信体育场)" and the traffic-irrelevant words "Mayday (五月天, a band)" and "concert (演唱会)" in the event description. However, there were no words about traffic jams or accidents, only traffic words like "detour (绕行)", "control (管制)", "dispatch (调流)", "plan (预案)", etc. We further checked the relevant Weibo and News data and found that a concert would be held on the next day, so the traffic agency was providing early alerts about the traffic situation on the road, releasing the traffic control notice through social media and news. Large numbers of fans and concert-goers forwarded and disseminated these posts.
Yan An San Road (延安三路): the event words included traffic words like "traffic jam (交通拥堵)" and "evacuation (疏散)", and also fire emergency words like "fires (火情)" and "fireman (火警)". Combined with the location word "Petroleum Building (石油大厦)", we inferred that there was a traffic jam on Yan An San Road caused by a fire emergency. Further checking confirmed that the actual situation was consistent with the results detected from cross-platform media. The elevator in the Petroleum Building caught fire at 7 p.m. Many citizens posted live updates on the fire and traffic situation, issued warnings, and sent safety messages to their families. Official sources also reported the event through news and social media.

Conclusions and Future Work
In intelligent transportation systems, analyzing social signals (social transportation) offers the advantages of low cost and large coverage over traditional methods that depend on physical sensors. In this paper, we addressed the challenges of cross-platform traffic event detection that arise when shifting social signals from a single channel to multiple channels. We proposed the WBEF model for urban transportation event detection, which benefits from comprehensive social signals, domain knowledge and semantic representation. The model was trained with about 1.15 million News and Weibo records from the past 2 years, and deployed to assess the traffic situation in Qingdao. Experiments show that the sub-models of WBEF achieved the expected performance, and that the overall performance of WBEF is substantially better than that of the baseline model. Moreover, the case study in the open scenario further demonstrated the accuracy and robustness of WBEF.
Further investigation can be conducted in several directions. Firstly, more social signal sources, such as Instagram, Facebook, and Quora, can be incorporated into the WBEF model, which would make traffic incident detection more accurate and comprehensive. Secondly, powerful deep learning methods have high potential to improve the accuracy and robustness of cross-platform event detection. Thirdly, the WBEF model can be extended to heterogeneous recommender systems [65,66], which would provide more personalized and accurate information services in the transportation domain. Furthermore, combining social sensors with physical sensors (Cyber-Physical-Social Systems, CPSS) would offer a novel way to monitor, control, and optimize intelligent transportation systems.