A Survey of Methods and Technologies for Congestion Estimation Based on Multisource Data Fusion

Cvetek, Dominik; Muštra, Mario; Jelušić, Niko; Tišljarić, Leo

doi:10.3390/app11052306

Faculty of Transport and Traffic Sciences, University of Zagreb, HR-10000 Zagreb, Croatia

* Author to whom correspondence should be addressed.

Appl. Sci. 2021, 11(5), 2306; https://doi.org/10.3390/app11052306

Submission received: 8 February 2021 / Revised: 28 February 2021 / Accepted: 2 March 2021 / Published: 5 March 2021

(This article belongs to the Special Issue Transport Mobility in a Changing Word: Survey Applications, Sensors and Automatic Data Collection Methods)

Abstract

Traffic congestion occurs when traffic demand is greater than the available network capacity. It is characterized by lower vehicle speeds, increased travel times, arrival unreliability, and longer vehicular queueing. Congestion also has a negative impact on society, decreasing the quality of life through increased pollution, especially in urban areas. To mitigate the congestion problem, traffic engineers and scientists need high-quality, comprehensive, and accurate data to estimate the state of traffic flow. Data collection technologies differ in their advantages and disadvantages as well as in data characteristics such as accuracy, sampling frequency, and geospatial coverage. Multisource data fusion increases the accuracy and provides a comprehensive estimation of the performance of traffic flow on a road network. This paper presents an overview of the literature on congestion estimation and prediction based on data collected from multiple sources. An overview of data fusion methods and congestion indicators used in the literature for traffic state and congestion estimation is given. Results of these methods are analyzed, and a comparative analysis of the advantages and disadvantages of the surveyed methods is presented.

Keywords: traffic congestion; multisource data fusion; traffic flow modeling; congestion estimation; traffic state estimation

1. Introduction

Congestion on the road network is, by one definition, a reduced quality of service caused by a network carrying more vehicles than it can handle. Congestion estimation is an important task that helps to mitigate air pollution by tackling the problem of lower vehicle speeds and increased time delays. If the congestion intensity in the network is known, appropriate optimization can be applied to reduce the negative side effects of traffic.

Estimation of traffic congestion can be significantly improved by combining data from various sensors and other sources. In the literature, this process is called data fusion (DF). DF of multisource traffic data can be defined as the process of combining data or information to estimate or predict traffic flow conditions more accurately. According to Qing [1] and Dailey et al. [2], multisource DF is a technique in which data from multiple sensors are combined synergistically to provide comprehensive and accurate information; in this context, comprehensive information means improved, less expensive, and higher-quality information. DF is therefore a group of techniques by which information from multiple sources is combined to make the data more usable and to better describe the traffic scenario. A general analysis of the traffic state tries to identify the traffic flow conditions under which the road would operate at an optimal level and defines congestion as the difference between the actual (observed) and the optimal conditions. To determine a realistic and accurate state of the traffic flow and to predict travel time, traffic engineers need comprehensive spatiotemporal traffic data. Since every traffic data collection technology has its disadvantages, multisource data collection is often used to compensate for the disadvantages and harness the advantages of the individual technologies.

In that regard, the main contributions of this review are:

A systematic literature review through a keyword-based search of academic research databases and a systematic identification of highly relevant papers, according to the impact made on scientific community, from the search results.
Coverage of traffic congestion estimation studies in urban networks based on the data fusion from multiple sensors.
Analysis of the objectives of congestion estimation and data fusion, e.g., improving efficiency or accuracy, together with a comparison of the different data fusion methods according to their performance.

In this paper, an overview of the most used multisource data collection technologies is given, and methods for DF and congestion indicators for congestion or traffic flow estimation are presented. The aim is to fill a research gap in the review of methods and technologies for congestion estimation based on multisource data fusion. The main question to be answered is: which data collection technologies were used for data fusion, and which data fusion methods are most often used, according to their performance? Section 2 gives the background of the research presented in this paper. In Section 3, a short overview of commonly used quantitative congestion indicators is given. Section 4 gives an overview of commonly used traffic data collection technologies for DF and the advantages and disadvantages of each technology. In Section 5, different DF methods and techniques used for traffic congestion estimation are presented. In Section 6, a discussion of the recently proposed methods is given, and, finally, Section 7 brings concluding remarks.

2. Background

2.1. Review Methodology

As mentioned in the introduction, we used a keyword-based search and selected work published mostly during the past twenty years, from 2000 to 2020. We searched the most used scientific research databases that cover the field relevant to this review: Scopus, IEEE Xplore, Web of Science (WoS), and Google Scholar. Relevant articles were identified and filtered using appropriately selected keywords (traffic congestion; multisource data fusion; data fusion; traffic flow modeling; congestion estimation; traffic state estimation). Some older sources are also mentioned to provide information on previously developed methods. The main part of this research includes 30 papers, which are thoroughly discussed, and their methods and results are summarized in the following sections.

2.2. Congestion Definition

Traffic congestion occurs when the volume of traffic exceeds the available network capacity [3,4]. There is no complete consensus on the definition of congestion, nor on how it should be estimated and represented to cover every situation, because contexts differ; capacity, which is the key factor in defining and estimating congestion, is subject to randomness and is difficult to detect and measure exactly in the field. However, congestion is characterized by lower vehicle speeds, increased time delays in planning trips, arrival unreliability, and vehicular queueing longer than in free-flow conditions [3]. Congestion is non-linear: as a road approaches its maximum capacity, small changes in traffic volume can cause disproportionately larger delays. Congestion can also be recurrent or non-recurrent. Recurrent congestion occurs on a daily, weekly, or annual basis and is caused by recurring bottlenecks, which arise when traffic demand is greater than the capacity of network elements, i.e., road segments and intersections. Non-recurrent congestion is unexpected and usually unpredictable, or predictable only to a small degree, and is caused by traffic incidents, vehicle breakdowns, work zones, weather, and special events [5]. Which of these two main forms is of interest depends on whether the goal of the research is to find a pattern or to estimate the frequency and predictability of congestion. However, even recurrent congestion can show a large degree of randomness, especially in its duration and severity [3].
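
The non-linear relation between volume and delay described above can be illustrated with a volume-delay function. The sketch below assumes the widely used Bureau of Public Roads (BPR) form with its commonly quoted parameters alpha = 0.15 and beta = 4; this function is not taken from the surveyed papers, and the link values are hypothetical.

```python
def bpr_travel_time(free_flow_time, volume, capacity, alpha=0.15, beta=4.0):
    """Link travel time under the BPR volume-delay function.

    Assumption for illustration only: t = t0 * (1 + alpha * (v / c) ** beta).
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return free_flow_time * (1.0 + alpha * (volume / capacity) ** beta)


if __name__ == "__main__":
    t0 = 10.0          # hypothetical free-flow travel time (minutes)
    capacity = 1000.0  # hypothetical link capacity (veh/h)
    for volume in (500, 600, 1000, 1100):
        print(volume, round(bpr_travel_time(t0, volume, capacity), 3))
```

With these values, the same 100 veh/h increase adds far more delay near capacity than at half capacity, which is the behavior the paragraph describes.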

2.3. Congestion Problem and Impact on Society

Many traffic experts and companies worldwide deal with congestion issues and try to find solutions to mitigate this problem. Congestion is a major issue in most cities around the world and significantly affects citizen mobility. For example, in 2019, according to INRIX Research, the five most congested cities were (1) Moscow, (2) Istanbul, (3) Bogota, (4) Mexico City, and (5) São Paulo [6]; three of these five also appear in HERE’s ranking of the five most congested cities [7]. INRIX Research calculated the congestion impact rank based on a city’s population and the delay attributable to congestion. Ranking by hours lost in congestion shows that 8 of the top 10 congested cities globally are in Europe. The reason probably lies in old downtown areas with narrow, low-capacity streets, some of which can be traced back to the Roman period [6]. According to TomTom’s ranking, the five most congested cities worldwide are (1) Mexico City, (2) Bangkok, (3) Jakarta, (4) Chongqing, and (5) Bucharest. TomTom calculated the congestion level as the increase in overall travel times compared to the free-flow, or uncongested, situation. In the top five cities, citizens spend approximately 50% more time traveling in congested conditions than in free-flow conditions [8]. Travel speeds, congestion level, and time loss correlate positively with a city’s population and population density [6]. However, the authors in [9] warn that the INRIX and TomTom congestion indices can be pessimistic: the indices are based on speed data collected from subscribers, who tend to drive in congestion more than the average driver, and so exaggerate the congestion level experienced by the average driver. Large cities usually generate a larger GDP and have the means to develop adequate infrastructure. The authors in [10] concluded that cities with higher GDP generally build more infrastructure, which helps to reduce overall congestion.
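
The congestion level described above, i.e., the increase in travel time relative to free-flow conditions, can be written as a one-line calculation. The following is a minimal sketch under that reading; the function name and the sample travel times are hypothetical, and it is not the providers' actual methodology.

```python
def congestion_level(travel_time, free_flow_time):
    """Percentage increase in travel time over the free-flow travel time."""
    if free_flow_time <= 0:
        raise ValueError("free_flow_time must be positive")
    return 100.0 * (travel_time - free_flow_time) / free_flow_time


if __name__ == "__main__":
    # Hypothetical trip: 30 min in free flow, 45 min observed.
    print(congestion_level(45.0, 30.0))
```

A trip that takes 45 min instead of 30 min corresponds to a congestion level of 50%, the order of magnitude reported for the top five cities.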

Traffic congestion is not just a problem for commuters; it also has an impact on the economy [11] and the environment [12]. Hao et al. [13] established a framework for traffic-related evaluation of air pollution using mobile crowd-sourced data, such as cellular network data and Global Navigation Satellite System (GNSS) data. The economic impact combines direct costs, such as travel time delay and wasted fuel, with indirect ones, such as pollution from generated emissions, reduced life satisfaction, and driver stress [9]. INRIX Research estimates the congestion costs for the top 10 most congested cities to be around USD 46B in the USA, GBP 6.8B in the UK, and EUR 5.2B in Germany [6]. Congestion costs depend on the labor market, industrial sector, mode of transport, trip distance, and travel conditions. To assess these costs, public sector managers need accurate operational models. Woensel and Cruz [14] used stochastic queueing models to calculate congestion costs. Ali et al. [15] used three components to estimate congestion cost: opportunity costs, vehicle operating costs, and fuel consumption quantity. Opportunity costs are based on delay, the number of vehicles, the average vehicle occupancy for a specific mode of transport, and the value of time for a specific mode of transport, estimated using a socio-economic survey. Vehicle operating cost consists of fuel cost for a specific mode of transport and delay. Fuel consumption quantity includes the fuel price of specific fuel types (gasoline, diesel, compressed natural gas). The authors in [16] analyzed traffic congestion costs related to the road transport of passengers. They presented an analysis of two cost factors of road traffic congestion: productivity time loss and fuel energy consumption. The results indicate that personalized transport contributes significantly to traffic congestion and imposes significant losses on the economy.

To mitigate the congestion problem, traffic engineers use a variety of solutions, such as congestion pricing, encouraging multimodal and public transport, and developing Intelligent Transport Systems (ITS) applications [17,18]. Hensher [19] argues that switching to the sharing economy and relinquishing private car ownership, or switching to connected autonomous vehicles, can potentially reduce congestion. A different strategy to reduce congestion can be achieved by adopting teleworking [20], which includes various programs and activities that substitute physical travel with new telecommunication technologies. Teleworking is defined as work performed from a distance, typically online, which eliminates travel to work offices. This can be tested especially in times of COVID-19 [21], where a complete or partial lockdown causes a reduced number of trips to work offices. In addition, the authors in [22] suggest that moving consumers from physical stores to online stores can potentially alleviate traffic congestion by preventing consumers from driving to physical stores.

2.4. Data Fusion in ITS

Data fusion is applied in many areas: intelligent transport systems, bioinformatics, cheminformatics, geospatial information systems, oceanography, and wireless sensor networks. Several papers review DF in ITS. El Faouzi in [23] provided a survey of how DF is used in different areas of ITS, such as Advanced Traveler Information Systems, Automatic Incident Detection, Advanced Driver Assistance, Network Control, Crash Analysis and Prevention, Traffic Demand Estimation, Traffic Forecasting, and Monitoring. El Faouzi introduced two levels of DF. The first level involves fusion to provide only raw and uncorrelated data to the end user; its main methods are data association and positional estimation using the Kalman filter. The second level delivers meaningful information from raw data to guide human decision-making; its methods include pattern recognition using adaptive neural networks and clustering methods, and identity fusion using Bayesian decision theory and Dempster–Shafer evidential reasoning.

Mihaylova et al. in [24] divided the DF process into a six-level hierarchy based on DF goals. Level 0 encompasses source preprocessing to address estimation and compression of input data. Level 1 includes data analysis from all appropriate sources, e.g., point, point-to-point, and area-wide sensors. Level 2 includes a complete and timely assessment of the observed data and information from external sources (weather reports, seasonal traffic patterns, special events, construction schedules, and others) by incorporating relations among the entities of interest. Level 3 evaluates traffic flow patterns and external sources to assess the occurrence of an incident, travel time delays, and other events that influence traffic flow. Level 4 improves the effect of the fusion process through traffic planning and control. Level 5 focuses on issues related to human support in cognitive decision-making and action-taking based on the fused information.

El Faouzi et al. in [25] reviewed the state of practice and prospects for DF in the management of travel demand. They proposed system architecture requirements and several DF models, and gave a brief review of major relevant industry players in data provision, data aggregation, and delivery to end users. El Faouzi and Klein in [26] warn that some challenges remain in this field, including the need to obtain data with the accuracy necessary for dynamic and real-time ITS applications. They see potential in the development of methods that combine traffic sensors and human-generated data.

3. Quantitative Congestion Indicators

Traffic engineers use different approaches to present and describe the state of traffic flow and to estimate the degree of congestion. The common approach is to describe the state of traffic flow using traffic flow parameters and the fundamental diagram proposed in 1935 by Greenshields [27]. Another approach relies on Lighthill–Whitham–Richards (LWR) models, first introduced in 1955 by Lighthill and Whitham [28] and independently in 1956 by Richards [29]. The three-phase traffic theory, developed by B. Kerner between 1996 and 2002 [30,31,32], divides congested traffic into two distinct phases, synchronized flow and wide moving jam.

Some traffic flow parameters, or combinations of them, can be used as quantitative congestion indicators. The fundamental traffic parameters are flow rate q (veh/h), density ρ (veh/km), and speed v (km/h), related by the basic relationship q = ρ v [33]. Other parameters used to describe the state of traffic flow are given in Table 1 [1,27,34].
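
The basic relationship q = ρ v can be made concrete with a short numerical sketch. The linear speed-density relation used below, v = v_f (1 − ρ/ρ_j), is the classical Greenshields form and is an assumption added here for illustration; the free-flow speed and jam density values are hypothetical.

```python
def flow_rate(density, speed):
    """Fundamental relation q = rho * v (veh/h from veh/km and km/h)."""
    return density * speed


def greenshields_speed(density, free_flow_speed, jam_density):
    """Linear speed-density relation (assumed model, not prescribed by the survey)."""
    if not 0 <= density <= jam_density:
        raise ValueError("density must lie between 0 and jam_density")
    return free_flow_speed * (1.0 - density / jam_density)


if __name__ == "__main__":
    v_f, rho_j = 100.0, 120.0  # hypothetical free-flow speed (km/h), jam density (veh/km)
    for rho in (20.0, 60.0, 100.0):
        v = greenshields_speed(rho, v_f, rho_j)
        print(rho, v, flow_rate(rho, v))
```

Under this assumed model, flow peaks at half the jam density and falls on either side, which is why the same flow value can correspond to either a free-flow or a congested state.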

In [35], the authors state that the key performance indicators for urban roads are delay, density, and Level of Service (LOS). LOS is usually measured by speed, density, and volume/capacity ratio [36], and it presents a quality indicator of the road network service.

There are also hybrid measures, obtained by combining two or more measures. The authors in [37] combined flow, measured using loop detectors, and travel time, measured using GNSS probe taxi vehicles, to calculate the link density and determine the shape of the Macroscopic Fundamental Diagram (MFD).
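
One way to see how flow and travel time combine into density is through the relation q = ρ v: the probe travel time over a link of known length gives the speed, and dividing the loop-detector flow by that speed gives the density. The sketch below illustrates this reasoning only; it is not necessarily the procedure used in [37], and the function name and numbers are hypothetical.

```python
def link_density(flow_veh_per_h, link_length_km, travel_time_h):
    """Density (veh/km) from loop-detector flow and probe travel time.

    Uses rho = q / v with v = link length / travel time.
    """
    if travel_time_h <= 0 or link_length_km <= 0:
        raise ValueError("length and travel time must be positive")
    speed = link_length_km / travel_time_h
    return flow_veh_per_h / speed


if __name__ == "__main__":
    # Hypothetical link: 1 km long, traversed in 90 s, carrying 1200 veh/h.
    print(link_density(1200.0, 1.0, 90.0 / 3600.0))
```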

Other indicators of the state of traffic flow can be derived from GNSS probe data, such as Proportion Stopped Time (PST) and Acceleration Noise (AN). The authors in [38,39,40] used several indexes, such as the travel time index, space mean speed index, acceleration noise index, buffer index, and planning time index, to estimate congestion on a link or network. Indexes for estimating congestion on a transport link and network are also mentioned in the survey [35]. In Table 2, a short overview of congestion indexes is given per [41]. Hybrid indicators that combine two or more parameters can also be used to describe congestion.
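
Three of the indexes named above can be computed directly from a sample of travel times. The definitions in this sketch are commonly used ones and are assumed here, since the survey's Table 2 is not reproduced in the text: the travel time index is the mean travel time divided by the free-flow travel time, the planning time index is the 95th-percentile travel time divided by the free-flow travel time, and the buffer index is the difference between the 95th percentile and the mean, divided by the mean. The nearest-rank percentile and the sample values are also illustrative choices.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty sequence."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def reliability_indexes(travel_times, free_flow_time):
    """Travel time, planning time, and buffer indexes (assumed definitions)."""
    avg = mean(travel_times)
    p95 = percentile(travel_times, 95)
    return {
        "travel_time_index": avg / free_flow_time,
        "planning_time_index": p95 / free_flow_time,
        "buffer_index": (p95 - avg) / avg,
    }


if __name__ == "__main__":
    # Hypothetical link travel times in minutes; free-flow time is 10 min.
    print(reliability_indexes([10, 12, 14, 16, 18], 10))
```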

4. Data Collection Technologies

According to [42], traffic data collection technologies can be classified into three groups based on functionality: point sensors, point-to-point sensors, and area-wide sensors. Here, the word sensor refers to a traffic flow sensor, i.e., a device or system that can collect traffic flow data. Point sensors include various technologies, such as inductive loops, piezoelectric sensors, video image sensors, radars, infrared sensors, acoustic sensors, pneumatic road tubes, and magnetic sensors. These sensors are usually limited in spatial coverage and are used for measuring traffic volume, speed, occupancy, and other traffic flow parameters [43,44,45].

Point-to-point sensors detect vehicles at multiple locations throughout the network, a technique often called automated vehicle identification (AVI). The major technologies used for point-to-point detection are Bluetooth, Wi-Fi, RFID, and Automatic License Plate Recognition (ALPR). These technologies are suitable for computing travel times, route choice fractions, paths, and origin-destination (O-D) flows [46,47,48].
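
The way point-to-point detection yields travel times can be sketched as a simple matching of identifiers seen at two stations. The data layout below (a mapping from an anonymized identifier to a detection timestamp in seconds) and all values are hypothetical; real AVI systems additionally need outlier filtering, which is omitted here.

```python
def avi_travel_times(upstream, downstream):
    """Match identifiers detected at two stations and return travel times (s).

    upstream, downstream: dict mapping an anonymized ID to a timestamp in seconds.
    IDs seen at only one station, or seen downstream first, are skipped.
    """
    travel_times = {}
    for vehicle_id, t_up in upstream.items():
        t_down = downstream.get(vehicle_id)
        if t_down is not None and t_down > t_up:
            travel_times[vehicle_id] = t_down - t_up
    return travel_times


if __name__ == "__main__":
    up = {"a": 0, "b": 10, "c": 20}       # hypothetical detections at station 1
    down = {"a": 120, "c": 15, "d": 50}   # hypothetical detections at station 2
    print(avi_travel_times(up, down))
```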

Some technologies do not necessarily belong to only one group and can be used as either point or point-to-point sensors. There are several papers where researchers use inductive loops for vehicle reidentification and travel time estimation [49,50,51]. Cameras and video- and image-processing are used for collecting traffic data as point and point-to-point sensors [52,53,54,55].

Area-wide sensors include data collection technologies that allow tracking of vehicles over a large area. The most promising are Floating Car Data (FCD) and Cellular Floating Car Data (CFCD). FCD, also known as GNSS probe data, are generated by smartphones or vehicles equipped with GNSS receivers. These vehicles are mostly part of a fleet, e.g., a taxi service, a public transport service, or a company fleet. FCD and CFCD are used for calculating a wide range of useful traffic parameters, such as space mean speed, travel time, O-D matrices [56,57,58,59], and queue length [60,61,62]. Crowdsourced data, a term often used in the literature, are data gathered from social media [63]. Posts from social media (e.g., Facebook, Twitter, WeChat, Sina Weibo) must be geotagged before they can be used to detect traffic events such as accidents and jams. Such posts usually have low reliability, since users can generate arbitrary content at arbitrary places and times and may provide mixed or inaccurate information. Still, if properly handled, they can be easily combined with conventional information sources. Technologies that collect data from connected vehicles [64] and from airborne and satellite imagery also belong to area-wide sensors [65,66,67].
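
As an example of a parameter derived from FCD, the space mean speed over a link can be computed as the total distance traveled by the probe vehicles divided by their total travel time. The sketch below assumes each probe traversal is available as a (distance, travel time) pair; that input layout and the values are hypothetical.

```python
def space_mean_speed(traversals):
    """Space mean speed (km/h) from probe traversals.

    traversals: iterable of (distance_km, travel_time_h) pairs.
    """
    traversals = list(traversals)
    total_distance = sum(distance for distance, _ in traversals)
    total_time = sum(time for _, time in traversals)
    if total_time <= 0:
        raise ValueError("total travel time must be positive")
    return total_distance / total_time


if __name__ == "__main__":
    # Two hypothetical probes on a 1 km link, at 50 km/h and 20 km/h.
    print(space_mean_speed([(1.0, 0.02), (1.0, 0.05)]))
```

For the two probes in the example, the space mean speed is about 28.6 km/h, lower than the arithmetic mean of the two speeds (35 km/h), because slower vehicles spend more time on the link.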

Based on mounting location, fixed sensors can be classified into two categories: intrusive and non-intrusive sensors [68]. Intrusive sensors, which include inductive loops, magnetic sensors, and piezoelectric sensors, are installed on or in the pavement. Non-intrusive sensors are placed above or next to the road, e.g., on poles or consoles. The main advantages of intrusive over non-intrusive sensors are high accuracy in detecting vehicles and the negligible impact of weather conditions. The main advantages of non-intrusive over intrusive sensors are faster and cheaper installation, often without interruption of traffic flow, and, with most of these sensors, multiple flexible detection zones.

Every technology has its advantages and disadvantages, and several characteristics can affect its effectiveness and suitability for a given purpose: for example, the ability to collect the traffic flow parameters of interest, accuracy, reliability, price, installation cost, and privacy. Table 3 shows the main advantages and disadvantages of certain sensors.

Traffic information from only one data source is insufficient for real-time traffic congestion estimation in a large city [69]. For many ITS applications, the information provided by individual sensors is incomplete, inaccurate, and/or unreliable [70]. Using multiple sensors for data collection helps to overcome the disadvantages of individual sensor technologies and allows for a better assessment of traffic states. Furthermore, multisource data can be useful for modeling people’s travel trajectories and inferring potential mobility patterns [70].

According to [71], the DF model can include people, web bots, or data fused at lower levels as data sources in addition to the sensors themselves. Table 4 summarizes the literature review with the aim to present the data sources used in DF for congestion estimation needs.

Wang et al. in [71] combined social media data and GNSS probe data to better understand urban traffic congestion. As an extension of this work, the authors in [69] model traffic congestion using GNSS probe data and social media data (Twitter). To model traffic congestion more accurately, they also extracted rich auxiliary information such as social events, physical features of roads, point-of-interest features, and weather information. The discovered traffic co-congestion patterns are then used to detect anomalies and to better estimate traffic conditions in a large arterial network. Kong et al. [72] developed a real-time fusion-based system using GNSS probe taxi vehicle data and loop data from the Sydney Coordinated Adaptive Traffic System (SCATS). In the review [73], the authors divided multisource data into traditional research data (survey data, bank notes, call detail records) and popular urban data (GNSS data, public transport data, social media check-in data) used for modeling and predicting human mobility patterns. Zhu et al. in [74] used three different data sources: GNSS data obtained from buses, inductive loop data, and mobile phone network data, while automatic number plate recognition data were used as ground truth. For traffic speed prediction in [75], data were obtained from inductive loop devices in combination with weather data. Recent research [64] proposed a combined method for estimating traffic conditions quickly and accurately by using connected vehicles together with stationary detectors. Croce et al. [76] proposed a more accurate visualization of traffic conditions using specific elements of Transport System models, namely zones and graphs. The procedure included DF of GNSS and traditional survey data. In [77], the authors develop a framework for traffic state estimation using data collected from point detectors (loops, cameras, and radar) and probe data (GNSS and Bluetooth).
The framework can filter, fuse, and process data from various sources in real time, providing a reliable Advanced Traveler Information Service (ATIS). The authors in [78] developed a new convex optimization framework for the route flow estimation problem using a fusion approach for loop detectors and cellular signal traces.

Most of the authors used GNSS probe data and inductive loop data for DF. The inductive loop is a well-known technology, and when it is combined with GNSS probe data, the two sources can compensate for each other’s shortcomings. Most authors use GNSS probe data because of their large spatial coverage and potential for real-time traffic state estimation.

5. Representation of Methods Used in Data Fusion

The challenges of DF are imperfection, inconsistency, conflict, alignment and correlation, and heterogeneity of data types. Researchers address these challenges to improve efficiency and quality in traffic estimation, classification, and prediction tasks. In [69] and [71], the authors proposed a coupled matrix and tensor factorization scheme named TCE_R to combine multisource data. A method called search tree-based pattern mining is proposed to efficiently discover which road segments, geographically close to each other, are likely to experience traffic congestion at the same time. Recursive Kalman filter-based approaches provide a solution for traffic state estimation and DF. However, when data cannot be straightforwardly aligned over space and time, the equations become computationally expensive. Therefore, the authors in [86] propose three alternative DF approaches that solve this problem and are tailored to fuse different traffic sensor data. These approaches, PISCIT, FlowResTD, and the extended generalized Treiber–Helbing filter (EGTF), can fuse multiple data sources, as long as it is possible to estimate, for each source, the traffic conditions (congested or free flow) under which the data were collected [86].
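
To show what a recursive Kalman filter-based fusion step looks like in its simplest form, the following sketch performs the scalar measurement update for a link speed estimate. It is a textbook one-dimensional update, not the formulation of any surveyed paper, and the speeds and variances are hypothetical.

```python
def kalman_update(estimate, variance, measurement, measurement_variance):
    """Scalar Kalman measurement update.

    Returns the updated estimate and its variance after one measurement.
    """
    gain = variance / (variance + measurement_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance


if __name__ == "__main__":
    # Hypothetical: loop-based speed estimate of 50 km/h (variance 4),
    # then a probe-vehicle speed measurement of 44 km/h (variance 4).
    speed, var = kalman_update(50.0, 4.0, 44.0, 4.0)
    print(speed, var)
```

Because both sources are given the same variance in the example, the fused speed lies midway between them and its variance is halved; a noisier source would receive a smaller gain.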

Wang et al. [71] used GPS probes and traffic-related information collected from social media to estimate urban traffic congestion more accurately. Additionally, they extract auxiliary information, including road congestion correlations, social events, road features, and points of interest. The results are evaluated on a real arterial network and show the effectiveness and efficiency of the proposed method. Okawa et al. [87] use the Deep Mixture Point process for event prediction in urban areas. This approach can exploit high-dimensional, multisource data to benefit from the rich urban context. Salanova [77] proposes a framework for data collection, filtering, and fusion to obtain real-time traffic estimation and short-term travel time prediction. The DF framework uses the Data Expansion Algorithm explained in [88], where the authors merged estimated traffic flows and used the proposed link-based volume-delay function to assign travel times and average speed values to links. Wu et al. [78] proposed a fusion of cellular network and traffic sensor data for route flow estimation. These two very different types of data are highly uncorrelated because cellular network data cannot be exactly mapped onto the road network. To overcome this lack of spatial information, caused by a rather sparse network of cell towers, the authors introduced cellpaths, defined as the trajectory a vehicle follows between two cell towers. Route flow prediction is finally achieved through a convex optimization formulation using a map, cellpath flow, link flow, and O-D flow data. The difficulty in this case is dividing the cellular network into cells, which was achieved using so-called Voronoi cells: the area is partitioned into cells according to the coverage of the cell tower located in the center of each cell.
The authors proposed different models for the combined use of cellular and loop-based data, but they were limited by the actual availability of real cellular data, which is restricted for privacy reasons. This probably remains the largest problem in similar approaches because different countries have different restrictions on sharing even anonymized cellular data. Kong et al. [89] propose an online information fusion approach for urban traffic state estimation. The approach consists of three algorithmic parts: the evidential fusion, the processing of loop detector data, and the processing of GPS probe vehicle data. In the evidential fusion model, loop detector data and GPS probe vehicle data are integrated so that the traffic states can be estimated more comprehensively and more accurately than when using only one of them. The proposed approach balances well the requirements of accuracy and real-time performance. Toole et al. in [90] presented a flexible, modular, and computationally efficient software system. The system estimates multiple aspects of travel demand using Call Detail Records (CDRs) from mobile phones in conjunction with open- and crowd-sourced geospatial data, census records, and surveys. The authors used algorithms to construct O-D matrices and to route trips through a road network. Additionally, they presented an online, interactive visualization platform to communicate these results to researchers, policymakers, and the public. The flexibility of the system was tested on multiple cities around the globe.
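
Evidential fusion of loop detector and GPS probe evidence, as mentioned above for [89], builds on Dempster–Shafer theory. The following minimal sketch shows Dempster's rule of combination for two sources over a two-state frame (free flow, congested). It is a generic illustration, not the algorithm of [89]; the state names and mass values are hypothetical.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    m1, m2: dict mapping a frozenset of hypotheses to its mass (masses sum to 1).
    """
    combined = {}
    conflict = 0.0
    for (set_a, mass_a), (set_b, mass_b) in product(m1.items(), m2.items()):
        intersection = set_a & set_b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


if __name__ == "__main__":
    FREE, CONG = frozenset({"free"}), frozenset({"congested"})
    EITHER = FREE | CONG  # ignorance: either state is possible
    loop = {CONG: 0.6, FREE: 0.1, EITHER: 0.3}  # hypothetical loop evidence
    gps = {CONG: 0.7, FREE: 0.2, EITHER: 0.1}   # hypothetical probe evidence
    print(dempster_combine(loop, gps))
```

In the example, both sources lean toward congestion, so the combined belief in the congested state (about 0.85) is higher than either source gives on its own.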

5.1. Statistical Methods

Wang et al. [79] reported a study on how to use social media as an auxiliary data source and combine it with GNSS probe data to improve the estimation of traffic congestion. The authors extensively collected tweets that report various traffic events such as congestion, accidents, and road construction. Next, they proposed an extended Coupled Hidden Markov Model, which can effectively integrate GNSS probe readings and traffic-related tweets to estimate the traffic conditions of an arterial network more accurately. The experimental results demonstrated the superior performance of the model in comparison with previous methods. Zhu et al. [74] used three different data sources, namely bus-based GPS data, inductive loop detector data, and mobile phone network data, which are combined using three different DF techniques. The hybrid method outperforms the weighted mean approach and artificial neural networks in fusing multiple data sources and produces more accurate travel times. The results indicate that fusing multiple data sources does not necessarily enhance the accuracy of travel time estimation, which depends on the reliability of the individual data sources. Fusing highly correlated data sources can lead to a worse result. The results also show that even in dense urban areas, GPS data, when combined with inductive loop detector data, can provide reasonable travel time estimates of the general traffic stream under different traffic states. Zheng et al. [84] used a slightly different approach and combined data from social media platforms with GPS data collected from taxis. The authors collected data from both sources for 19 days in 2014. The social media data were collected from Sina Weibo, the largest micro-blog platform in China, and were filtered through a list of keywords related to transport in general.
The first step was to match the GPS data to the road network and detect paths with anomalous travel times with respect to historical measurements. The authors then fused the GPS and social media data by selecting the messages that mention the names of nearby roads and landmarks. Because social media messages do not contain an exact geolocation, the authors used a rectangular search area of one square kilometer. The statistical analysis showed that a non-recurring traffic incident is followed by a large number of messages, which makes them a good tool for the detection and analysis of unexpected events in the traffic flow. Li et al. [85] proposed an extended generalized filter algorithm for urban expressway traffic state estimation. To estimate the traffic state, they used multiple data sources: fixed sensor data (inductive loops or radar) and GNSS probe vehicle data. Patire et al. [80] proposed a hybrid data framework that combines GNSS data with loop detector data for real-time travel time estimation and concluded that fused data gave better results. They also concluded that fusing a relatively small amount of probe data might improve travel time estimation more than doubling the number of loop detectors. Sohn et al. [91] introduced DF to classify the severity of road traffic accidents. The authors described various fusion methods (Dempster–Shafer, Bayesian, and logistic methods) and ensemble algorithms (bagging and arcing) to improve classification accuracy and discrimination power. Fusion algorithms, among which the Bayesian procedure showed the best results, display better discrimination power than a single classifier, and the Dempster–Shafer algorithm appears to improve classification accuracy. The results indicate that a clustering-based classification algorithm works best for road traffic accident classification in Korea. 
Jiang et al. [81] modified the extended generalized Treiber–Helbing filter (EGTF) to fuse GNSS data from probe vehicles with traditional traffic data from loop detectors and thereby obtain more accurate estimates of traffic states (speed, travel time) and emissions on urban expressways. Choi et al. [92] proposed an algorithm for fusing travel times from loop detector data and GPS probe vehicle data. The algorithm and procedure involve a voting technique, fuzzy regression, and a Bayesian pooling method. The results showed that the fused travel time is more accurate, reliable, and realistic than the travel time obtained with the pure arithmetic mean. To estimate the MFD more accurately, the authors of [37,82] fused loop detector data and GNSS-FCD. In [23], the authors used a simulation of an abstract grid network to validate results. Seven multi-sensor DF-based estimation techniques were investigated in [70]. All methods were implemented and compared in terms of their ability to fuse data from loop detectors and probe vehicles (Bluetooth and GNSS) to accurately estimate freeway traffic speeds. The results show that most DF techniques improve accuracy over single-sensor approaches and that the improvement depends on the technique, the number of probe vehicles, and the traffic conditions. Mil and Piantanakulchai [93] used spurious data and traffic conditions to estimate the travel time. They used a Bayesian DF approach combined with a Gaussian mixture model to fuse travel time data originating from different types of sensors, with the aim of improving accuracy, precision, and completeness in terms of spatial and temporal distribution. Two elements were added to the model: the differences in traffic conditions, classified using the Gaussian mixture model, and the bias of each individual sensor, estimated by introducing a non-zero-mean Gaussian distribution learned from the training dataset. 
To prove the concept, the authors used measured data as input to a simulator with multiple types of simulated sensors: loop detectors, GPS, and virtual trip lines. Travel time was modeled using Gaussian mixture models with two or three components, depending on the traffic flow states characteristic of a given road. The authors reported an improvement in accuracy of at least 16.3% in terms of mean absolute percentage error over the baseline model, with the standard deviation reduced by up to 6.03%.
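The Bayesian fusion with sensor bias described for [93] can be illustrated in a strongly simplified form. The sketch below assumes a single Gaussian per sensor instead of a Gaussian mixture, independent sensor errors, and known biases; the function names `fuse_gaussian` and `mape` and all numeric values are hypothetical and are not taken from the cited study.

```python
import math


def fuse_gaussian(estimates):
    """Precision-weighted fusion of independent Gaussian estimates.

    `estimates` is a list of (mean, std, bias) tuples, one per sensor.
    The known bias of each sensor is subtracted before fusing.
    Returns (fused_mean, fused_std).
    """
    precision_sum = 0.0
    weighted_sum = 0.0
    for mean, std, bias in estimates:
        precision = 1.0 / (std * std)  # more reliable sensors weigh more
        precision_sum += precision
        weighted_sum += precision * (mean - bias)
    return weighted_sum / precision_sum, math.sqrt(1.0 / precision_sum)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical link travel times in seconds: (mean, std, bias).
sensors = [
    (130.0, 10.0, 5.0),   # loop detector, assumed to overestimate by 5 s
    (118.0, 20.0, -2.0),  # GPS probes, assumed to underestimate by 2 s
]
fused_mean, fused_std = fuse_gaussian(sensors)
```

The fused standard deviation is smaller than that of either sensor, which is the effect the surveyed Bayesian methods exploit; the benefit shrinks when the independence assumption does not hold.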

5.2. DNN Methods

Essien et al. [75] investigated the influence of weather conditions on traffic speed in urban conditions. To this end, the authors used the Long Short-Term Memory Neural Network (LSTM-NN). In contrast to classical artificial neural networks, an LSTM-NN can “forget” or store information over a longer period of time. This feature made the LSTM-NN models superior to SVM, the Kalman filter, and ARIMA in predicting speed. The authors used data obtained from inductive loops together with weather data (rainfall and temperature). The test scenario was an urban arterial road in Greater Manchester, and the best prediction results in terms of minimal absolute error were obtained by the model that combined weather data and inductive loop data. This method outperformed ARIMA by a couple of orders of magnitude. Chou et al. [94] proposed a deep learning-based framework, called Deep Ensemble Stacked Long Short-Term Memory (DE-SLSTM), that integrates road network, weather, and traffic data to predict long-term traffic time. Because traffic time is difficult to predict during congestion, they adopted a “cost sensitive” mechanism in the proposed framework to improve prediction accuracy during rush hours. The proposed framework fits the ground truth better than Google Maps and demonstrates good performance. Rodrigues et al. [95] used deep learning architectures to combine text information with time-series data. This approach was applied to the problem of taxi demand forecasting in New York. By fusing two complementary cross-modal sources of information, the authors showed that the proposed models can significantly reduce the forecasting error. Using textual information is rather difficult because it must first be converted into a form that the neural network can process. 
The authors did this by converting words into sorted integers, so that closely related words are closer in the number space. Using standard performance measures, the authors empirically demonstrated that fusing these two very different data sources leads to significant reductions in forecasting error.
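The ability of an LSTM to “forget” or store information, as noted for the models in [75] and [94], comes from its gates. The sketch below shows one time step of a scalar LSTM cell using the standard gate equations; it is an illustration only, and the weights are hand-picked (not trained) to make the gating behavior visible.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell.

    `params` maps each gate name to a tuple (w_x, w_h, b).
    Returns the new hidden state h and cell state c.
    """
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much old memory to keep
    i = gate("input", sigmoid)         # how much new information to write
    o = gate("output", sigmoid)        # how much memory to expose
    g = gate("candidate", math.tanh)   # proposed new memory content
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hand-picked weights: forget gate open, input gate closed (memory is kept).
keep_params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 10.0),
    "candidate": (1.0, 0.0, 0.0),
}
# Hand-picked weights: forget gate closed, input gate open (memory is replaced).
reset_params = dict(keep_params, forget=(0.0, 0.0, -10.0), input=(0.0, 0.0, 10.0))

_, c_keep = lstm_step(x=0.5, h_prev=0.0, c_prev=0.8, params=keep_params)
_, c_reset = lstm_step(x=0.5, h_prev=0.0, c_prev=0.8, params=reset_params)
```

In the first case the cell state stays close to its previous value of 0.8; in the second it is overwritten by the candidate value computed from the new input. In a trained network these gate values are learned from data rather than set by hand.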

In the image processing domain, one of the most common approaches is the fusion of multiple images or of multiple extracted features. Li et al. [96] preprocessed original images of pavement cracks using different values of the standard deviation for the Gaussian blur method. Pavement cracks are detected in every preprocessed image, and the results are then fused into the final image. CNNs are often used after fusion for classification or feature extraction. Ke et al. [97] applied a CNN to traffic camera images to automatically extract features for estimating occupancy and traffic flow. The results are then fused with images that represent density and velocity estimations to achieve the final goal of traffic state estimation. Hu et al. [98] trained a CNN with multiple features extracted from images of drivers in the car. The result was a driving behavior recognition algorithm able to detect dangerous actions of the driver. Guan et al. [99] used a CNN to fuse visible-spectrum and multi-spectral images. The CNN was trained to detect pedestrians in images captured under different illumination conditions.
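The multi-scale scheme described for [96] (blur with several standard deviations, detect at each scale, then fuse) can be sketched on a one-dimensional intensity profile. This is an illustrative simplification and not the cited method: the 1D signal, the fixed-threshold detector, and the majority-vote fusion rule are all assumptions, since the text above does not specify how the per-scale results are fused.

```python
import math


def gaussian_blur_1d(signal, sigma):
    """Smooth a 1D signal with a truncated, normalized Gaussian kernel."""
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    total = sum(kernel)
    kernel = [w / total for w in kernel]
    n = len(signal)
    blurred = []
    for i in range(n):
        acc = 0.0
        for offset, weight in zip(offsets, kernel):
            j = min(max(i + offset, 0), n - 1)  # replicate edge values
            acc += weight * signal[j]
        blurred.append(acc)
    return blurred


def detect_dark(signal, threshold):
    """Mark samples darker than the threshold as crack candidates."""
    return [value < threshold for value in signal]


def fuse_detections(masks):
    """Keep a sample only if a majority of the scales detected it."""
    needed = len(masks) // 2 + 1
    return [sum(votes) >= needed for votes in zip(*masks)]


# Synthetic profile: bright pavement, a 4-sample crack, one noisy sample.
profile = [200.0] * 20
for index in range(8, 12):
    profile[index] = 40.0
profile[3] = 40.0

masks = [detect_dark(gaussian_blur_1d(profile, s), 120.0) for s in (0.5, 1.0, 2.0)]
fused = fuse_detections(masks)
crack_samples = [i for i, flag in enumerate(fused) if flag]
```

In this synthetic case the isolated dark sample is detected only at the finest scale and is rejected by the vote, while the wider crack survives at all three scales.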

Some authors use a single dataset but fuse the results of multiple methods. In [100], the authors fuse the results of K-Nearest Neighbor (KNN) with the LSTM. KNN is used to capture spatial features and detect the most similar locations. Then, an LSTM is used to predict the traffic flow at the observed locations. In [101], the authors combine Support Vector Regression (SVR) and the LSTM to detect abnormal passenger flow on the observed urban rail transit network. The SVR is used to compute a steady passenger flow volume series, and the LSTM models the large fluctuations in the traffic flow. The results are then combined into one framework. In many papers, authors use artificial intelligence, advanced filters (Kalman filter, Treiber–Helbing filter), or matrix factorization to fuse the data or the decisions based on the collected data. DF methods are presented in Table 5, which shows that two approaches dominate, namely statistical and deep learning methods. Statistical methods were applied throughout the period from 2002 to 2020, whereas deep learning methods have been applied since 2019, as expected given the recent popularity of deep learning.
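The KNN step described for [100], which selects the locations most similar to a target location before a temporal model is applied, can be sketched as follows. The Euclidean distance on flow histories, the site names, and the flow values are assumptions made for illustration; the temporal model itself (an LSTM in the cited work) is not reproduced here.

```python
import math


def k_nearest_sites(target, histories, k):
    """Return the ids of the k sites whose flow history is closest to `target`.

    `histories` maps a site id to a list of flow values aligned in time
    with `target`. Similarity is measured by Euclidean distance.
    """
    def distance(series):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(target, series)))

    ranked = sorted(histories, key=lambda site: distance(histories[site]))
    return ranked[:k]


# Hypothetical flow histories (vehicles per hour) at four detector sites.
histories = {
    "site_a": [410, 520, 630, 600],
    "site_b": [400, 505, 640, 610],
    "site_c": [120, 150, 140, 130],
    "site_d": [420, 515, 620, 590],
}
target = [405, 510, 635, 605]

neighbours = k_nearest_sites(target, histories, k=2)
# The selected histories would form the spatial input of the temporal model.
model_input = [histories[site] for site in neighbours]
```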

6. Discussion

The research reviewed here shows that multisource DF can increase the reliability and robustness of estimation, because one data collection technology can contribute traffic information where others are unavailable, unreliable, or ineffective. Multisource DF can also decrease costs, because several low-budget sensors, combined with the correct data processing, can achieve the same level of accuracy as a single more expensive sensor. For multisource DF, authors mostly used a combination of inductive loops, as point sensors, and GNSS data, as area-wide sensors. The most accurate estimation of the state of traffic flow can potentially be obtained by using at least one sensor from each group: point, point-to-point, and area-wide sensors. For example, point sensors (radars or inductive loops) can capture an accurate situation at the road segment or network link, point-to-point sensors (Bluetooth) can enrich the area-wide data, and area-wide sensors (GNSS data) can capture a more accurate situation in the traffic network.
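The cost argument above can be made concrete with a short worked equation. It holds only under simplifying assumptions that are introduced here for illustration and are not taken from the surveyed papers: the sensor errors are unbiased and mutually independent, and each sensor $i$ has a known error variance $\sigma_i^2$.

```latex
% Minimum-variance linear fusion of n independent, unbiased measurements x_i
\hat{x} = \frac{\sum_{i=1}^{n} x_i / \sigma_i^{2}}{\sum_{i=1}^{n} 1 / \sigma_i^{2}},
\qquad
\operatorname{Var}(\hat{x}) = \left( \sum_{i=1}^{n} \frac{1}{\sigma_i^{2}} \right)^{-1}

% Special case: n identical sensors with variance \sigma^2
\operatorname{Var}(\hat{x}) = \frac{\sigma^{2}}{n}
```

For example, four low-cost speed sensors with a standard deviation of 4 km/h each give a fused standard deviation of $\sqrt{16/4} = 2$ km/h, which matches a single sensor with a standard deviation of 2 km/h. If the sensor errors are correlated, the gain is smaller.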

From the survey of the selected papers, several potential research directions that address the limitations of the existing methods were identified. The two dominant approaches to congestion estimation in recent work are statistical and deep learning methods:

Statistical methods can provide insights into traffic flow conditions but fail when dealing with complex and highly nonlinear data. A statistical method is used either to offer insights into the relationships within the data and its structure or to create a model that can predict future traffic states. Statistical methods have solid and widely accepted mathematical foundations, which makes them more “understandable” than some deep learning methods.
Deep learning approaches, which are widely exploited, create models that are “intelligent” and use large amounts of data to gain useful insights into the traffic flow and detect various patterns. Deep learning is more flexible than statistics, but there is not always a mathematical explanation of why a certain approach works better. Two approaches to using deep neural network (DNN) methods can be observed: (i) image processing-related methods, which use convolutional neural networks, and (ii) time-series analysis, which uses the long short-term memory network. DNNs have been widely applied to various transportation problems, partly because they are very generic, accurate, and convenient mathematical models able to easily simulate numerical model components. They have mainly been used as a data analytic method because of their ability to work with massive amounts of multidimensional and multisource data. DNN methods are more flexible than statistical methods because the functional form is approximated via learning and not assumed a priori, as is the case with statistics [105]. On the other hand, DNN-based models can be computationally and memory expensive [106].

Comparing the efficiency and performance of these methods is very difficult, if not impossible, because different authors use completely different testing scenarios for different methods and approaches. All referenced methods claim rather high accuracy and good performance in the given environment, but it is not clear how they would perform under different conditions. To overcome this, it might be beneficial to test the methods on synthetic data containing various modalities (sensor data). Only such a test would yield a fair comparison of different approaches that aim to solve the same problem.
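The idea of a common synthetic benchmark can be sketched as follows. Everything in this example is an assumption made for illustration: the shape of the ground-truth speed profile, the Gaussian noise levels of the two simulated sensors, the fixed random seed, and the three compared estimators. It is not a proposal for the standardized dataset itself.

```python
import math
import random


def make_scenario(n_steps, seed):
    """Create a ground-truth speed profile and two noisy sensor streams."""
    rng = random.Random(seed)
    centre, width = n_steps / 2, n_steps / 10
    # Free-flow speed of 60 km/h with one congestion dip in the middle.
    truth = [60.0 - 30.0 * math.exp(-((t - centre) / width) ** 2)
             for t in range(n_steps)]
    loop = [v + rng.gauss(0.0, 5.0) for v in truth]   # simulated loop detector
    probe = [v + rng.gauss(0.0, 8.0) for v in truth]  # simulated probe vehicles
    return truth, loop, probe


def mean_abs_error(truth, estimate):
    return sum(abs(t - e) for t, e in zip(truth, estimate)) / len(truth)


def benchmark(methods, n_steps=2000, seed=0):
    """Evaluate every method on the same synthetic scenario."""
    truth, loop, probe = make_scenario(n_steps, seed)
    return {
        name: mean_abs_error(truth, [m(a, b) for a, b in zip(loop, probe)])
        for name, m in methods.items()
    }


# Inverse-variance weight of the loop detector (noise levels assumed known).
W_LOOP = (1 / 5.0 ** 2) / (1 / 5.0 ** 2 + 1 / 8.0 ** 2)

methods = {
    "loop": lambda loop_v, probe_v: loop_v,
    "probe": lambda loop_v, probe_v: probe_v,
    "fused": lambda loop_v, probe_v: W_LOOP * loop_v + (1 - W_LOOP) * probe_v,
}
results = benchmark(methods)
```

Because all methods see the same truth and the same sensor streams, their errors are directly comparable, which is the property that differing real-world test scenarios lack.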

To estimate performance indicators of the traffic flow and to determine congestion, authors mostly used travel time or flow speed. At the time of writing, no work had attempted to fuse the congestion indexes listed in Table 2 with the above-mentioned methods. It remains to be explored whether congestion indexes fused with multi-sensor technologies make it possible to estimate congestion and obtain realistic and more accurate information about the performance of the traffic flow. Multisource DF provides information about the state of traffic on network links at a higher resolution, which can then be incorporated into the overall picture of a traffic network. For example, point detectors (radar, video, inductive loops) can measure traffic flow in each traffic lane separately. The traffic volume can differ significantly from one traffic lane to another, and this is information that point-to-point or area-wide sensors cannot provide. This means that only a combination of sensors can provide such detailed, high-resolution information on the traffic flow performance on a network link. This detailed information can be used in advanced ITS applications, such as dynamic traffic management (DTM), advanced traveler information systems, adaptive intersection control, routing, and traffic information services.

7. Conclusions

Mobility is one of the essential human needs, and traffic congestion is a negative side effect of it. Traffic engineers use various approaches to solve or reduce this negative side effect. To achieve the best results, they need to accurately estimate the traffic flow performance or the level of congestion. One approach to accurate estimation of the traffic flow performance is to use data collected from multiple sensors and find a clever way to fuse these data. This review gives an overview of papers in which the authors tried to determine the traffic state or predict congestion using a multisource DF approach. Additionally, it presents the different DF methods and techniques used for traffic congestion estimation. The review aimed to provide a basic analysis of recently used approaches in data fusion and to give guidelines to researchers in this field by helping them determine which approaches are most likely to provide good results. The dominant data fusion approaches used from 2010 to 2020 are statistical methods, while DNNs dominate the most recent work. It can be concluded that estimation of the traffic flow should generally be done using data from multiple different sources to provide resistance to the biases that some data collection technologies can impose. Another beneficial tool in the development of prediction methods, such as those utilizing data mining, would be a standardized testing dataset built from various multisource data, which would provide actual numerical evidence of how successful an approach is, given the available data.

Author Contributions

The conceptualization of the study was done by D.C., M.M., L.T., and N.J., who also did the funding acquisition. The writing of the original draft and preparation of the paper were done by D.C. and M.M. All authors contributed to the review and final editing. The supervision was done by N.J. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been financially supported by the University of Zagreb and Faculty of Transport and Traffic Sciences.

Acknowledgments

Leo Tišljarić acknowledges the support of the European Regional Development Fund under the grant KK.01.1.1.01.0009 (DATACROSS).

Conflicts of Interest

The authors declare no conflict of interest. The funding institutions had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

Qing, O. Fusing Heterogeneous Traffic Data: Parsimonious Approaches using Data-Data Consistency; Netherlands TRAIL Research School: Delft, The Netherlands, 2011. [Google Scholar]
Dailey, D.J.; Harn, P.; Lin, P.-J. ITS Data Fusion; Washington State Transportation Center: Washington, DC, USA, 1996. [Google Scholar]
Falcocchio, J.C.; Levinson, H.S. Road Traffic Congestion: A Concise Guide; Springer: Cham, Switzerland, 2015. [Google Scholar]
Ge, Y.-E.; Prentkovskis, O.; Tang, C.; Saleh, W.; Bell, M.G.H.; Junevičius, R. SOLVING TRAFFIC CONGESTION FROM THE DEMAND SIDE. Promet TrafficTransp. 2015, 27, 529–538. [Google Scholar] [CrossRef]
Anbaroglu, B.; Cheng, T.; Heydecker, B. Non-Recurrent Traffic Congestion Detection on Heterogeneous Urban Road Networks. Transp. A Transp. Sci. 2015, 11, 1–33. [Google Scholar] [CrossRef]
Reed, T.; Kidd, J. INRIX Global Traffic Scorecard; INRIX Research: London, UK, 2019. [Google Scholar]
HERE. Traffic Dashboard. 2019. Available online: https://www.here.com/en/vision/innovation/traffic-dashboard/ (accessed on 6 March 2019).
TomTom. Congestion Index. 2019. Available online: https://www.tomtom.com/en_gb/trafficindex/ (accessed on 6 March 2019).
Transportation Cost and Benefit Analysis II—Congestion Costs; Victoria Transport Policy Institute: Victoria, BC, Canada, 2018.
Dingil, A.E.; Schweizer, J.; Rupi, F.; Stasiskiene, Z. Transport indicator analysis and comparison of 151 urban areas, based on open source data. Eur. Transp. Res. Rev. 2018, 10, 58. [Google Scholar] [CrossRef]
Michael Thomson, J. Reflections on the Economics of Traffic Congestion. J. Transp. Econ. Policy 1994, 93–112. [Google Scholar]
Chin, A.T. Containing air pollution and traffic congestion: Transport policy and the environment in Singapore. Atmos. Environ. 1996, 30, 787–801. [Google Scholar] [CrossRef]
Hao, P.; Wang, C.; Wu, G.; Boriboonsomsin, K.; Barth, M. Evaluating Enviromental Impact of Traffic Congestion Based on Sparse Mobile Crowd-sourced Data. In Proceedings of the IEEE Conference on Technologies for Sustainability, Phoenix, AZ, USA, 12–14 November 2017. [Google Scholar]
Van Woensel, T.T.; Cruz, F. A stochastic approach to traffic congestion costs. Comput. Oper. Res. 2009, 36, 1731–1739. [Google Scholar] [CrossRef]
Ali, M.S.; Adnan, M.; Noman, S.M.; Baqueri, S.F.A. Estimation of Traffic Congestion Cost-A Case Study of a Major Arterial in Karachi. Procedia Eng. 2014, 77, 37–44. [Google Scholar] [CrossRef]
Jayasooriya, S.; Bandara, Y. Measuring the Economic costs of traffic congestion. In Proceedings of the 2017 Moratuwa Engineering Research Conference (MERCon), Moratuwa, Sri Lanka, 29–31 May 2017; pp. 141–146. [Google Scholar]
Bertini, R.L. You Are the Traffic Jam: Examination of Congestion Measures. In Proceedings of the 85th Annual Meeting of the Transportation Research Board, Washington, DC, USA, 22–26 January 2006. [Google Scholar]
Tišljarić, L.; Carić, T.; Abramović, B.; Fratrović, T. Traffic State Estimation and Classification on Citywide Scale Using Speed Transition Matrices. Sustainability 2020, 12, 7278. [Google Scholar] [CrossRef]
Hensher, D.A. Tackling road congestion—What might it look like in the future under a collaborative and connected mobility model? Transp. Policy 2018, 66, A1–A8. [Google Scholar] [CrossRef]
Giovanis, E. The relationship between teleworking, traffic and air pollution. Atmospheric Pollut. Res. 2018, 9, 1–14. [Google Scholar] [CrossRef]
Tišljarić, L.; Cvetek, D.; Muštra, M.; Jelušić, N. Mixed Impact Of The Covid-19 Pandemic and The Earthquake on Traffic Flow in The Narrow City Center: A Case Study for Zagreb-Croatia. In Proceedings of the Science and Development of Transport (ZIRP), Zagreb, Croatia, 29–30 August 2020; pp. 293–300. [Google Scholar]
Shao, J.; Yang, H.; Xing, X.; Yang, L. E-commerce and traffic congestion: An economic and policy analysis. Transp. Res. Part B Methodol. 2016, 83, 91–103. [Google Scholar] [CrossRef]
El Faouzi, N.-E.; Leung, H.; Kurian, A. Data fusion in intelligent transportation systems: Progress and challenges—A survey. Inf. Fusion 2011, 12, 4–10. [Google Scholar] [CrossRef]
Mihaylova, L.; Faouzi, E.; Klein, L. Sensor and Data Fusion: Taxonomy, Challenges and Applications. In Handbook on Soft Computing for Video Surveillance; Chapman and Hall: Boca Raton, FL, USA, 2012. [Google Scholar]
Zegras, C.; Pereira, F.; Amey, A.; Veloso, M.; Liu, L.; Bento, C.; Biderman, A. Data Fusion for Travel Demand Management: State of the Practice and Prospects. In Proceedings of the TDM’08, Travel Demand Management Symposium, Arlington, VA, USA, 15–16 November 2008; pp. 1–17. [Google Scholar]
El Faouzi, N.-E.; Klein, L.A. Data Fusion for ITS: Techniques and Research Needs. Transp. Res. Procedia 2016, 15, 495–512. [Google Scholar] [CrossRef]
May, A.D. Traffic Flow Fundamentals; Prentice Hall: Upper Saddle River, NJ, USA, 1990. [Google Scholar]
Lighthill, M.J.; Whitham, G.B. On kinematic waves I. Flood movement in long rivers. Proc. R. Soc. London. Ser. A Math. Phys. Sci. 1955, 229, 281–316. [Google Scholar] [CrossRef]
Richards, P.I. Shock Waves on the Highway. Oper. Res. 1956, 4, 42–51. [Google Scholar] [CrossRef]
Kerner, B.S. Experimental Features of Self-Organization in Traffic Flow. Phys. Rev. Lett. 1998, 81, 3797–3800. [Google Scholar] [CrossRef]
Kerner, B.S. The Physics of Traffic. Underst. Complex Syst. 2004, 12, 25–30. [Google Scholar] [CrossRef]
Kerner, B.S. Congested Traffic Flow: Observations and Theory. Transp. Res. Rec. J. Transp. Res. Board 1999, 1678, 160–167. [Google Scholar] [CrossRef]
Knoop, V.L. Introduction to Traffic Flow Theory: Theory and Exercises; TuDelft—Delft University of Technology: Delft, The Netehrlands, 2018. [Google Scholar]
Geroliminis, N.; Daganzo, C.F. Macroscopic modeling of traffic in cities. In Proceedings of the TRB 86th Annual Meeting, Washington, DC, USA, 21–25 January 2007; pp. 7–413. [Google Scholar]
Mohan Rao, A.; Ramachandra Rao, K. Measuring Urban Traffic Congestion—A Review. Int. J. Traffic Transp. Eng. 2012, 2, 286–305. [Google Scholar] [CrossRef]
Highway Research Board. Highway Capacity Manual; Transportation Research Board: Washington, DC, USA, 2000. [Google Scholar]
Ji, Y.; Xu, M.; Li, J.; Van Zuylen, H.J. Determining the Macroscopic Fundamental Diagram from Mixed and Partial Traffic Data. Promet—TrafficTransp. 2018, 30, 267–279. [Google Scholar] [CrossRef]
Carli, R.; Dotoli, M.; Epicoco, N. Monitoring traffic congestion in urban areas through probe vehicles: A case study analysis. Internet Technol. Lett. 2018, 1, e5. [Google Scholar] [CrossRef]
Tahmasseby, S. Traffic Data: Bluetooth Sensors vs. Crowdsourcing—A Comparative Study to Calculate Travel Time Reliability in Calgary, Alberta, Canada. J. Traffic Transp. Eng. 2015, 3. [Google Scholar] [CrossRef][Green Version]
Stipancic, J.; Miranda-Moreno, L.; Labbe, A. Measuring Congestion Using Large-Scale Smartphone-Collected GPS Data in an Urban Road Network. In Proceedings of the Conference and Exhibition of the Transportation Association of Canada, Toronto, ON, Candada, 25–28 September 2016. [Google Scholar]
Toledo, C.A.M. Congestion Indicators and Congestion Impacts: A Study on the Relevance of Area-wide Indicators. Procedia—Soc. Behav. Sci. 2011, 16, 781–791. [Google Scholar] [CrossRef][Green Version]
Antoniou, C.; Balakrishna, R.; Koutsopoulos, H.N. A Synthesis of emerging data collection technologies and their impact on traffic management applications. Eur. Transp. Res. Rev. 2011, 3, 139–148. [Google Scholar] [CrossRef]
Hazelton, M. Estimating vehicle speed from traffic count and occupancy data. J. Data Sci. 2004, 2, 231–244. [Google Scholar]
Bugdol, B.; Segiet, Z.; Krȩcichwost, M.; Kasperek, P. Vehicle detection system using magnetic sensors. Transp. Probl. 2014, 9, 49–60. [Google Scholar]
Lopez, A.A.; De Quevedo, A.D.; Yuste, F.S.; DeKamp, J.M.; Mequiades, V.A.; Cortes, V.M.; Cobena, D.G.; Pulido, D.M.; Urzaiz, F.I.; Menoyo, J.G. Coherent Signal Processing for Traffic Flow Measuring Radar Sensor. IEEE Sens. J. 2018, 18, 4803–4813. [Google Scholar] [CrossRef]
Araghi, B.N.; Krishnan, R.; Lahrmann, H. Mode-Specific Travel Time Estimation Using Bluetooth Technology. J. Intell. Transp. Syst. 2015, 20, 219–228. [Google Scholar] [CrossRef]
Yuan, J.; Yu, C.; Wang, L.; Ma, W. Driver Back-Tracing Based on Automated Vehicle Identification Data. Transp. Res. Rec. J. Transp. Res. Board 2019, 2673, 84–93. [Google Scholar] [CrossRef]
Barceló, J.; Montero, L.; Bullejos, M.; Serch, O.; Carmona, C. A Kalman Filter Approach for Exploiting Bluetooth Traffic Data When Estimating Time-Dependent OD Matrices. J. Intell. Transp. Syst. 2012, 17, 123–141. [Google Scholar] [CrossRef]
Kothuri, S.M.; Tufte, K.A.; Ahn, S.; Bertini, R.L. Using Archived ITS Data to Generate Improved Freeway Travel Time Estimates. In Proceedings of the TRB 86th Annual Meeting Compendium of Papers CD-ROM, Washington, DC, USA, 21–25 January 2007; p. 13. [Google Scholar]
Jeng, S.-T.; Chu, L. Vehicle Reidentification with the Inductive Loop Signature Technology. Proc. East. Asia Soc. Transp. Stud. 2013, 9, 1896–1915. [Google Scholar]
Cohen, S.; Christoforou, Z. Travel Time Estimation Between Loop Detectors and Fcd: A Compatibility Study on the Lille Network, France. Transp. Res. Procedia 2015, 10, 245–255. [Google Scholar] [CrossRef]
Sochor, J.; Spanhel, J.; Herout, A. BoxCars: Improving Fine-Grained Recognition of Vehicles Using 3-D Bounding Boxes in Traffic Surveillance. IEEE Trans. Intell. Transp. Syst. 2019, 20, 97–108. [Google Scholar] [CrossRef]
Zapletal, D.; Herout, A. Vehicle Re-identification for Automatic Video Traffic Surveillance. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 1568–1574. [Google Scholar] [CrossRef]
Li, G.; Song, H.; Liao, Z. An Effective Algorithm for Video-Based Parking and Drop Event Detection. Complexity 2019, 2019, 1–23. [Google Scholar] [CrossRef]
Horvat, R.; Kos, G.; Ševrović, M. Traffic flow modelling on the road network in the cities. Teh. Vjesn. 2015, 22, 475–486. [Google Scholar] [CrossRef]
Caceres, N.; Wideberg, J.; Benitez, F. Review of traffic data estimations extracted from cellular networks. IET Intell. Transp. Syst. 2008, 2, 179. [Google Scholar] [CrossRef]
Shah, D.; Kumaran, A.; Sen, R.; Kumaraguru, P. Travel Time Estimation Accuracy in Developing Regions: An Empirical Case Study with Uber Data in Delhi-NCR. In Proceedings of the Companion World Wide Web Conference, Association for Computing Machinery (ACM). San Francisco, NC, USA, 13–17 May 2019; pp. 130–136. [Google Scholar]
Araghi, B.N.; Pedersen, K.S.; Christensen, L.T.; Krishnan, R.; Lahrmann, H. Accuracy of Travel Time Estimation Using Bluetooth Technology: Case Study Limfjord Tunnel Aalborg. In Proceedings of the ITS World Congress, Vienna, Austria, 22–26 October 2012. [Google Scholar]
Krishnakumari, P.; Van Lint, H.; Djukic, T.; Cats, O. A data driven method for OD matrix estimation. Transp. Res. Procedia 2019, 38, 139–159. [Google Scholar] [CrossRef]
Zhao, Y.; Zheng, J.; Wong, W.; Wang, X.; Meng, Y.; Liu, H.X. Various Methods for Queue Length and Traffic Volume Estimation Using Probe Vehicle Trajectories. arXiv 2019, arXiv:1810.09237, 1–29. [Google Scholar] [CrossRef]
Tisljaric, L.; Erdelic, T.; Caric, T. Analysis of Intersection Queue Lengths and Level of Service Using GPS data. Int. Symp. ELMAR 2018, 43–46. [Google Scholar] [CrossRef]
Ramezani, M.; Geroliminis, N. Queue Profile Estimation in Congested Urban Networks with Probe Data. Comput. Civ. Infrastruct. Eng. 2015, 30, 414–432. [Google Scholar] [CrossRef]
Kumar, A.; Ross, C.; Karner, A.; Katyal, R. Crowdsourced Social Media Monitoring System Development; Georgia Institute of Technology: Atlanta, GA, USA, 2017. [Google Scholar]
Grumert, E.F.; Tapani, A. Traffic State Estimation Using Connected Vehicles and Stationary Detectors. J. Adv. Transp. 2018, 2018, 1–14. [Google Scholar] [CrossRef]
Van Eeen, A. You Only Look Twice: Rapid Multi-Scale Object Detection In Satellite Imagery. arXiv 2018, arXiv:1805.09512. [Google Scholar]
Li, J.; Chen, S.; Zhang, F.; Li, E.; Yang, T.; Lu, Z. An Adaptive Framework for Multi-Vehicle Ground Speed Estimation in Airborne Videos. Remote Sens. 2019, 11, 1241. [Google Scholar] [CrossRef]
Kim, E.-J.; Park, H.-C.; Ham, S.-W.; Kho, S.-Y.; Kim, D.-K. Extracting Vehicle Trajectories Using Unmanned Aerial Vehicles in Congested Traffic Conditions. J. Adv. Transp. 2019, 2019, 1–16. [Google Scholar] [CrossRef]
Guerrero-Ibáñez, J.; Zeadally, S.; Contreras-Castillo, J. Sensor Technologies for Intelligent Transportation Systems. Sensors 2018, 18, 1212. [Google Scholar] [CrossRef]
Wang, S.; Zhang, X.; Cao, J.; He, L.; Stenneth, L.; Yu, P.S.; Li, Z.; Huang, Z. Computing Urban Traffic Congestions by Incorporating Sparse GPS Probe Data and Social Media Data. ACM Trans. Inf. Syst. 2017, 35, 1–30. [Google Scholar] [CrossRef]
Bachmann, C.; Abdulhai, B.; Roorda, M.J.; Moshiri, B. A comparative assessment of multi-sensor data fusion techniques for freeway traffic speed estimation using microsimulation modeling. Transp. Res. Part C Emerg. Technol. 2013, 26, 33–48. [Google Scholar] [CrossRef]
Wang, S.; He, L.; Stenneth, L.; Yu, P.S.; Li, Z.; Huang, Z. Estimating Urban Traffic Congestions with Multi-sourced Data. In Proceedings of the 17th IEEE International Conference on Mobile Data Management (MDM), Porto, Portugal, 13–16 June 2016; Volume 1, pp. 82–91. [Google Scholar]
Kong, Q.-J.; Chen, Y.; Liu, Y. A fusion-based system for road-network traffic state surveillance: A case study of Shanghai. IEEE Intell. Transp. Syst. Mag. 2009, 1, 37–42.
Wang, J.; Kong, X.; Xia, F.; Sun, L. Urban Human Mobility: Data-Driven Modeling and Prediction. ACM SIGKDD Explor. Newsl. 2019, 21, 1–19.
Zhu, L.; Guo, F.; Polak, J.W.; Krishnan, R. Urban link travel time estimation using traffic states-based data fusion. IET Intell. Transp. Syst. 2018, 12, 651–663.
Essien, A.; Petrounias, I.; Sampaio, P.; Sampaio, S. Improving Urban Traffic Speed Prediction Using Data Source Fusion and Deep Learning. In Proceedings of the IEEE International Conference on Big Data and Smart Computing (BigComp), Kyoto, Japan, 27 February–2 March 2019; pp. 1–8.
Croce, A.I.; Musolino, G.; Rindone, C.; Vitetta, A. Transport System Models and Big Data: Zoning and Graph Building with Traditional Surveys, FCD and GIS. ISPRS Int. J. Geo-Inf. 2019, 8, 187.
Grau, J.M.S.; Mitsakis, E.; Tzenos, P.; Stamos, I.; Selmi, L.; Aifadopoulou, G. Multisource Data Framework for Road Traffic State Estimation. J. Adv. Transp. 2018, 2018, 1–9.
Wu, C.; Thai, J.; Yadlowsky, S.; Pozdnoukhov, A.; Bayen, A. Cellpath: Fusion of Cellular and Traffic Sensor Data for Route Flow Estimation via Convex Optimization. Transp. Res. Procedia 2015, 7, 212–232.
Wang, S.; Li, F.; Stenneth, L.; Yu, P.S. Enhancing Traffic Congestion Estimation with Social Media by Coupled Hidden Markov Model; Springer: Cham, Switzerland, 2016; pp. 247–264.
Patire, A.D.; Wright, M.; Prodhomme, B.; Bayen, A.M. How much GPS data do we need? Transp. Res. Part C Emerg. Technol. 2015, 58, 325–342.
Jiang, Z.; Chen, X.M.; Ouyang, Y. Traffic state and emission estimation for urban expressways based on heterogeneous data. Transp. Res. Part D Transp. Environ. 2017, 53, 440–453.
Ambühl, L.; Menendez, M. Data fusion algorithm for macroscopic fundamental diagram estimation. Transp. Res. Part C Emerg. Technol. 2016, 71, 184–197.
Bhaskar, A.; Chung, E.; Dumont, A.-G. Fusing Loop Detector and Probe Vehicle Data to Estimate Travel Time Statistics on Signalized Urban Networks. Comput. Civ. Infrastruct. Eng. 2010, 26, 433–450.
Zheng, Z.; Wang, C.; Wang, P.; Xiong, Y.; Zhang, F.; Lv, Y. Framework for fusing traffic information from social and physical transportation data. PLoS ONE 2018, 13, e0201531.
Li, M.; Chen, X.M.; Ni, W. An Extended Generalized Filter Algorithm for Urban Expressway Traffic Time Estimation based on Heterogeneous Data. J. Intell. Transp. Syst. 2016, 20, 474–484.
Ou, Q.; Van Lint, H.; Hoogendoorn, S.P. Fusing Heterogeneous and Unreliable Data from Traffic Sensors; Springer: Berlin/Heidelberg, Germany, 2010; Volume 281, pp. 511–545.
Okawa, M.; Iwata, T.; Kurashima, T.; Tanaka, Y.; Toda, H.; Ueda, N. Deep Mixture Point Processes: Spatio-temporal Event Prediction with Rich Contextual Information. arXiv 2019, arXiv:1906.08952.
Lederman, R.; Wynter, L. Real-time traffic estimation using data expansion. Transp. Res. Part B Methodol. 2011, 45, 1062–1079.
Kong, Q.-J.; Li, Z.; Chen, Y.; Liu, Y. An Approach to Urban Traffic State Estimation by Fusing Multisource Information. IEEE Trans. Intell. Transp. Syst. 2009, 10, 499–511.
Toole, J.L.; Colak, S.; Sturt, B.; Alexander, L.P.; Evsukoff, A.; González, M.C. The path most traveled: Travel demand estimation using big data resources. Transp. Res. Part C Emerg. Technol. 2015, 58, 162–177.
Sohn, S.Y.; Lee, S.H. Data fusion, ensemble and clustering to improve the classification accuracy for the severity of road traffic accidents in Korea. Saf. Sci. 2003, 41, 1–14.
Choi, K.; Chung, Y. A Data Fusion Algorithm for Estimating Link Travel Time. ITS J. 2002, 7.
Mil, S.; Piantanakulchai, M. Modified Bayesian data fusion model for travel time estimation considering spurious data and traffic conditions. Appl. Soft Comput. 2018, 72, 65–78.
Chou, C.-H.; Huang, Y.; Huang, C.-Y.; Tseng, V.S. Long-Term Traffic Time Prediction Using Deep Learning with Integration of Weather Effect; Springer: Cham, Switzerland, 2019; pp. 123–135.
Rodrigues, F.; Markou, I.; Pereira, F.C. Combining time-series and textual data for taxi demand prediction in event areas: A deep learning approach. Inf. Fusion 2019, 49, 120–129.
Li, H.; Song, D.; Liu, Y.; Li, B. Automatic Pavement Crack Detection by Multi-Scale Image Fusion. IEEE Trans. Intell. Transp. Syst. 2019, 20, 2025–2036.
Ke, X.; Shi, L.; Guo, W.; Chen, D. Multi-Dimensional Traffic Congestion Detection Based on Fusion of Visual Features and Convolutional Neural Network. IEEE Trans. Intell. Transp. Syst. 2018, 20, 2157–2170.
Hu, Y.; Lu, M.; Lu, X. Driving behaviour recognition from still images by using multi-stream fusion CNN. Mach. Vis. Appl. 2019, 30, 851–865.
Guan, D.; Cao, Y.; Yang, J.; Cao, Y.; Yang, M.Y. Fusion of multispectral data through illumination-aware deep neural networks for pedestrian detection. Inf. Fusion 2019, 50, 148–157.
Luo, X.; Li, D.; Yang, Y.; Zhang, S. Spatiotemporal Traffic Flow Prediction with KNN and LSTM. J. Adv. Transp. 2019, 2019, 1–10.
Havyarimana, V.; Xiao, Z.; Sibomana, A.; Wu, D.; Bai, J. A Fusion Framework Based on Sparse Gaussian–Wigner Prediction for Vehicle Localization Using GDOP of GPS Satellites. IEEE Trans. Intell. Transp. Syst. 2019, 21, 680–689.
Gu, Y.; Lu, W.; Qin, L.; Li, M.; Shao, Z. Short-term prediction of lane-level traffic speeds: A fusion deep learning model. Transp. Res. Part C Emerg. Technol. 2019, 106, 1–16.
Liang, X.; Du, X.; Wang, G.; Han, Z. A Deep Reinforcement Learning Network for Traffic Light Cycle Control. IEEE Trans. Veh. Technol. 2019, 68, 1243–1253.
Guo, J.; Xie, Z.; Qin, Y.; Jia, L.; Wang, Y. Short-Term Abnormal Passenger Flow Prediction Based on the Fusion of SVR and LSTM. IEEE Access 2019, 7, 42946–42955.
Karlaftis, M.; Vlahogianni, E. Statistical methods versus neural networks in transportation research: Differences, similarities and some insights. Transp. Res. Part C Emerg. Technol. 2011, 19, 387–399.
Xu, M.; Dai, W.; Liu, C.; Gao, X.; Lin, W.; Qi, G.-J.; Xiong, H. Spatial-Temporal Transformer Networks for Traffic Flow Forecasting. arXiv 2020, arXiv:2001.02908, 1–14.

Table 1. Traffic Flow Parameters Used as Congestion Indicators.

Parameter	Symbol	Definition	Units
Time Mean Speed	$v_t$	The arithmetic mean of the instantaneous speeds of the N vehicles passing an observed road section within a time period T	$\frac{m}{s}$
Space Mean Speed	$v_s$	The arithmetic mean of the instantaneous speeds of the N vehicles present on the observed road section d at a given instant t (an interval of near-zero duration)	$\frac{m}{s}$
Flow Rate	$q$	The number of vehicles passing a given road section during a given time interval	$\frac{veh}{h}$
Density	$\rho$	The number of vehicles occupying a given length of a lane or road at a given instant t (an interval of near-zero duration)	$\frac{veh}{km}$
Throughput	$t_v$	The number of vehicle-kilometers driven on a given length of road during a given time period	$\frac{veh \times km}{h}$
Time headway	$h_t$	The time gap between the front (or rear) surfaces of successive vehicles in the traffic flow	$s$
Space headway	$h_s$	The distance between the front (or rear) surfaces of successive vehicles in the traffic flow	$m$
Occupancy	$O$	The percentage of the observed interval T during which the observed road section is occupied by vehicles, i.e., the total vehicle dwell time in the detection zone relative to T	%
Queue Length	$L_q$	The number of vehicles in a queue (at an intersection, ramp, etc.)	veh or km
Travel Time	$TT$	The time a vehicle needs to travel from one observed point to another in the traffic network	$s$
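
As an illustration of how some of the parameters in Table 1 relate to raw detector readings, the following is a minimal Python sketch. The function names and the sample values are assumptions made for this example and do not come from the surveyed literature. The density helper relies on the fundamental relation q = ρ · v_s, which holds for stationary traffic and is an assumption here rather than part of Table 1.

```python
from statistics import mean


def time_mean_speed(spot_speeds_mps):
    """Arithmetic mean of the speeds (m/s) recorded at a fixed point during T."""
    return mean(spot_speeds_mps)


def flow_rate(vehicle_count, period_s):
    """Flow rate in veh/h from a vehicle count observed over period_s seconds."""
    return vehicle_count * 3600.0 / period_s


def density_from_flow(flow_veh_h, space_mean_speed_mps):
    """Density in veh/km, assuming q = rho * v_s (speed converted to km/h)."""
    return flow_veh_h / (space_mean_speed_mps * 3.6)


def occupancy_percent(dwell_times_s, period_s):
    """Percentage of the interval T during which the detection zone was occupied."""
    return 100.0 * sum(dwell_times_s) / period_s


# Hypothetical 5-minute detector sample: 30 vehicles, each occupying the zone for 0.5 s.
speeds = [10.0, 20.0, 30.0]
q = flow_rate(vehicle_count=30, period_s=300.0)
print(time_mean_speed(speeds), q, density_from_flow(q, 20.0),
      occupancy_percent([0.5] * 30, 300.0))
```

Occupancy is computed here from summed dwell times, which corresponds to the second wording of the definition in Table 1.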

Table 2. Congestion Indexes or Indicators.

Parameter	Symbol	Definition	Equation
Delay	$d$	The additional travel time experienced by a driver, i.e., the difference between the actual travel time $TT$ and the free-flow travel time $TT_{0}$	$d = TT - TT_{0}$
Travel Time Index	$TTI$	The ratio of delay $d$ to free-flow travel time $TT_{0}$	$TTI = \frac{d}{TT_{0}}$
Speed Reduction Index	$SRI$	The difference between free-flow speed $v_{0}$ and actual speed $v$, divided by free-flow speed $v_{0}$	$SRI = \frac{v_{0} - v}{v_{0}}$
Buffer Index	$BI$	The extra time that travelers must add to their average travel time $TT_{\text{mean}}$ when planning trips to ensure on-time arrival	$BI = \frac{d}{TT_{\text{mean}}}$
Travel Rate Index	$TRI$	The additional time required to make a trip because of congested conditions on the roadway, expressed as the ratio of peak-hour to non-peak travel time	$TRI = \frac{TT_{\text{peak-hour}}}{TT_{\text{non-peak}}}$
Proportion Stopped Time	$PST$	The ratio of stopped time $T_{s}$ to the total journey time, i.e., stopped time plus running time $T_{r}$	$PST = \frac{T_{s}}{T_{s} + T_{r}}$
Acceleration Noise	$AN$	A measure of the fluctuation in speed, where $\Delta t_{i}$ is the time interval taken for a speed change $\Delta v_{i}$ and $T_{r}$ is the vehicle running time	$AN = \sqrt{\frac{1}{T_{r}} \sum_{i = 1}^{N} \frac{\Delta v_{i}^{2}}{\Delta t_{i}}}$
Acceleration Noise Index	$ANI$	The ratio of the actual acceleration noise to the acceleration noise in free-flow conditions	$ANI = \frac{AN}{AN_{0}}$
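
The equations in Table 2 can be evaluated directly from travel time, speed, and acceleration samples. The sketch below is one possible Python transcription of those equations. All function and argument names are hypothetical, and the input values are invented solely to exercise the formulas.

```python
import math


def delay(tt, tt0):
    """d = TT - TT_0: extra travel time over free flow."""
    return tt - tt0


def travel_time_index(tt, tt0):
    """TTI = d / TT_0, following the definition in Table 2."""
    return delay(tt, tt0) / tt0


def speed_reduction_index(v, v0):
    """SRI = (v_0 - v) / v_0."""
    return (v0 - v) / v0


def buffer_index(d, tt_mean):
    """BI = d / TT_mean."""
    return d / tt_mean


def travel_rate_index(tt_peak, tt_non_peak):
    """TRI = TT_peak-hour / TT_non-peak."""
    return tt_peak / tt_non_peak


def proportion_stopped_time(t_s, t_r):
    """PST = T_s / (T_s + T_r)."""
    return t_s / (t_s + t_r)


def acceleration_noise(delta_v, delta_t, t_r):
    """AN = sqrt((1 / T_r) * sum(dv_i^2 / dt_i))."""
    return math.sqrt(sum(dv * dv / dt for dv, dt in zip(delta_v, delta_t)) / t_r)


def acceleration_noise_index(an, an0):
    """ANI = AN / AN_0."""
    return an / an0


# Invented trip: 150 s observed versus 100 s free-flow travel time.
print(delay(150.0, 100.0), travel_time_index(150.0, 100.0))
print(acceleration_noise([2.0, -2.0], [1.0, 1.0], t_r=2.0))
```

Keeping each index as a separate function mirrors the table row by row, so an individual definition can be swapped (for example, a TTI defined as TT / TT_0) without touching the others.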

Table 3. Advantages and Disadvantages of Data Collection Technologies.

Sensor type	Technology	Advantages	Disadvantages
Point sensors	Inductive loops	- provide basic traffic parameters (e.g., volume, occupancy, speed, presence, headway) - well-defined detection zone - well-known technology - accurate and reliable traffic data - negligible influence of weather conditions	- installation requires pavement cut and lane closure - spatial coverage is limited - implementation and maintenance costs are high - lifetime depends on pavement quality
	Video detection	- can provide the largest set of data - feasible integration of traffic data collection and traffic supervision - can replace several loops - non-intrusive sensor (no pavement cut needed)	- affected by weather conditions - calibration issues - occlusion (vehicles covering one another)
	Radar sensors	- provide speed, vehicle counts, vehicle classification - not affected by weather conditions - multiple detection zones	- susceptible to electromagnetic interference - occlusion (vehicles covering one another)
	Acoustic sensors	- multiple lane operation available - passive detection - record vehicle passage, presence, and speed	- spatial coverage is limited - high setup and maintenance costs - unsuitable for urban areas with dense traffic
	Infrared sensors	- multiple detection zones - small impact of weather conditions	- spatial coverage is limited (depends on sensor type)
	Magnetic sensors	- not affected by weather conditions - can be used where loops are not feasible (e.g., bridge decks)	- spatial coverage is limited
	Piezoelectric sensors	- some models and configurations provide weigh-in-motion measurements and speed	- placed in a groove along the roadway surface - high setup and maintenance costs
Point-to-point sensors	Bluetooth detectors	- can provide travel time, O-D matrices - easy mounting - far greater privacy than ALPR - low energy consumption	- cannot provide volume and vehicle count - low detection accuracy
	Wi-Fi detectors	- easy mounting - suitable for passenger detection - low-cost components	- cannot provide accurate basic traffic parameters
	RFID detectors	- low-cost components - high detection accuracy	- cannot provide volume and vehicle count - small detection zone
	ALPR detectors	- can provide volume, O-D matrices and travel time - high detection accuracy	- privacy issues - problematic data protection
Area-wide sensors	FCD	- potential for real-time monitoring - large-scale spatial coverage - location precision is high (10 m) - cost-effective source of data	- limited sample size and spatiotemporal coverage - high equipment costs
	CFCD	- no additional device is needed - large number of potential probes - cost-effective source of data	- sophisticated algorithms are needed to extract data - location precision is low (depends on the location methods used and the size of mobile network cells) - limited and imprecise spatial coverage
	Airborne imagery	- mobile multifunctional detection device - can provide density	- limited recording time - affected by weather conditions - high costs
	Social media data	- cheapest data in terms of data availability - potential for real-time data	- low reliability caused by human factors

Table 4. Multisource Data Used for Fusion and Congestion Estimation.

Reference	Point Sensors			Point-to-Point Sensors	Area-Wide Sensors			Auxiliary Information
Reference	Inductive Loop	Video and Image	Radar Data	Bluetooth Data	GNSS Data	Social-Media Data	FCD Cellular Data	Social Events	Weather Information	Point of Interest	Road Physical
Wang et al. [69,71,79]					X	X		X	X	X	X
Zhu et al. [74]	X				X		X
Ji et al. [37], Patire et al. [80], Jiang et al. [81], Ambühl et al. [82], Bachmann et al. [44], Bhaskar et al. [83], Kong et al. [72]	X				X
Essien et al. [75]	X								X
Salanova et al. [36]	X	X	X	X	X
Zheng et al. [84], Yuan et al. [46]					X	X
Li et al. [85]	X		X		X
Bachmann et al. [70]	X			X	X
Wu et al. [78]	X						X

Table 5. Representation of Methods for Data Fusion (DF) (MF, TF–Matrix and Tensor Factorization; STAT—Statistical; ANN—Artificial Neural Network; MM—Markov Model; KF—Kalman Filter; IP—Image Processing; DNN—Deep Neural Network; CLUS—Clustering; OPT—Optimization; FUZ—Fuzzy; CLA—Classification).

Reference	TF	MF	STAT	ANN	MM	KF	IP	DNN	CLUS	OPT	FUZ	CLA
Wang et al. [71,97]	X 2017	X 2016
Zhu et al. [74]			X 2018	X 2018
Wang et al. [79]			X 2016		X 2016
Ji et al. [37], Ambühl et al. [82], Bhaskar et al. [83], Li et al. [85], Havyarimana et al. [101]			X 2018, 2016, 2010, 2016, 2020
Patire et al. [80], Kong et al. [72]			X 2015, 2009			X 2015, 2009
Jiang et al. [81]			X 2017				X 2017
Essien et al. [75], Yuan et al. [46], Chou et al. [95], Rodrigues et al. [96], Gu et al. [102], Liang et al. [103]								X 2019
Zheng et al. [84]			X 2018						X 2018
Wu et al. [78]										X 2015
Choi et al. [92]			X 2002								X 2002
Sohn et al. [91]			X 2003	X 2003					X 2003			X 2003
Mil et al. [93]			X 2018									X 2018
Li et al. [96]			X 2019				X 2019
Luo et al. [100], Guo et al. [104]			X 2019					X 2019	X 2019
Ke et al. [97], Hu et al. [98]							X 2019	X 2019
Guan et al. [99]								X 2019				X 2019
Usage of method	2, 4%	2, 4%	17, 34%	2, 4%	1, 2%	2, 4%	4, 8%	11, 22%	4, 8%	1, 2%	1, 2%	3, 6%

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Cvetek, D.; Muštra, M.; Jelušić, N.; Tišljarić, L. A Survey of Methods and Technologies for Congestion Estimation Based on Multisource Data Fusion. Appl. Sci. 2021, 11, 2306. https://doi.org/10.3390/app11052306

