Heterogeneous Data Fusion Method to Estimate Travel Time Distributions in Congested Road Networks

Travel times in congested urban road networks are highly stochastic. Provision of travel time distribution information, including both mean and variance, can be very useful for travelers to make reliable path choice decisions to ensure higher probability of on-time arrival. To this end, a heterogeneous data fusion method is proposed to estimate travel time distributions by fusing heterogeneous data from point and interval detectors. In the proposed method, link travel time distributions are first estimated from point detector observations. The travel time distributions of links without point detectors are imputed based on their spatial correlations with links that have point detectors. The estimated link travel time distributions are then fused with path travel time distributions obtained from the interval detectors using Dempster-Shafer evidence theory. Based on fused path travel time distribution, an optimization technique is further introduced to update link travel time distributions and their spatial correlations. A case study was performed using real-world data from Hong Kong and showed that the proposed method obtained accurate and robust estimations of link and path travel time distributions in congested road networks.


Introduction
Accurate and robust estimation of travel time distribution, including mean and variance, is a crucial requirement for advanced traveler information systems (ATIS). Provision of travel time distribution information through ATIS enables travelers to make reliable path choice decisions, ensuring a higher chance of on-time arrival [1][2][3]. The provided distribution information also allows operators to evaluate network performance and reliability, and identify bottlenecks for proactively deploying effective controls to improve overall traffic conditions [4,5].
Recent advances in information and communication technologies (ICTs) have produced a variety of spatiotemporal big data for travel time estimation [6]. Existing data collection techniques could be classified into point detection, interval detection, and floating car systems [7,8]. Point detectors (such as loop detectors and video image detectors) are generally deployed at specific road segment locations, to collect vehicle point speeds. Interval detectors consist of a pair of devices deployed in road networks to directly calculate travel times between the device pair. Typical interval detectors include automatic vehicle identification (AVI), Bluetooth, and license plate recognition devices. In contrast

Literature Review
Travel time estimations have been intensively studied for over three decades. Early studies proposed various methods to estimate deterministic travel times, e.g., mean travel time, using a single data source [16][17][18][19][20]. A complete survey of these methods is beyond the scope of this paper; interested readers can refer to comprehensive reviews by Mori et al. and Vlahogianni et al. [8,21].
In the last decade, many research efforts have focused on data fusion techniques to enhance the accuracy and robustness of deterministic travel time estimation using multiple data sources. Current data fusion techniques can be broadly classified into statistical-based, artificial cognition-based, and probabilistic-based techniques [22]. Statistical-based techniques, such as simple convex combination algorithms, use statistical information on data quality to determine weights for different data sources [16,23]. They are relatively widely used owing to their simple implementation. However, they are not well suited to fusing data sources that are inconsistent or even conflicting. Artificial cognition-based techniques combine multiple data sources using artificial intelligence techniques, such as neural networks or genetic algorithms [11]. They can tackle complex data fusion problems, but require large training datasets, which are generally unavailable in many real-world applications. Probabilistic-based techniques typically employ Bayesian and/or D-S evidence theory to provide mathematical reasoning rules for fusing inaccurate and inconsistent data from multiple sources [24][25][26]. The D-S evidence theory can be regarded as a generalization of Bayesian theory without the requirement of prior knowledge. Nevertheless, most existing studies using D-S evidence theory are restricted to the estimation of traffic states (i.e., very congested, congested, medium, smooth, or very smooth) rather than precise numerical values of travel times [27][28][29].
Since no single data source covers the whole network, research efforts have also investigated missing data imputation techniques to enhance data completeness. Missing data imputation can be broadly classified into prediction-based and spatial interpolation-based techniques. Prediction-based techniques adopt travel time prediction models, such as K-nearest neighbors, kernel regression, and autoregressive integrated moving average, to forecast the missing data from historical data [30][31][32]. Spatial interpolation techniques impute missing data of a link by using established statistical relationships between the link and its adjacent links [14][15][16][33]. For all techniques in both categories, incorporation of travel time correlations is well recognized as an effective way to improve imputation performance [32]. However, most missing data imputation techniques assume fixed travel time correlations, which are inadequate to represent the dynamic nature of traffic conditions.
The above studies focused on estimating only deterministic travel times, while ignoring travel time variances. Recent attention has investigated methods to estimate travel time distributions (including means and variances) using a single data source. Dion and Rakha [34] proposed an exponential smoothing filter to estimate travel time distributions using interval detector data. Jenelius and Koutsopoulos [35] developed a maximum likelihood approach to estimate travel time distributions using floating car data. Rahmani et al. [36] used the same data type and proposed a nonparametric approach to estimate travel time distributions. Hans et al. [37] used point detector data and proposed a kinematic wave approach for estimating travel time distributions at signalized intersections. Accurate estimation of travel time distributions is more challenging, because more data with higher quality are required to estimate reliable mean and variance information. Including multiple data sources is obviously beneficial for accurate and robust estimations of travel time distributions, but to the best of our knowledge, few studies have been done for estimating travel time distributions by fusing multiple data sources.
To fill this gap, the current study proposes a heterogeneous data fusion method for estimating travel time distributions, fusing interval and point detector data. In the proposed method, link travel time distributions are first estimated from point detector observations. The spatially missing data issue of point detectors is addressed. Travel time distributions of links without point detectors are imputed based on their spatial correlations with links with point detectors. Estimated link travel time distributions from point detector data are then fused with path travel time distributions obtained from interval detectors. To fuse these two path distributions, a D-S distribution fusion algorithm is proposed, built on D-S evidence theory. An optimization technique is further introduced to update link travel time distributions and their spatial correlations according to the fused travel time distribution.

Brief Introduction of the D-S Evidence Theory
The D-S evidence theory was initially developed by Dempster [38], and later extended and refined by Shafer [39]. This theory can be regarded as a generalization of Bayesian inference to tackle uncertainty reasoning based on incomplete information [40,41]. In contrast to Bayesian inference, D-S evidence theory does not assign prior probabilities to unknown propositions or states [42]. Probabilities are assigned only when supporting evidence is available [43]. This provides a flexible framework for decision making by combining cumulative evidence, and has broad applications in many areas, such as expert systems [40,44], artificial intelligence [45,46], false diagnosis [47,48], target recognition [49][50][51], decision-making [52], and information fusion [53].
Let $\Omega = \{S_1, \ldots, S_n\}$ be a collectively exhaustive and mutually exclusive set of states, which is also called the frame of discernment. This frame of discernment contains every possible state of a system. A basic probability assignment (BPA) (also called a belief structure) is a function $m: 2^{\Omega} \to [0, 1]$ satisfying $m(\emptyset) = 0$ and $\sum_{A \subseteq \Omega} m(A) = 1$, where $A$ is a subset of $\Omega$, and $2^{\Omega} = \{A \mid A \subseteq \Omega\}$ is the power set of $\Omega$, consisting of all the subsets of $\Omega$. The assigned probability $m(A)$ measures the belief exactly assigned to $A$. All the assigned probabilities sum to unity and there is no belief in the empty set $\emptyset$. For notational consistency, boldfaced letters represent vectors or matrices throughout the paper.
Multiple independent pieces of evidence can be fused using the traditional Dempster's combination rule [43][44][45][46][47][48][51,52]. Given the BPAs of two independent pieces of evidence, $m_1$ and $m_2$, the combination rule is defined as:
$$m(A) = \frac{1}{1-\eta} \sum_{B \cap C = A} m_1(B)\, m_2(C), \quad \forall A \subseteq \Omega,\ A \neq \emptyset \tag{1}$$
where $\eta$ is the conflict factor, which ranges from 0 to 1 and represents the degree of total conflict between $m_1$ and $m_2$, and $1/(1-\eta)$ is the normalization factor, which ensures that the BPAs sum to unity. $\eta$ is given by:
$$\eta = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C) \tag{2}$$
Dempster's combination rule, Equation (1), provides effective reasoning rules for fusing evidence with low or moderate conflict. However, for evidence in high or complete conflict (i.e., $\eta$ approaching 1), traditional D-S evidence theory may lead to unreasonable synthesis results. An effective way to reduce the degree of evidence conflict is to modify the evidence. A common technique [54,55] is to introduce an unknown state, $\Theta$, into the frame of discernment as $\Omega = \{\Omega, \Theta\}$, where $\Theta$ represents the unknown part of the evidence.
As an alternative, several researchers argued that high conflict is mainly caused by unreliable evidence, and therefore proposed methods to identify and correct unreliable evidence before the combination [48,51,56]. Overall, the D-S evidence theory provides mathematical reasoning rules for fusing inaccurate and incomplete data from multiple sources. In Section 4.2.2, the D-S evidence theory is employed to fuse travel time distributions from different data sources, which may be in high or even complete conflict.
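To make the combination rule concrete, the following is a minimal Python sketch of Dempster's rule for two BPAs, written under the assumption that each BPA is stored as a dictionary mapping a `frozenset` of states to its mass. The function name `dempster_combine` and the two-state example values are illustrative only and do not come from the paper.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two BPAs (dict: frozenset of states -> mass) with Dempster's rule."""
    fused = {}
    conflict = 0.0  # eta: total mass falling on empty intersections
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            fused[inter] = fused.get(inter, 0.0) + mass_b * mass_c
        else:
            conflict += mass_b * mass_c
    if 1.0 - conflict < 1e-12:
        raise ValueError("complete conflict: the rule is undefined when eta = 1")
    # Normalize by 1 / (1 - eta) so that the fused masses sum to one.
    return {a: v / (1.0 - conflict) for a, v in fused.items()}, conflict

# Illustrative (hypothetical) evidence over two states S1 and S2.
m1 = {frozenset({"S1"}): 0.6, frozenset({"S2"}): 0.4}
m2 = {frozenset({"S1"}): 0.7, frozenset({"S2"}): 0.3}
fused, eta = dempster_combine(m1, m2)
```

In this toy case the conflict factor is 0.6 × 0.3 + 0.4 × 0.7 = 0.46, and the fused mass of S1 is 0.42 / 0.54. When the two BPAs share no compatible states, the function raises an error, which mirrors the complete-conflict limitation discussed above.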

Problem Statement
Let $G = (N, E)$ be a directed network consisting of a set of nodes, $N$, and a set of links, $E$. A link $a_{ij} \in E$ is defined as the road section between two adjacent nodes $n_i \in N$ and $n_j \in N$. The travel time of the link is a random variable, $T_{ij}$, with mean $t_{ij}$ and standard deviation (STD) $\sigma_{ij}$. The vector of mean travel times for all links is $\mathbf{t} = [\ldots, t_{ij}, \ldots]^T$, and the variance-covariance matrix of link travel times is $\mathbf{K}$. The diagonal elements of $\mathbf{K}$ are the variances of link travel times, and the off-diagonal elements are the travel time covariances between pairs of links.
Let $p^{od}$ be a path between the starting node $n_o$ and the ending node $n_d$, consisting of $\lambda$ consecutive links. Let $x^{od}_{ij}$ be a link-path incidence variable, where $x^{od}_{ij} = 1$ means that link $a_{ij}$ is on $p^{od}$ and $x^{od}_{ij} = 0$ otherwise. Let $\mathbf{X} = [\ldots, x^{od}_{ij}, \ldots]^T$ be the vector of link-path incidence variables. The path travel time, $T^{od}$, can be calculated by summing link travel times along the path:
$$T^{od} = \sum_{a_{ij} \in E} x^{od}_{ij}\, T_{ij} \tag{3}$$
Let $t^{od}$ and $\sigma^{od}$ be the mean and STD of the path travel time distribution, respectively; then:
$$t^{od} = \mathbf{X}^T \mathbf{t} \tag{4}$$
$$\sigma^{od} = \sqrt{\mathbf{X}^T \mathbf{K} \mathbf{X}} \tag{5}$$
To obtain travel time distribution information, many detectors of different types may be deployed in the network, as shown in Figure 1 for a simple network with interval and point detectors. A pair of interval detectors, e.g., AVI devices, is installed at $n_o$ and $n_d$ of $p^{od}$ to record a set of vehicles. The path travel time of each recorded vehicle can be obtained as the time difference between entering and leaving the path, and the path travel time distribution, denoted as $T^{od}_{int}$, can be directly estimated from these data. However, the detailed travel time distributions of the links along the path remain unknown, and the interval detector data cover only a portion of vehicles, with relatively poor temporal sampling.
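As a small illustration of how the path mean and STD follow from link-level quantities, the sketch below computes the mean as the incidence-weighted sum of link means and the variance as the quadratic form of the incidence vector with the variance-covariance matrix. The three-link network and all numbers are hypothetical, chosen only so the arithmetic is easy to check.

```python
import math

def path_distribution(x, t, K):
    """Return (mean, STD) of the path travel time.

    x: link-path incidence vector (1 if the link is on the path, else 0)
    t: vector of mean link travel times
    K: variance-covariance matrix of link travel times (list of lists)
    """
    n = len(x)
    mean = sum(x[i] * t[i] for i in range(n))
    var = sum(x[i] * K[i][j] * x[j] for i in range(n) for j in range(n))
    return mean, math.sqrt(var)

# Hypothetical network with three links; the path uses links 0 and 1.
x = [1, 1, 0]
t = [2.0, 3.0, 4.0]                      # mean link travel times (min)
K = [[0.25, 0.10, 0.00],
     [0.10, 0.36, 0.00],
     [0.00, 0.00, 0.50]]
mean_od, std_od = path_distribution(x, t, K)
```

Here the path variance is 0.25 + 0.36 + 2 × 0.10 = 0.81, so the positive covariance between the two links raises the path STD to 0.9, above the value that independent links would give.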

Point detectors, e.g., loop detectors, are generally deployed on only a subset of network links in real applications, e.g., $a^{r}_{o1}$ and $a^{r}_{23}$ in Figure 1, whereas other links, e.g., $a^{e}_{12}$, $a^{e}_{34}$, and $a^{e}_{4d}$, have no detectors. Thus, only the travel time distributions of links with point detectors, e.g., $T^{r}_{o1}$ and $T^{r}_{23}$, can be directly estimated, while the travel time distributions of links without point detectors, e.g., $T^{e}_{12}$, $T^{e}_{34}$, and $T^{e}_{4d}$, are unknown. Nevertheless, point detector data tend to have good temporal sampling, since these detectors can generally collect the speeds of all vehicles passing through them.
Obviously, interval and point detector data have distinct spatial and temporal characteristics. Fusing heterogeneous data from both interval and point detectors could be beneficial for estimating travel time distributions for the path and all links either with or without point detectors.

Proposed Heterogeneous Data Fusion Method
This section presents the proposed heterogeneous data fusion method to estimate travel time distributions for the path and all links either with or without point detectors. Empirical studies have found that travel times can be well represented by either normal, gamma, or lognormal distributions [10,39]. Therefore, to simplify the problem and present the essential concept, it is assumed that all link and path travel time distributions follow the normal distribution [57,58]. Using this normality assumption, the proposed method is to estimate the mean and STD of travel time distributions of the path and all links. Figure 2 shows that the framework of the proposed heterogeneous data fusion method consists of three steps, described in detail in the following sections. The first step, called data preprocessing, is to estimate path travel time distributions from interval and point detector data, respectively. The second step, called distribution fusion, is to fuse the estimated path travel time distributions by using D-S evidence theory. The last step, called posterior update, is to update link travel time distributions and their travel time correlations based on the fused distribution.

Data Preprocessing Step
This step estimates path travel time distributions from interval and point detector data, independently. The path travel time distribution, $T^{od}_{int}$, can be directly estimated from interval detector data. Since interval detectors only record a set of vehicles equipped with electronic tags, the limited sample size becomes a critical issue in the estimation, especially for low market penetration applications. Outlier observations can also significantly affect path travel time distribution accuracy, e.g., some vehicles may make stops or detours along the path, leading to atypical travel time observations. To estimate the path travel time distribution from interval detector data, the data filtering algorithm proposed by Dion and Rakha [34] was adopted in this study. This data filtering algorithm utilizes a series of low pass filters to remove outlier observations outside a dynamically varying validity window. It performs well in both stable and unstable traffic conditions at low levels of market penetration, and has been successfully applied in the real-time traveler information system (RTIS) in Hong Kong [14]. Thus, an accurate and robust estimation of the mean, $t^{od}_{int}$, and STD, $\sigma^{od}_{int}$, of the path travel time distribution can be obtained from interval detector data.
As discussed above, the path travel time distribution cannot be directly estimated from point detector data, because only a few links are equipped with point detectors. To estimate the path travel time distribution, links are divided into links with and without point detectors, so that the vector of mean travel times comprises two parts, $\mathbf{t}_{poi} = [\mathbf{t}^{r}_{poi}, \mathbf{t}^{e}_{poi}]^T$, where $\mathbf{t}^{r}_{poi}$ and $\mathbf{t}^{e}_{poi}$ are the mean travel times for links with and without point detectors, respectively, at time interval $\tau$. The variance-covariance matrix can be divided into four sub-matrices:
$$\mathbf{K}_{poi} = \begin{bmatrix} \mathbf{K}^{rr}_{poi} & \mathbf{K}^{re}_{poi} \\ \mathbf{K}^{er}_{poi} & \mathbf{K}^{ee}_{poi} \end{bmatrix}$$
where $\mathbf{K}^{rr}_{poi}$ is the variance-covariance matrix for links with point detectors; $\mathbf{K}^{ee}_{poi}$ is the variance-covariance matrix for links without point detectors; $\mathbf{K}^{er}_{poi}$ is the covariance matrix between links without and with point detectors; and $\mathbf{K}^{re}_{poi} = (\mathbf{K}^{er}_{poi})^T$ is the covariance matrix between links with and without point detectors. Let $\mathbf{v}^{r}_{poi}$ and $\mathbf{v}^{e}_{poi}$ be the vectors of travel time variances for links with and without point detectors, respectively. They are the diagonal elements of $\mathbf{K}^{rr}_{poi}$ and $\mathbf{K}^{ee}_{poi}$, respectively.
For a link $a^{r}_{i}$ with a point detector, the mean, $t^{r}_{i}$, and STD, $\sigma^{r}_{i}$, of the link travel time distribution can be obtained from the data collected at the current time interval $\tau$, i.e., $\mathbf{t}^{r}_{poi}$ and $\mathbf{K}^{rr}_{poi}$ can be determined from the point detector data. However, the mean travel times for links without point detectors, $\mathbf{t}^{e}_{poi}$, have to be estimated indirectly. Following Tam and Lam [14], $\mathbf{t}^{e}_{poi}$ is estimated using the spatial correlations between links with and without point detectors:
$$\mathbf{t}^{e}_{poi} = \mathbf{t}^{e,\tau-1}_{upd} + \mathbf{K}^{er,\tau-1}_{upd} (\mathbf{K}^{rr}_{poi})^{-1} (\mathbf{t}^{r}_{poi} - \mathbf{t}^{r,\tau-1}_{upd}) \tag{6}$$
where $\mathbf{t}^{r,\tau-1}_{upd}$ and $\mathbf{t}^{e,\tau-1}_{upd}$ are the mean travel times for links with and without point detectors, respectively, estimated at the previous time interval $\tau - 1$; $\mathbf{K}^{er,\tau-1}_{upd}$ is the covariance matrix between links without and with point detectors estimated at the previous time interval $\tau - 1$; and $(\mathbf{K}^{rr}_{poi})^{-1}$ is the inverse of $\mathbf{K}^{rr}_{poi}$.
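Similar to Equation (6), $\mathbf{v}^{e}_{poi}$ in this study was also indirectly estimated using the spatial correlations between links with and without point detectors:
$$\mathbf{v}^{e}_{poi} = \mathbf{v}^{e,\tau-1}_{upd} + \mathbf{K}^{er,\tau-1}_{upd} (\mathbf{K}^{rr}_{poi})^{-1} (\mathbf{v}^{r}_{poi} - \mathbf{v}^{r,\tau-1}_{upd}) \tag{7}$$
where $\mathbf{v}^{r,\tau-1}_{upd}$ and $\mathbf{v}^{e,\tau-1}_{upd}$ are the travel time variances of links with and without point detectors, respectively, at the previous time interval $\tau - 1$. Therefore, the diagonal elements of $\mathbf{K}^{ee}_{poi}$ and all elements of $\mathbf{K}^{rr}_{poi}$ are estimated in the current time interval $\tau$. It is assumed that $(k^{ee}_{poi})_{ij} = (k^{ee,\tau-1}_{upd})_{ij}, \forall i \neq j$, and $\mathbf{K}^{er}_{poi} = \mathbf{K}^{er,\tau-1}_{upd}$, which means that the off-diagonal elements of $\mathbf{K}^{ee}_{poi}$ and all elements of $\mathbf{K}^{er}_{poi}$ are the same as the corresponding elements at the previous interval $\tau - 1$. These two matrices, $\mathbf{K}^{ee}_{poi}$ and $\mathbf{K}^{er}_{poi}$, will be updated in the posterior update step in Section 4.2.3.
After $\mathbf{t}_{poi}$ and $\mathbf{K}_{poi}$ are determined, the mean, $t^{od}_{poi}$, and STD, $\sigma^{od}_{poi}$, of the path travel time distribution can be calculated. The vector of link-path incidence variables is divided into two groups as $\mathbf{X} = [\mathbf{X}^{r}_{poi}, \mathbf{X}^{e}_{poi}]^T$, where $\mathbf{X}^{r}_{poi}$ and $\mathbf{X}^{e}_{poi}$ are the link-path incidence variables for links with and without point detectors, respectively. Then, Equations (4)-(7) for calculating $t^{od}_{poi}$ and $\sigma^{od}_{poi}$ can be expressed as:
$$t^{od}_{poi} = (\mathbf{X}^{r}_{poi})^T \mathbf{t}^{r}_{poi} + (\mathbf{X}^{e}_{poi})^T \mathbf{t}^{e}_{poi} \tag{8}$$
$$\sigma^{od}_{poi} = \sqrt{(\mathbf{X}^{r}_{poi})^T \mathbf{K}^{rr}_{poi} \mathbf{X}^{r}_{poi} + (\mathbf{X}^{e}_{poi})^T \mathbf{K}^{ee}_{poi} \mathbf{X}^{e}_{poi} + 2 (\mathbf{X}^{e}_{poi})^T \mathbf{K}^{er}_{poi} \mathbf{X}^{r}_{poi}} \tag{9}$$
Substituting Equation (6) into Equation (8), the mean travel time can be expressed as:
$$t^{od}_{poi} = (\mathbf{X}^{r}_{poi})^T \mathbf{t}^{r}_{poi} + (\mathbf{X}^{e}_{poi})^T \mathbf{t}^{e,\tau-1}_{upd} + (\mathbf{X}^{e}_{poi})^T \mathbf{K}^{er,\tau-1}_{upd} (\mathbf{K}^{rr}_{poi})^{-1} (\mathbf{t}^{r}_{poi} - \mathbf{t}^{r,\tau-1}_{upd}) \tag{10}$$
Therefore, the path travel time distribution, $T^{od}_{poi}$, can be determined from point detector data.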
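The imputation of mean travel times for links without point detectors can be sketched as follows. The sketch assumes the correction term has the form "previous cross-covariance times inverse detector covariance times change in detector means", which is the form that also appears in the mean constraint of the posterior update step. The helper names `solve` and `impute_means` and all numerical values are hypothetical.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def impute_means(t_e_prev, t_r_prev, t_r_now, K_er_prev, K_rr_now):
    """t_e = t_e_prev + K_er_prev * K_rr_now^{-1} * (t_r_now - t_r_prev)."""
    d = [a - b for a, b in zip(t_r_now, t_r_prev)]
    z = solve(K_rr_now, d)  # avoids forming the explicit inverse
    return [te + sum(k * zj for k, zj in zip(row, z))
            for te, row in zip(t_e_prev, K_er_prev)]

# Hypothetical example: two links with detectors, one link without.
t_e = impute_means(
    t_e_prev=[3.0],
    t_r_prev=[2.0, 5.0],
    t_r_now=[5.0, 8.0],
    K_er_prev=[[0.5, 0.25]],
    K_rr_now=[[2.0, 1.0], [1.0, 2.0]],
)
```

With these numbers the detector links slow down by 3 min each, the solved vector is [1, 1], and the undetected link's mean rises from 3.0 to 3.75 min. A positive cross-covariance therefore propagates congestion observed on detector links to neighboring links without detectors.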

Distribution Fusion Step
This step fuses the two path travel time distributions, $T^{od}_{int}$ and $T^{od}_{poi}$, estimated from interval and point detectors, respectively. A fusion algorithm built on the D-S evidence theory is proposed. In this study, the frame of discernment, $\Omega$, is defined as a set of mutually exclusive travel time ranges, $\Omega = \{S_1, \ldots, S_i, \ldots, S_n\}$, where each range $S_i$ is defined by a lower bound $l_i$ and an upper bound $u_i$. The mean travel time for range $S_i$ can be expressed as:
$$\bar{t}_i = \frac{l_i + u_i}{2} \tag{11}$$
Path travel time distributions estimated by interval and point detectors can be regarded as two independent sets of evidence. Based on the defined travel time ranges, these two path travel time distributions are discretized to obtain the corresponding discrete distributions, i.e., histograms, as illustrated in Figure 3a. The resultant discrete distributions, $m_{int}$ and $m_{poi}$, are respectively modelled as BPAs for $T^{od}_{int}$ and $T^{od}_{poi}$. Then, $m_{*}(S_i)$ (either $m_{int}(S_i)$ or $m_{poi}(S_i)$) represents the probability of travel time range $S_i$, and can be expressed as:
$$m_{*}(S_i) = \int_{l_i}^{u_i} f_{*}(x)\, dx \tag{12}$$
where $f_{*}(x)$ is the probability density function of $T^{od}_{*}$ (either $T^{od}_{int}$ or $T^{od}_{poi}$). When path travel time distributions follow normal distributions, $m_{*}(S_i)$ can be expressed as:
$$m_{*}(S_i) = \Phi_{snd}\!\left(\frac{u_i - t^{od}_{*}}{\sigma^{od}_{*}}\right) - \Phi_{snd}\!\left(\frac{l_i - t^{od}_{*}}{\sigma^{od}_{*}}\right) \tag{13}$$
where $\Phi_{snd}(\cdot)$ represents the cumulative distribution function (CDF) of the standard normal distribution. In the literature, Hart's formula [59] provides a good numerical approximation of $\Phi_{snd}(\cdot)$. Clearly, the BPA $m_{*}$ satisfies $m_{*}(\emptyset) = 0$ and $\sum_{S_i \subseteq \Omega} m_{*}(S_i) = 1$.
Figure 3 illustrates three typical situations of evidence conflict between the interval detector and point detector estimates. The two path travel time distributions are discretized into five travel time ranges, (5, 8), (8, 11), (11, 14), (14, 17), and (17, 20), which constitute the frame of discernment $\Omega = \{S_1, S_2, S_3, S_4, S_5\}$. The corresponding BPAs of the two path travel time distributions are shown in Table 1. Figure 3a shows Case 1, in which the two sets of evidence have a high belief level and a low degree of conflict, with a large portion of histogram overlap. Figure 3b shows Case 2, in which the two sets of evidence have a low belief level and a high degree of conflict, with only a small portion of histogram overlap. Figure 3c shows Case 3, in which the two sets of evidence are in complete conflict, with no histogram overlap.
Table 1 shows the results of evidence fusion using Dempster's combination rule, Equation (1). As shown, this combination rule provides a good estimation of the path travel time distribution for Case 1, which has a low conflict factor.
To reduce the degree of data conflict, the generalized combination rule is adopted in this study by introducing the unknown state into the frame of discernment, $\Omega = \{S_1, \ldots, S_i, \ldots, S_n, \Theta\}$. Subsequently, to construct the BPA $m_{*}$ (either $m_{int}$ or $m_{poi}$), a pre-defined small probability $\alpha_{\Theta} = m_{*}(\Theta)$ (e.g., $\alpha_{\Theta} = 0.05$) is set for the unknown state $\Theta$. Then, the path travel time distribution between $t^{od}_{*} + \Phi^{-1}_{snd}(\alpha_{\Theta}/2)\, \sigma^{od}_{*}$ and $t^{od}_{*} + \Phi^{-1}_{snd}(1 - \alpha_{\Theta}/2)\, \sigma^{od}_{*}$ is discretized into the travel time ranges, where $\Phi^{-1}_{snd}(\cdot)$ is the inverse CDF of the standard normal distribution (e.g., $\Phi^{-1}_{snd}(0.025) = -1.96$ and $\Phi^{-1}_{snd}(0.975) = 1.96$).
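The discretization of a normal path travel time distribution into travel time ranges can be sketched in a few lines. This sketch evaluates the standard normal CDF with Python's `math.erf` rather than Hart's formula, which is a substitution made here for brevity. The mean of 12.5 min and STD of 2.0 min are hypothetical values; only the five ranges come from the example in the text.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function (swapped in for Hart's formula)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def discretize(mean, std, bounds):
    """Probability mass of N(mean, std^2) inside each range (l_i, u_i)."""
    return [normal_cdf((u - mean) / std) - normal_cdf((l - mean) / std)
            for l, u in bounds]

# The five travel time ranges used in the example (minutes).
bounds = [(5, 8), (8, 11), (11, 14), (14, 17), (17, 20)]
masses = discretize(12.5, 2.0, bounds)   # hypothetical mean and STD
leftover = 1.0 - sum(masses)             # tail mass outside (5, 20)
```

Because the assumed mean sits at the center of the middle range, the resulting histogram is symmetric and peaks at the third range. The small `leftover` value is the tail mass that falls outside the finite ranges.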
High and complete conflict situations are usually due to varying data quality across the different detectors. To differentiate data sources of varying quality, an information quality parameter [48] is adopted in this study to assign higher weighting to data sources with better information quality. Let $w_{int}$ and $w_{poi}$ be the information quality weights for the path travel time distributions from interval and point detectors, respectively. In this study, $w_{int}$ and $w_{poi}$ are expressed as functions of sample size and travel time variance, where $N_{int}$ is the sample size collected by interval detectors; $N_{poi}$ is the average sample size for all point detectors along the path; and $\beta_{int}$ and $\beta_{poi}$ are sensitivity parameters for interval and point detectors, respectively, which should be calibrated independently. Other types of information quality function could also be used in practice.
Applying the weights $w_{int}$ and $w_{poi}$, the BPA $m_{*}$ (either $m_{int}$ or $m_{poi}$) is adjusted using the following formula [48]:
$$m'_{*}(S_i) = \frac{w_{*}}{w_{max}}\, m_{*}(S_i), \qquad m'_{*}(\Theta) = 1 - \sum_{i=1}^{n} m'_{*}(S_i)$$
where $w_{max} = \max(w_{int}, w_{poi})$ is the larger of $w_{int}$ and $w_{poi}$. Substituting the adjusted BPAs into Equations (1) and (2), the fused BPA, $m_{fus}$, can be determined following the generalized combination rule:
$$m_{fus}(S_i) = \frac{m'_{int}(S_i)\, m'_{poi}(S_i) + m'_{int}(S_i)\, m'_{poi}(\Theta) + m'_{int}(\Theta)\, m'_{poi}(S_i)}{1 - \eta}, \qquad m_{fus}(\Theta) = \frac{m'_{int}(\Theta)\, m'_{poi}(\Theta)}{1 - \eta}$$
with $\eta = \sum_{i \neq j} m'_{int}(S_i)\, m'_{poi}(S_j)$.
Table 2 illustrates the distribution fusion built on the generalized combination rule, using the same example as in Table 1. In this example, $m_{int}(\Theta) = m_{poi}(\Theta) = 0.05$ are set, and the two BPAs, $m_{int}$ and $m_{poi}$, are modified to reflect this setting. Information quality parameters $w_{int} = 0.8$ and $w_{poi} = 0.6$ are used for interval and point detectors, respectively. All BPAs $m_{poi}$ for these cases are adjusted by the factor $w_{poi}/w_{max} = 0.75$. The generalized combination rule, Equation (16), was adopted for fusing the path travel time distributions.
Table 2 shows that the generalized combination rule provides a reasonable outcome for Case 1 (i.e., the low conflict situation). More importantly, this generalized combination rule can well address the distribution fusion problem for Case 2 (i.e., the high conflict situation). Introducing $\Theta$ significantly reduced the conflict factor $\eta$ to 0.6614. The probability of $S_3$, which has little support from both evidence sets, is only slightly strengthened as $m_{fus}(S_3) = (0.075 \times 0.056 + 0.075 \times 0.288 + 0.05 \times 0.056)/(1 - 0.6614) = 0.0874$. The probabilities of the other travel time ranges, $S_1$, $S_2$, $S_4$, and $S_5$, are reduced, but a higher weighting is given to the data source with better data quality (i.e., $m_{int}$). The generalized combination rule also addressed the distribution fusion for Case 3 (i.e., the complete conflict situation), which cannot be fused using Dempster's combination rule.
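A compact sketch of the weighted adjustment and the generalized combination with an unknown state is given below. It assumes that each range mass is scaled by the ratio of the source weight to the larger weight and that the removed mass is moved to the unknown state; this assumption reproduces the adjusted point-detector mass of about 0.288 on the unknown state used in the worked example. The function names and the two-range, fully conflicting BPAs are hypothetical.

```python
THETA = "Theta"  # label for the unknown state

def discount(m, w, w_max):
    """Scale range masses by w / w_max and move the removed mass to THETA."""
    out = {s: v * w / w_max for s, v in m.items() if s != THETA}
    out[THETA] = 1.0 - sum(out.values())
    return out

def fuse(m1, m2):
    """Dempster-style combination in which THETA is compatible with every range."""
    fused, conflict = {}, 0.0
    for s1, v1 in m1.items():
        for s2, v2 in m2.items():
            if s1 == s2 or s2 == THETA:
                key = s1
            elif s1 == THETA:
                key = s2
            else:                      # two different ranges: conflicting mass
                conflict += v1 * v2
                continue
            fused[key] = fused.get(key, 0.0) + v1 * v2
    return {s: v / (1.0 - conflict) for s, v in fused.items()}, conflict

# Hypothetical complete-conflict case: the sources support different ranges.
w_int, w_poi = 0.8, 0.6
w_max = max(w_int, w_poi)
m_int = discount({"S1": 0.95, THETA: 0.05}, w_int, w_max)
m_poi = discount({"S5": 0.95, THETA: 0.05}, w_poi, w_max)
m_fus, eta = fuse(m_int, m_poi)
```

Even though the two sources share no range, the mass on the unknown state keeps the conflict factor below one, so the fusion remains defined. The fused result favors the range supported by the higher-weighted source.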
From the fused BPA, $m_{fus}$, the corresponding mean, $t^{od}_{fus}$, and STD, $\sigma^{od}_{fus}$, can be calculated, where $\theta$ is the adjustment parameter used to assign the probability of the unknown state to each travel time range. Thus, the proposed D-S distribution fusion algorithm can estimate path travel time distributions by fusing the two path travel time distributions from interval and point detector data, even in cases of extreme conflict between the data sets.

Posterior Update Step
This step updates link travel time distributions and their travel time correlations based on the fused distribution. Let $(k^{ee}_{upd})_{ij}$ be the element at row $i$ and column $j$ of $\mathbf{K}^{ee}_{upd}$. This study uses $\mathbf{t}^{r}_{upd} = \mathbf{t}^{r}_{poi}$ and $\mathbf{K}^{rr}_{upd} = \mathbf{K}^{rr}_{poi}$, because $\mathbf{t}^{r}_{poi}$ and $\mathbf{K}^{rr}_{poi}$ are directly obtained from point detector data and assumed to be accurate. Therefore, to update the link travel time covariance matrix, only the $\mathbf{K}^{ee}_{upd}$ and $\mathbf{K}^{er}_{upd}$ sub-matrices need to be updated, since $\mathbf{K}^{re}_{upd} = (\mathbf{K}^{er}_{upd})^T$ holds. Accordingly, the optimization problem of updating the spatial correlations is formulated as the following nonlinear programming problem, M1, in which objective function (22) is minimized subject to:
$$t^{od}_{fus} = (\mathbf{X}^{r}_{poi})^T \mathbf{t}^{r}_{upd} + (\mathbf{X}^{e}_{poi})^T \mathbf{t}^{e,\tau-1}_{upd} + (\mathbf{X}^{e}_{poi})^T \mathbf{K}^{er}_{upd} (\mathbf{K}^{rr}_{poi})^{-1} (\mathbf{t}^{r}_{poi} - \mathbf{t}^{r,\tau-1}_{upd}) \tag{23}$$
$$(\sigma^{od}_{fus})^2 = (\mathbf{X}^{r}_{poi})^T \mathbf{K}^{rr}_{poi} \mathbf{X}^{r}_{poi} + (\mathbf{X}^{e}_{poi})^T \mathbf{K}^{ee}_{upd} \mathbf{X}^{e}_{poi} + 2 (\mathbf{X}^{e}_{poi})^T \mathbf{K}^{er}_{upd} \mathbf{X}^{r}_{poi} \tag{24}$$
The nonlinear program M1 has a convex objective function and two linear constraints. To ensure that $\mathbf{K}_{upd}$ is stable over time, objective function (22) minimizes the total change in the updated elements of both the $\mathbf{K}^{ee}_{upd}$ and $\mathbf{K}^{er}_{upd}$ sub-matrices. Constraints (23) and (24), derived from Equations (9) and (10), ensure that the mean and variance obtained by summing over the corresponding link travel time distributions equal those of the fused path travel time distribution, i.e., $t^{od}_{fus}$ and $(\sigma^{od}_{fus})^2$. The M1 problem is a typical quadratic programming problem. A unique solution can be determined using several efficient algorithms, such as the quadprog function in MATLAB. Once $\mathbf{K}_{upd}$ is determined, the vector of travel time means for links without point detectors, $\mathbf{t}^{e}_{upd}$, is updated as:
$$\mathbf{t}^{e}_{upd} = \mathbf{t}^{e,\tau-1}_{upd} + \mathbf{K}^{er}_{upd} (\mathbf{K}^{rr}_{poi})^{-1} (\mathbf{t}^{r}_{poi} - \mathbf{t}^{r,\tau-1}_{upd}) \tag{25}$$
The updated $\mathbf{t}_{upd}$ and $\mathbf{K}_{upd}$ are used for estimating travel time distributions of links without point detectors in the subsequent time interval. The detailed steps of Algorithm 1 are summarized as follows.
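The structure of the update can be illustrated with a small sketch. It assumes, as a simplification, that the objective is the sum of squared changes in the unknown covariance elements; under that assumption, a problem with two linear equality constraints has a closed-form solution as a projection, so no quadratic programming solver is needed. The function names, the element ordering, and all numbers are hypothetical, and the sketch does not enforce positive semidefiniteness of the updated matrix.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def project_onto_constraints(x0, A, b):
    """Minimize ||x - x0||^2 subject to the two linear equalities A x = b.

    Closed form: x = x0 + A^T (A A^T)^{-1} (b - A x0).
    """
    r = [b[k] - dot(A[k], x0) for k in range(2)]      # constraint residuals
    g00, g01, g11 = dot(A[0], A[0]), dot(A[0], A[1]), dot(A[1], A[1])
    det = g00 * g11 - g01 * g01                       # determinant of A A^T
    lam0 = (g11 * r[0] - g01 * r[1]) / det
    lam1 = (g00 * r[1] - g01 * r[0]) / det
    return [x + lam0 * a0 + lam1 * a1 for x, a0, a1 in zip(x0, A[0], A[1])]

# Hypothetical case: one link with a detector (r) and two without (e1, e2),
# all on the path. Unknowns: x = [k_e1r, k_e2r, k_e1e1, k_e1e2, k_e2e1, k_e2e2].
z = 1.0  # scalar value of K_rr^{-1} (t_r_poi - t_r_prev), assumed known
x_prev = [0.3, 0.2, 1.0, 0.4, 0.4, 0.9]   # previous-interval covariances
A = [
    [z, z, 0.0, 0.0, 0.0, 0.0],       # mean constraint: coefficients of K_er
    [2.0, 2.0, 1.0, 1.0, 1.0, 1.0],   # variance constraint: 2*K_er and K_ee terms
]
b = [0.6, 4.0]  # right-hand sides after moving the known terms across
x_upd = project_onto_constraints(x_prev, A, b)
```

In this example the previous covariances already satisfy neither constraint exactly, and the projection spreads the required correction across all unknown elements in proportion to their constraint coefficients. The two off-diagonal elements of the undetected-link block receive the same correction, so symmetry is preserved.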

Algorithm 1
Step 1. Data preprocessing stage: Estimate $T^{od}_{int}$ from interval detector data at the current interval. Estimate $t^{r}_{poi}$ and $K^{rr}_{poi}$ for links with point detectors at the current interval. Deduce $t^{e}_{poi}$ and $v^{e}_{poi}$ for links without point detectors using Equations (6) and (7). Estimate $T^{od}_{poi}$ using Equations (9) and (10).
Step 2. Distribution fusion stage: Estimate $T^{od}_{fus}$ by fusing $T^{od}_{int}$ and $T^{od}_{poi}$ based on Equations (11)-(21).
Step 3. Posterior update stage: Update $K_{upd}$ using Equations (22)-(24), and update $t_{upd}$ using Equation (25). Set $K^{-1}_{upd} = K_{upd}$ and $t^{-1}_{upd} = t_{upd}$. Go to Step 1 for the next time interval.
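The variance constraint in the posterior update relies on splitting the path variance into blocks for links with point detectors (r) and links without them (e). The sketch below checks that identity numerically for a hypothetical three-link path; the function names and all numbers are illustrative assumptions, and the quadratic program itself is not solved here.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def quad(u, m, v):
    """Return u^T M v for a matrix stored as a list of rows."""
    return sum(u[i] * dot(m[i], v) for i in range(len(u)))


def path_moments(x_r, x_e, t_r, t_e, k_rr, k_ee, k_er):
    """Path mean and variance from partitioned link means and covariances.

    x_r, x_e: link-path incidence vectors for the r and e link sets.
    k_er has one row per e link and one column per r link.
    """
    mean = dot(x_r, t_r) + dot(x_e, t_e)
    var = (quad(x_r, k_rr, x_r) + quad(x_e, k_ee, x_e)
           + 2.0 * quad(x_e, k_er, x_r))
    return mean, var


# Hypothetical path: link 0 has a point detector, links 1 and 2 do not.
t_r, t_e = [40.0], [55.0, 30.0]          # mean link travel times (s)
k_rr = [[16.0]]
k_ee = [[25.0, 5.0], [5.0, 9.0]]
k_er = [[6.0], [2.0]]
mean, var = path_moments([1.0], [1.0, 1.0], t_r, t_e, k_rr, k_ee, k_er)

# The same variance from the full 3 x 3 covariance matrix, x^T K x.
k_full = [[16.0, 6.0, 2.0], [6.0, 25.0, 5.0], [2.0, 5.0, 9.0]]
ones = [1.0, 1.0, 1.0]
print(mean, var, quad(ones, k_full, ones))
```

Because the block form is linear in the entries of the e-e and e-r blocks, fixing the r-r block leaves a constraint that is linear in the unknowns, which is consistent with the quadratic programming formulation described above.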

Numerical Experiments
Performance of the proposed heterogeneous data fusion method was investigated using real-world data from Hong Kong, as shown in Figure 4. A path from Aberdeen tunnel in Hong Kong Island to the Cross Harbor tunnel (CHT) in Kowloon urban area was selected for this case study. CHT is the most congested of the three tunnels connecting Kowloon urban area and Hong Kong Island. The total travel distance of the chosen path was 3.7 km with free-flow travel time 3.6 min. There were 11 links in the chosen path, with only two, Links 1 and 5, equipped with Autoscope video image detectors (VIDs), which is a popular type of point detector. Two AVI devices were installed at the beginning and end of the chosen path for automatic toll collection. Market penetration of AVI systems was approximately 40%. Real-time AVI data were also utilized for the implementation of RTIS (real-time traveler information systems) in Hong Kong [14]. Detailed information of this AVI system was provided in Tam and Lam [14].
Traffic data from both interval and point detectors were collected from 07:00 to 23:00 on a typical weekday: Wednesday, 20 August 2014. An offline link travel time covariance matrix was obtained from RTIS [14] as the initial $K_{fus}$. To evaluate the performance of the proposed heterogeneous data fusion method, a manual license plate matching survey was performed. Video recording equipment was set at the starting and end nodes of the chosen path to record the license plate readings of vehicles. The vehicles recorded at the starting and end nodes were manually matched. Path travel times of matched vehicles were computed as ground truth for accuracy validation.

Evaluation Metrics
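Two widely accepted metrics, mean absolute percentage error (MAPE) and root mean square error (RMSE), were adopted to evaluate the accuracy of the estimated mean of path travel time distributions:

$$\mathrm{MAPE}_{t} = \frac{1}{n}\sum_{\tau=1}^{n}\frac{\left|t^{od}_{fus}(\tau) - t^{od}_{obs}(\tau)\right|}{t^{od}_{obs}(\tau)} \times 100\% \tag{26}$$

$$\mathrm{RMSE}_{t} = \sqrt{\frac{1}{n}\sum_{\tau=1}^{n}\left(t^{od}_{fus}(\tau) - t^{od}_{obs}(\tau)\right)^{2}} \tag{27}$$

where $n$ is the number of time intervals during the period of interest, and $t^{od}_{obs}(\tau)$ is the ground truth observed mean travel time obtained from the field survey at time interval $\tau$. Smaller $\mathrm{MAPE}_{t}$ and $\mathrm{RMSE}_{t}$ indicate higher accuracy of the estimated mean travel time.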
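The MAPE and RMSE concepts were extended to evaluate the accuracy of the estimated STD of the path travel time as:

$$\mathrm{MAPE}_{\sigma} = \frac{1}{n}\sum_{\tau=1}^{n}\frac{\left|\sigma^{od}_{fus}(\tau) - \sigma^{od}_{obs}(\tau)\right|}{\sigma^{od}_{obs}(\tau)} \times 100\%$$

$$\mathrm{RMSE}_{\sigma} = \sqrt{\frac{1}{n}\sum_{\tau=1}^{n}\left(\sigma^{od}_{fus}(\tau) - \sigma^{od}_{obs}(\tau)\right)^{2}}$$

where $\sigma^{od}_{obs}(\tau)$ represents the ground truth observed travel time STD obtained from the field survey at time interval $\tau$.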
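The two error metrics can be computed with a few lines of code. The sketch below assumes the standard definitions of MAPE and RMSE over paired series of estimated and observed values, one value per time interval; the same functions apply to means and to STDs. The function names and sample values are hypothetical.

```python
import math


def mape(estimated, observed):
    """Mean absolute percentage error, in percent."""
    n = len(observed)
    return 100.0 * sum(abs(e - o) / o for e, o in zip(estimated, observed)) / n


def rmse(estimated, observed):
    """Root mean square error, in the units of the input series."""
    n = len(observed)
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(estimated, observed)) / n)


# Hypothetical mean path travel times (min) for three time intervals.
t_fus = [11.0, 9.0, 10.0]
t_obs = [10.0, 10.0, 10.0]
print(mape(t_fus, t_obs), rmse(t_fus, t_obs))
```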
For many transportation applications, it is meaningful to construct a travel time interval at a given confidence level from the estimated travel time distribution [60,61]. The travel time interval accuracy represents the integrated accuracy of both the estimated mean and STD. Two metrics were adopted to evaluate these accuracies: probability outside the predicted (estimated) time interval (POPI) and probability outside the observed time interval (POOI) [62]. The POPI measures the percentage of observed data outside the estimated travel time interval, while the POOI measures the percentage of estimated distribution outside the observed travel time interval.
Let $l_{fus} = \Phi^{-1}_{fus}(\alpha/2)$ and $u_{fus} = \Phi^{-1}_{fus}(1 - \alpha/2)$ be the lower and upper bounds of the estimated travel time interval, respectively, at confidence level $1 - \alpha$, where $\Phi^{-1}_{fus}(\cdot)$ is the inverse CDF of the estimated path travel time distribution. Then:

$$\mathrm{POPI} = \Phi_{obs}(l_{fus}) + 1 - \Phi_{obs}(u_{fus})$$

where $\Phi_{obs}(\cdot)$ is the CDF of the observed travel time distribution. The POPI value ranges from 0 to 1. A smaller POPI indicates that a larger proportion of the observed data is captured, i.e., higher accuracy of the estimated travel time interval. As noted by Shi et al. [62], the POPI metric is very useful, but tends to exhibit bias for wide travel time intervals caused by large STD errors.
As an alternative, the POOI metric is the percentage of the estimated distribution outside the observed travel time interval. Let $l_{obs} = \Phi^{-1}_{obs}(\alpha/2)$ and $u_{obs} = \Phi^{-1}_{obs}(1 - \alpha/2)$ denote the lower and upper bounds of the observed travel time interval, respectively, at confidence level $1 - \alpha$, where $\Phi^{-1}_{obs}(\cdot)$ is the inverse CDF of the observed path travel time distribution, and $\Phi_{fus}(\cdot)$ denotes the CDF of the estimated travel time distribution. Then:

$$\mathrm{POOI} = \Phi_{fus}(l_{obs}) + 1 - \Phi_{fus}(u_{obs})$$

POOI also ranges from 0 to 1, and a larger POOI indicates lower accuracy of the estimated travel time interval, because a larger proportion of the estimated distribution lies outside the observed travel time interval. Therefore, the POPI and POOI metrics are complementary in evaluating the accuracy of the estimated path travel time distribution.
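The two interval metrics can be sketched in code by treating both the estimated and the observed path travel time distributions as normal, which matches the normality assumption of this study but is a simplification for the observed data (an empirical CDF could be substituted). The function name `interval_metrics` and the example parameters are hypothetical.

```python
from statistics import NormalDist


def interval_metrics(est, obs, alpha):
    """Return (POPI, POOI) for two distributions at confidence level 1 - alpha.

    est: estimated path travel time distribution.
    obs: observed path travel time distribution.
    """
    # Bounds of the estimated and observed travel time intervals.
    l_fus, u_fus = est.inv_cdf(alpha / 2), est.inv_cdf(1 - alpha / 2)
    l_obs, u_obs = obs.inv_cdf(alpha / 2), obs.inv_cdf(1 - alpha / 2)
    # Share of the observed distribution outside the estimated interval.
    popi = obs.cdf(l_fus) + 1.0 - obs.cdf(u_fus)
    # Share of the estimated distribution outside the observed interval.
    pooi = est.cdf(l_obs) + 1.0 - est.cdf(u_obs)
    return popi, pooi


# Hypothetical case: correct mean, overestimated STD (travel times in min).
estimated = NormalDist(mu=10.0, sigma=3.0)
observed = NormalDist(mu=10.0, sigma=2.0)
print(interval_metrics(estimated, observed, alpha=0.2))
```

With identical distributions both metrics equal α. Overestimating the STD widens the estimated interval, which pushes POPI below α and POOI above α, so reading the two metrics together exposes an STD error that POPI alone would hide.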

Experimental Results
This section reports experimental results of the case study using the proposed heterogeneous data fusion method. Travel time distributions for the chosen path and links were estimated every 2 min. The probability of the unknown state for both interval and point detectors was set as $\alpha_{\Theta} = m_{int}(\Theta) = m_{poi}(\Theta) = 0.05$, and the sensitivity parameters in Equations (15) and (16) were set as $\beta_{int} = 0.2$ and $\beta_{poi} = 0.8$, according to the sensitivity analysis results obtained from Dion and Rakha [34]. Setting $\beta_{poi} = 4\beta_{int}$ assigns a higher level of information quality to the interval detector data than to the point detector data, given the same sample sizes.
Figure 5 shows the two path travel time distributions, $T^{od}_{int}$ and $T^{od}_{poi}$, estimated from interval and point detectors, respectively, in the data preprocessing step. Travel time intervals were constructed at the 95% confidence level, i.e., $\alpha = 0.05$, for both interval and point detectors, shown in blue and red, respectively. Observed data from the field survey, shown as green dots, were only used for accuracy validation. As shown in the figure, the two estimated travel time intervals from the different data sources cover most of the observed data during the period of interest. The two estimated travel time distributions show high consistency during off-peak periods (21:00-23:00 and 7:00-7:30), slight inconsistency during inter-peak periods (10:00-16:00), and high inconsistency during peak periods (7:30-10:00 and 16:00-21:00). In general, $T^{od}_{int}$ tended to have higher accuracy than $T^{od}_{poi}$. This was expected, since $T^{od}_{int}$ was estimated directly from interval detector data, whereas $T^{od}_{poi}$ was estimated from point detector data through spatial interpolation. This result also justifies the chosen sensitivity parameters, which reflect the higher level of information quality of the interval detector data.
Figure 6 shows the resultant path travel time distribution after fusing the two path travel time distributions from Figure 5.
A confidence level of 80%, i.e., $\alpha = 0.2$, was used to construct the travel time interval and calculate the POPI and POOI metrics. The proposed heterogeneous data fusion method provided an accurate and robust estimation of the mean travel time, $t^{od}_{fus}$, throughout the period of interest, with $\mathrm{MAPE}_{t} = 7.1\%$. However, the relatively large $\mathrm{MAPE}_{\sigma} = 17.9\%$ showed that the proposed method overestimated the path travel time distribution STD, $\sigma^{od}_{fus}$, for the period of interest. This highlights the challenge of accurately estimating $\sigma^{od}_{fus}$ in congested road networks. One major reason may be the difficulty of estimating the population $\sigma^{od}_{obs}$ from biased samples of varying data quality. Fortunately, the slight STD overestimation could be beneficial to most travelers with risk-averse attitudes regarding travel time uncertainty. The POPI was 15.7%, somewhat better than the target (20%), which indicates that a high proportion (84.3%) of the observed data was covered by the estimated path travel time interval. It can also be seen from the figure that the estimated interval was not too wide, despite the relatively large STD error. The POOI was 25.6%, somewhat larger than the target (20%). Thus, overall the POPI and POOI metrics verified that the proposed heterogeneous data fusion method could obtain accurate and robust estimations of the path travel time interval (i.e., path travel time distribution) by fusing heterogeneous interval and point detector data.

Comparison of Data Fusion and Single Data Source Results
In this section, the effectiveness of the proposed heterogeneous data fusion method was investigated by comparing the data fusion results with those estimated from a single data source. The path travel time distribution estimated from interval detector data alone (i.e., $T^{od}_{int}$) is shown in Figure 5 in blue. The path travel time distribution estimated from point detector data alone is shown in Figure 7 in blue, and differs from the $T^{od}_{poi}$ estimation shown in Figure 5. It should be noted that the single-source estimate was generated using fixed offline spatial correlations obtained from RTIS, whereas the $T^{od}_{poi}$ in Figure 5 was generated by the proposed heterogeneous data fusion method using the updated spatial correlations.

Figure 7 shows the travel time intervals of the two point-detector-based estimates, obtained with fixed offline spatial correlations (blue) and with updated spatial correlations (red), for comparison. The 80% confidence level was used for constructing the travel time intervals and calculating the POPI and POOI metrics. As illustrated, by using updated spatial correlations, the accuracy of the path travel time distribution estimated from point detector data was significantly improved. The $\mathrm{MAPE}_{t}$, $\mathrm{MAPE}_{\sigma}$, POPI, and POOI metrics were reduced by 46.4% (i.e., 1 - 24.9%/46.5%), 78.9%, 21.1%, and 22.1%, respectively. This validates the effectiveness of the proposed optimization technique for updating travel time spatial correlations. The result also highlights the necessity of considering the dynamic nature of travel time spatial correlations in congested road networks, and implies that current spatial interpolation techniques [14,15] built on fixed spatial correlations may lead to considerable errors when imputing missing data.
Fusion of interval and point detector data can improve the accuracy of travel time distributions for links without point detectors. When only point detector data were used, travel time distributions for links without point detectors were indirectly estimated through the fixed spatial correlations. Fusing interval and point detector data provided better estimations of link travel time distributions from the updated spatial correlations. Figure 8 compares individual link travel time distributions estimated from point detector data and from the proposed data fusion method. Ground truth data for these link travel time distributions were not available for quantitative analysis of estimation accuracy.
Nevertheless, link travel time distributions estimated from the proposed heterogeneous data fusion method better capture dynamic traffic conditions, with more distinct peaks occurring during the morning and evening peak periods. The much superior accuracy of path travel time distribution estimation (see Table 3) also justifies this visual observation, because the path travel time distribution is the summation of corresponding link travel time distributions along the path.

Comparison of Different Distribution Fusion Algorithms
This section investigates the effectiveness of the proposed D-S distribution fusion algorithm built on the D-S evidence theory. To further evaluate and benchmark the proposed algorithm, a linear combination fusion algorithm built on the linear combination approach was also implemented. The linear combination approach (or simple convex combination approach) has been widely used as a simple and effective technique to fuse two independent estimations of mean travel times [11], where $w_{int}$ and $w_{poi}$ are the data quality of interval and point detectors, respectively, as defined in Equations (15) and (16). This study extended the linear combination approach to fuse two independent STD estimations, as: Assuming normal distributions, this extended linear combination fusion algorithm can be used to fuse path travel time distributions from interval and point detectors.
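As a rough illustration of the benchmark, a convex combination of two estimates can be sketched as follows. The exact formulas used in this study are not reproduced: normalizing the two data quality weights so that they sum to one, and applying the same weights to the STDs, are both assumptions made only for this sketch, and the function name `linear_fusion` and the example values are hypothetical.

```python
def linear_fusion(mean_int, std_int, w_int, mean_poi, std_poi, w_poi):
    """Convex combination of two (mean, STD) estimates.

    Assumption: weights are normalized to sum to one, and the STDs are
    combined with the same weights as the means.
    """
    total = w_int + w_poi
    if total <= 0.0:
        raise ValueError("at least one weight must be positive")
    a = w_int / total  # share given to the interval detector estimate
    fused_mean = a * mean_int + (1.0 - a) * mean_poi
    fused_std = a * std_int + (1.0 - a) * std_poi
    return fused_mean, fused_std


# Hypothetical path travel time estimates (min) and data quality weights.
print(linear_fusion(12.0, 2.0, 0.75, 16.0, 4.0, 0.25))
```

A combination of this kind always returns a value between the two inputs, whatever the degree of conflict between them, which is the behavior the D-S based algorithm is benchmarked against.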
In this study, the same set of input data was used to validate the results of the proposed D-S distribution fusion and the linear combination fusion algorithms. Path travel time distributions of interval and point detectors obtained in the data preprocessing step, shown in Figure 5, were employed as the input data. Figure 9 reports the fused path travel time distributions using these two algorithms. As shown, the proposed D-S distribution fusion algorithm produces better path travel time distribution estimates than the linear combination fusion algorithm. The proposed algorithm reduces the $\mathrm{MAPE}_{t}$, $\mathrm{MAPE}_{\sigma}$, POPI, and POOI metrics by 58.6%, 15.3%, 37.2%, and 38.0%, respectively, compared to the linear combination fusion algorithm. This result indicates that the D-S evidence theory is effective for fusing inaccurate and inconsistent distribution data from multiple sources under various information conflict situations, including highly consistent, slightly inconsistent, and highly inconsistent situations.

Conclusions and Future Research
Provision of travel time distribution information is a crucial requirement for travelers to make reliable path choice decisions incorporating travel time uncertainties. With advances in information and communication technologies, interval detectors (such as automatic vehicle identification devices) and point detectors (such as loop detectors) are being increasingly deployed in road networks. These interval and point detectors generate heterogeneous data sources with distinct characteristics of data quality and network coverage. Fusing these heterogeneous data can be beneficial for robust and accurate estimation of travel time distribution information.
This paper proposed a heterogeneous data fusion method to estimate travel time distributions, fusing heterogeneous data from point and interval detectors. The proposed method consisted of three steps. The first step, i.e., data preprocessing, was to respectively estimate path travel time distributions from interval and point detector data. The spatially missing data issue of point detectors was addressed. The travel time distributions of links without point detectors were imputed based on their spatial correlations with links that had point detectors. The second step, i.e., distribution fusion, was to fuse these two path travel time distributions estimated from interval and point detectors. A D-S distribution fusion algorithm built on the Dempster-Shafer evidence theory was proposed to fuse path travel time distributions from different data sources with various information qualities. The third step, i.e., posterior update, was to update link travel time distributions and their spatial correlations. The problem of updating spatial correlations was formulated and solved as a quadratic programming problem with a convex objective function and two linear constraints.
To validate the accuracy of the proposed heterogeneous data fusion method, a case study was performed using real-world data from RTIS in Hong Kong. The results validated that the proposed method can obtain robust and accurate estimations of path travel time distributions in congested road networks. Compared with either interval or point detectors alone, the proposed data fusion method can significantly reduce estimation errors for path travel time distributions with respect to the $\mathrm{MAPE}_{t}$, $\mathrm{MAPE}_{\sigma}$, POPI, and POOI metrics. The proposed D-S distribution fusion algorithm was also compared to a linear combination algorithm for the same case study, and it showed that the proposed D-S distribution fusion algorithm can generate a robust and accurate fusion of travel time distributions over the whole period of interest, including highly consistent, slightly inconsistent, and highly inconsistent situations for the different data sources. Furthermore, the results of the case study indicated that the proposed optimization technique can effectively update travel time spatial correlations under dynamic traffic conditions, and incorporation of updated spatial correlations greatly enhanced estimation accuracy of travel time distributions of the path and all links without point detectors. Therefore, the proposed D-S distribution algorithm was validated to be effective for fusing travel time distributions from different data sources under various information conflict situations, including highly consistent, slightly inconsistent, and highly inconsistent situations.
There are several worthwhile directions for future research. First, travel times in this study were assumed to follow normal distributions. However, several previous studies have found that travel times in congested road networks could be better represented by asymmetric distributions with strong positive skew, e.g., lognormal, gamma, or Burr distributions [10,57]. The proposed heterogeneous data fusion method can be easily extended to other types of distributions with two parameters, e.g., lognormal or gamma, by replacing Equation (14) with corresponding methods to calculate the cumulative distribution function. Second, the spatial interpolation proposed by Tam and Lam [14] was adopted in this study for imputing the travel time distributions of links without point detectors. However, other effective spatial interpolation techniques have been proposed, such as Kriging [15]. Integrating these alternative spatial interpolation techniques into the proposed heterogeneous data fusion method warrants further study. Third, the proposed data fusion method only considered heterogeneous data from point and interval detectors. How to extend the proposed method to incorporate floating car data needs further investigation. Fourth, the case study only involved a specific path. Extension of the proposed method to fuse travel time distributions of multiple paths between a pair of nodes is an interesting topic for further investigation. Last but not least, travel time distributions were estimated in this study for the current time interval. Extension of the proposed data fusion method to the problem of short-term travel time distribution prediction is another interesting topic for further study.