
Reliability of Historical Car Data for Operating Speed Analysis along Road Networks

Department of Civil, Constructional and Environmental Engineering, University of Rome La Sapienza, Via Eudossiana 18, 00184 Rome, Italy
Author to whom correspondence should be addressed.
Submission received: 15 January 2022 / Revised: 17 March 2022 / Accepted: 30 March 2022 / Published: 21 April 2022
(This article belongs to the Special Issue Data Science for Industry 4.0. Theory and Applications)


In recent years, progress in information and communication technology (ICT) has introduced new sources for traffic data collection and analysis. On-board sensors such as GPS-GPRS boxes, generally installed for insurance purposes, communicate information from circulating vehicles to data centers. Geographic location, date and time, and vehicle speed and direction are systematically transmitted and stored as Historical Car Data (HCD) from probe vehicles in the traffic stream. These databases provide a good opportunity to analyze vehicle motion in both the temporal and spatial domains. The aim of this study is to assess the reliability of this kind of data. Since instrumented vehicles account for a small percentage of the entire vehicle fleet, it is important to understand whether they can be considered a representative sample of the whole population. The paper presents a comparison of speed data obtained from HCD with those recorded by inductive-loop detectors and microwave radar sensors; the analysis required the definition of specific methodologies and procedures. The results show a high correspondence between the two sets of data. Therefore, HCD can be proposed for the detailed monitoring of, and studies on, the operating conditions of mobility along road networks.

1. Introduction

Operating speeds are one of the crucial aspects for the management, monitoring and analysis of a road network; therefore, their assessment is an interesting and constantly evolving research topic [1,2]. In order to define traffic operating conditions based on actual speed data, different traffic measurement devices have been used over the years, which can be divided into two groups: static and dynamic ones.
Among the static devices, there are traditional methods, such as manual counting, as well as more advanced ones such as road detection stations, automatic traffic counters, video cameras, radar and laser guns, and point-based sensors, like microwave radars and acoustic sensors [3,4,5,6,7,8,9,10]. These methods have the valuable ability to sample the entire traffic flow, but they cannot guarantee full coverage of the road network, since that would require a significant economic investment. To overcome this limitation, innovative methods of collecting speed data have been tested through the use of vehicles equipped with GPS devices, capable of recording and sending travel information to operation centers with a high sampling rate [11,12,13,14,15,16]. This new data source allows researchers to observe real driver behavior and develop detailed models of operating speeds.
In contrast to punctual and isolated measurements, continuous speed data are useful for creating and studying actual driver speed profiles, in relation to the behavior adopted in different territorial contexts [17,18].
Recent progress in the field of information and communication technologies (ICT) has further expanded and improved the connection between road users and the mobile network, simplifying the acquisition and exchange of a large amount of georeferenced data [19]. Specifically, the increasing use of mobile digital devices equipped with GPS, such as smartphones, tablets and black boxes mounted on vehicles for insurance purposes, generates a large and complex set of data, known as Big Data. It is therefore essential to define new tools and new methods able to store, manage and process them, through the definition of data-mining models [20,21,22,23,24].
The wide-scale collection of georeferenced data from road vehicles, known as Floating Car Data (FCD) if processed in real time or Historical Car Data (HCD) if collected and analyzed in different periods, is one of the traffic monitoring tools available today. It is a very effective and low-cost instrument for the study and evaluation of road traffic conditions. Its increasing use in road mobility analyses is due to the simplicity of the data acquisition method, as each vehicle anonymously sends data on its geolocation, speed, direction, and date and time of travel to a processing center. In particular, FCD/HCD represent an alternative solution for measuring speeds, travel times and the performance of the mobility system along a single road or the road network.
Since the vehicles which transmit data are free to move anywhere on the road network, they may be considered as floating probes in the traffic flow. Several studies have shown the potential of such data to deepen aspects such as drivers’ travel route prediction, proper perception by road users of the traffic signals, traffic conditions’ estimation and consistency evaluation between drivers’ behavior and theoretical design speeds in the various road elements [13,15,22,25,26,27,28].
Although probe vehicle data provide a large amount of information and represent an excellent technology to analyze traffic conditions and to manage road mobility, they have the disadvantage of a relatively low penetration rate. In fact, various studies state that vehicles equipped with GPS account for around 2–5% of the entire vehicle fleet [15,22,29,30], although in the last few years this percentage has rapidly increased.
In order to conduct correct analyses of operating speeds, it is necessary to ensure that the data from the probe vehicles is adequately representative of the entire fleet, in terms of the number and type of vehicles. In recent years, several studies have tried to verify the reliability of these data samples, relying mainly on examining the accuracy of the FCD/HCD travel times and speeds’ estimation by comparing them with data recorded by point-based sensors; the assessments were carried out with regard to different categories of infrastructures and vehicles [31,32].
One of the earliest studies investigated the benefits and limitations of FCD technology by analyzing a particular case study that compared data obtained from a taxi-FCD fleet with data from license plate recognition (LPR). Travel times’ evaluation indicated that the taxi-FCD system could provide good information about travel times, but it was not sufficient for direct, sole application, due to the limited data available and the high variation in data coverage. However, it became a valuable tool for integrating the data collected by local sensors [33].
Subsequently, for the Beijing highways, which showed a high penetration rate of FCD, a regression analysis was performed between the data detected by the Remote Traffic Microwave Sensors (RTMS) and the speeds of the FCD, obtaining a high correlation between the two data samples with R² equal to 0.97 [34].
Other studies have then assessed the speed quality of the dynamic surveys, comparing the measurements of the probe vehicles with the speed measurements of radar sensors. Good results were generally obtained, and in one case a regression analysis showed a non-linear relationship with a correlation coefficient of 0.82 [35], while in another study it was observed that radar sensors tended to measure slower speeds than loop detectors during periods of free flow conditions [36]. In the latter case, it was found that radar measurements are generally good, but several aspects were identified that should be considered before their implementation. Among these aspects, which can influence the quality of the measurements, are: lag, different data biases during the two phases of free and congested traffic, vulnerability to rainfall and sensitivity to the device mounting angle. On the other hand, FCD are certainly an important data source as they guarantee a higher spatial coverage and lower costs than installed detectors. However, the accuracy and representativeness of the FCD sample are closely related to the number of probe vehicles and the quality of GPS geolocation and data transmission. Consequently, data from multiple sources, i.e., static and floating sensors, are often managed and analyzed together, because their combined use improves the knowledge of traffic status and reduces the uncertainty of individual sources [14,37].
In this paper, in accordance with the scientific literature, the reliability of the HCD sample has been evaluated by directly comparing its information with the data recorded by inductive-loop detectors and microwave radar sensors, in terms of speed distribution. The obtained results show a high correlation between the two data sources. A good representativeness of this type of speed data is demonstrated, despite the relatively low penetration rate of probe vehicles.

2. Data and Methods

The analysis of the operating speed is an important topic to study in order to improve the knowledge of the operating conditions of a road network, as it allows the observation of the relationships between road users’ actual behavior and the characteristics of the infrastructure. Especially in road safety analyses, the operating speed is used to observe how it can be correlated with the frequency and severity of accidents along a route [38,39].
Generally, according to the scientific literature, the operating speed is assumed to be the 85th percentile of the distribution of the actual speeds practiced by users [40,41]. Therefore, the data source adopted must be as reliable as possible, in order to obtain operating speed profiles that effectively represent the actual trend. In the past, most operating speed models were based on speed data collected on specific road sections, through static acquisition methods, able to provide data only in the temporal domain. Subsequently, attempts were made to overcome this limitation by introducing and experimenting with innovative methods, in particular by collecting speed data from vehicles equipped with GPS systems. This new methodology allowed researchers to observe the actual behavior of drivers and to develop more effective operating speed models, thus providing a more accurate representation of the real phenomenon [42,43,44,45,46,47,48,49]. It should be noted that the data from the point-based sensors refer only to the road sections in which they are installed, providing information only in the temporal domain. The probe vehicles, on the other hand, allow the observation of the data in the temporal and spatial domains, providing information on mobility for the entire route. Despite the advantages of this new method of data collection, the reliability of speed data obtained from “floating cars” needs to be verified, because instrumented vehicles are very few compared to the entire fleet.
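As an illustration of the 85th-percentile definition, the following minimal Python sketch computes an operating speed from a list of spot speeds. The sample values and the linear-interpolation rule are assumptions made for the example; the text does not state which percentile estimator was used.

```python
def percentile(values, p):
    """Percentile of a sample (p in [0, 100]) using linear interpolation
    between the two closest ranks. The interpolation rule is an assumption."""
    if not values:
        raise ValueError("empty sample")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


# Hypothetical spot speeds in km/h at one measurement section.
speeds_kmh = [52, 58, 61, 63, 65, 66, 68, 70, 72, 75, 79]

# Operating speed taken as the 85th percentile of the observed speeds.
v85 = percentile(speeds_kmh, 85)
print(f"V85 = {v85:.1f} km/h")
```

The same function can be applied separately to the sensor sample and to the HCD sample of a section to compare their operating speeds.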
In order to investigate this aspect, in this research the performed analyses were focused on identifying a correspondence between the HCD obtained from a sample of instrumented vehicles and the data returned by one of the classic traffic detection methods. In particular, it was decided to carry out a comparison of the HCD with the data collected by the Automatic Statistical Traffic Detection System of the Italian national road network manager, ANAS SpA. The Traffic Observatory is a structure that the company mainly uses to provide traffic data and information to users; all the sensors send their data to a central Platform for Monitoring and Analysis—called PANAMA—and the reliability of the acquired data is ensured by a series of control procedures [50].
In particular, the study analyzed the data that came from the measurement sections placed along the two-lane rural roads of the Veneto Region, managed by the ANAS Company, as shown in Figure 1.
A detailed traffic survey was available for each control unit. Among the different information, the variables employed for the analysis are shown in Table 1. In detail, Time Reference is the date and time of the data acquisition, Lane represents the lane on which the vehicle was traveling, Direction indicates the direction of travel, Speed (km/h) is the value of the speed recorded by the control unit and Vehicle Class is a code assigned to identify the different types of vehicles (i.e., 1–9 classes).
The sample of Historical Car Data, on the other hand, was acquired from a commercial operator who has GPS data coming anonymously from over four million black boxes mounted on passenger cars and heavy vehicles, in addition to those generated by 1.5 million Apps downloaded to customers’ smartphones [51].
The actual speeds were not extracted from Big Data in real time; instead, the HCD were used, and these data were subsequently processed with relational database management software, which can store and manage large amounts of data. Table 2 shows the most important information that can be extracted from the dataset for the proposed study: Identification code; Longitude (in WGS 84 coordinates); Latitude (in WGS 84 coordinates); Direction, the vehicle’s travel direction expressed as an azimuth angle; Speed; Date and Time of the signal emission; Signal Quality; Vehicle ID, a code assigned by the collection center to each individual vehicle during its travel; and Vehicle Type.
The two data sources (point-based sensors and HCD) provided information for three months, namely August 2018, February 2019 and May 2019, for locations within the Veneto Region. Specifically, for the time periods and the location analyzed, the HCD sample amounts to almost one billion data items, as shown in Table 3.
A study of the statistical reliability of the data detected by the measurement sections was preliminarily performed. At first, the traffic flows recorded by the control units were divided into the two main travel directions, characterized by the lane (1 or 2)-direction (ascendant and descendant) combination, equal to “1A” and “2D”. Subsequently, the overtaking situations were identified by the combinations “1D” and “2A”.
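The lane-direction split described above can be sketched as a simple lookup. This is only an illustrative sketch: the function name and the record layout are hypothetical, and the single-letter direction codes are inferred from the "1A", "2D", "1D" and "2A" labels in the text.

```python
# Lane-direction codes as described in the text: lane 1 or 2 combined with
# direction "A" (ascendant) or "D" (descendant).
MAIN_FLOW = {"1A", "2D"}    # vehicles travelling in their own lane
OVERTAKING = {"1D", "2A"}   # vehicles recorded in the opposite lane


def classify_record(lane, direction):
    """Return 'main' or 'overtaking' for one sensor record."""
    code = f"{lane}{direction}"
    if code in MAIN_FLOW:
        return "main"
    if code in OVERTAKING:
        return "overtaking"
    raise ValueError(f"unexpected lane-direction code: {code}")


# Hypothetical records: (lane, direction, speed in km/h).
records = [(1, "A", 64.0), (2, "D", 71.0), (2, "A", 88.0)]
labels = [classify_record(lane, direction) for lane, direction, _ in records]
print(labels)
```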
The raw data show a trend similar to the Gaussian distribution, as is generally recognized in the literature [52]. As can be seen from Figure 2, the average value is located near the center of the distribution and the trend is mainly symmetrical about it. Furthermore, most of the observations are located around the mean value: 75% of the speed values are in the range (μ − σ, μ + σ), where σ is the standard deviation, and 95% of the speed values are in the range (μ − 2σ, μ + 2σ).
These trends confirm that the experimental data can be represented by a statistical model, whose parameters can be estimated and are shown, as an example, in Figure 2. Indeed, the trend is in agreement with the classical statistical modeling of traffic flows; the data sample does not present anomalies due to local or special phenomena. Given the reliability and representativeness of this sample, it can be used for more advanced studies, such as the one addressed here.
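The check on the shares of observations within one and two standard deviations of the mean can be sketched as follows. The speed values are hypothetical and the use of the population standard deviation is an assumption. For reference, an ideal Gaussian distribution has about 68% of its values within μ ± σ and about 95% within μ ± 2σ.

```python
import statistics


def share_within(values, k):
    """Fraction of values lying within k standard deviations of the mean."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation (assumed)
    inside = sum(1 for v in values if abs(v - mu) <= k * sigma)
    return inside / len(values)


# Hypothetical spot speeds in km/h recorded by one control unit.
speeds_kmh = [48, 55, 60, 62, 64, 65, 66, 68, 70, 75, 82]

print(f"within 1 sigma: {share_within(speeds_kmh, 1):.0%}")
print(f"within 2 sigma: {share_within(speeds_kmh, 2):.0%}")
```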
The HCD sample contains all the data sent from the instrumented vehicles within the Veneto Region; therefore, data filtering operations were performed before proceeding with the reliability analysis. The study started with the geometric reconstruction of the road alignments, using an automated, economical and rapid method capable of identifying the elements of existing road layouts through the georeferenced vertices of the road graph [53]. Once the horizontal alignment of the road layouts was known, a map-matching procedure was performed, by which the geographic coordinates of the vehicles were matched to the graph of each road [54,55,56,57,58,59,60].
The map matching was carried out using a programming code, with which the raw HCD near the examined road are preliminarily identified and subsequently projected onto the curvilinear abscissa. However, since the GPS signal on board the vehicles could be affected by intrinsic or accidental errors, the position accuracy could be low; the area of investigation was expanded to avoid eliminating from the analysis those vehicles that had a slightly inaccurate location compared to their actual position. At the end of the map-matching procedure, the point speed data extracted from the HCD were divided into the two main travel directions.
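The projection step of a map-matching procedure can be illustrated with a minimal sketch. It is not the authors' code: it assumes that the GPS fixes and the road axis have already been converted to planar coordinates in meters, that the road is a simple polyline, and that a hypothetical search buffer of 15 m is used to keep slightly inaccurate fixes.

```python
import math


def project_to_polyline(point, polyline):
    """Project a planar point onto a polyline.

    Returns (abscissa, offset): the curvilinear abscissa of the closest
    point on the polyline and the distance of the fix from it, in meters.
    """
    px, py = point
    best_offset, best_abscissa = math.inf, 0.0
    walked = 0.0  # length of the segments already traversed
    for (ax, ay), (bx, by) in zip(polyline, polyline[1:]):
        dx, dy = bx - ax, by - ay
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip duplicated vertices
        # Position of the orthogonal projection along the segment, clamped
        # to the segment ends.
        t = ((px - ax) * dx + (py - ay) * dy) / (seg_len ** 2)
        t = max(0.0, min(1.0, t))
        qx, qy = ax + t * dx, ay + t * dy
        offset = math.hypot(px - qx, py - qy)
        if offset < best_offset:
            best_offset, best_abscissa = offset, walked + t * seg_len
        walked += seg_len
    return best_abscissa, best_offset


# Hypothetical road axis and GPS fixes, in meters.
road_axis = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0)]
fixes = [(40.0, 3.0), (104.0, 20.0), (60.0, 80.0)]
BUFFER_M = 15.0  # assumed search buffer around the road axis

matched = []
for fix in fixes:
    abscissa, offset = project_to_polyline(fix, road_axis)
    if offset <= BUFFER_M:
        matched.append(abscissa)
print(matched)
```

A wider buffer keeps more fixes with poor positional accuracy, at the cost of a higher risk of attributing fixes from a nearby parallel road to the examined one.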
The two analyzed data samples differed in the domain in which they were defined. The point-based sensors provided a very detailed traffic survey in the temporal domain, at the fixed sections where they were installed. Instead, the HCD allowed the observation of a tiny sample of the vehicle fleet along the entire road network, providing data in both the temporal and spatial domains. For this reason, to carry out a direct comparison between the two data samples, the proposed analysis method involved a preliminary restriction of the spatial domain of the HCD, taking into consideration only the data located within 10 m ahead of and 10 m behind the control unit. The reason for this spatial interval is to allow for the possibility that the vehicle recorded by the control unit may have emitted the signal not exactly at the measurement section, but a few meters before or after it. The HCD sample consists of spatio-temporal information with a sampling frequency of 1 Hz. This high sampling rate is sufficient to consider only the HCD located within 10 m ahead of and 10 m behind the control unit. If the frequency were lower, the spatial interval would have to be much larger, with the risk of a less accurate analysis because speeds change along the explored road stretch.
After the first filtering operation on the HCD sample, it was possible to evaluate a representative rate of the GPS data around each analyzed control unit. The evaluation was performed as the ratio of the HCD located near the control unit to the vehicles recorded by the control unit itself. Table 4 lists the values of the ratio between the probe vehicles and the entire vehicle flow passing through the sections where the point-based sensors are located, which are about 1–2‰, with an average value of 1.4‰.
A study of the statistical parameters of the GPS data set has been performed through a preliminary histogram representation of the HCD speeds, as shown in Figure 3. The speed data again suggest a shape similar to that of a Gaussian distribution, even if the two diagrams show some irregularities due to the low amount of data in some speed classes. However, it should be noted that the statistical parameters of the distributions of both sensors and HCD are comparable. Therefore, it can be stated that a small sample of data (the HCD) can represent the behavior of the entire traffic flow.
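The link between sampling frequency and the width of the spatial window can be made concrete with a short sketch. The function names, the record layout and the example speed are assumptions; only the 1 Hz frequency and the 10 m half-width come from the text.

```python
def fix_spacing_m(speed_kmh, sampling_hz=1.0):
    """Distance travelled between two consecutive GPS fixes, in meters."""
    return (speed_kmh / 3.6) / sampling_hz


def near_station(records, station_abscissa_m, half_width_m=10.0):
    """Keep the records whose curvilinear abscissa lies within the window
    centred on the control unit."""
    return [
        r for r in records
        if abs(r["abscissa_m"] - station_abscissa_m) <= half_width_m
    ]


# At 72 km/h a vehicle covers about 20 m per second, so with 1 Hz sampling
# the 20 m window contains roughly one fix per passage.
print(fix_spacing_m(72.0))
# With a hypothetical 0.2 Hz sampling, fixes would be about 100 m apart.
print(fix_spacing_m(72.0, sampling_hz=0.2))

# Hypothetical map-matched records around a control unit at abscissa 1000 m.
records = [
    {"abscissa_m": 995.0, "speed_kmh": 68.0},
    {"abscissa_m": 1012.0, "speed_kmh": 70.0},
]
selected = near_station(records, station_abscissa_m=1000.0)
print(len(selected))
```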
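The representative rate is a simple ratio expressed per thousand. In the sketch below the counts are hypothetical and were chosen only so that the result matches the 1.4‰ average reported in the text.

```python
def representative_rate_permille(hcd_count, sensor_count):
    """HCD observations near a control unit per thousand vehicles recorded
    by the control unit itself."""
    if sensor_count <= 0:
        raise ValueError("sensor_count must be positive")
    return 1000.0 * hcd_count / sensor_count


# Hypothetical counts for one measurement section.
rate = representative_rate_permille(hcd_count=140, sensor_count=100_000)
print(f"{rate:.1f} per mille")
```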
Finally, in order to complete the overall evaluation of the reliability of the HCD and their correspondence with the data recorded by the control units, it has been necessary to extend the analysis of the two samples to the temporal domain as well. The second filtering operation of the HCD sample identified the data from the probe vehicles in the same time frame defined by the traffic measurement stations. A methodology was therefore defined that made it possible to directly compare the speed data extracted from the two different sources, evaluating their coherence and correspondence through a linear regression. The reliability of the obtained result can thus be interpreted by means of the coefficient of determination R², which measures the strength of the linear relationship between the two variables compared and takes a value between 0 and 1.
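A minimal sketch of the regression step is given below, assuming that the HCD and sensor speeds have already been paired within the same time frame. The pairing itself, the function name and the speed values are hypothetical; the fit is an ordinary least-squares line y = m·x + q with its coefficient of determination.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = m*x + q.

    Returns (m, q, r2), where r2 is the coefficient of determination.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must not be constant")
    m = sxy / sxx
    q = mean_y - m * mean_x
    r2 = (sxy * sxy) / (sxx * syy)
    return m, q, r2


# Hypothetical paired speeds in km/h: HCD on x, control unit on y.
hcd_speeds = [50.0, 60.0, 70.0, 80.0]
sensor_speeds = [52.0, 61.0, 72.0, 83.0]

m, q, r2 = fit_line(hcd_speeds, sensor_speeds)
print(f"m = {m:.2f}, q = {q:.2f}, R2 = {r2:.3f}")
```

A slope m close to 1 with an intercept q close to 0 means the fitted line lies near the bisector, while R² close to 1 means the points are tightly aligned along it.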

3. Results

The reliability evaluation of the HCD has been performed by direct comparison with the traffic data recorded by a set of point-based sensors placed in the investigated road network. Due to the big difference between the two datasets, as the HCD sample provides space-time information while the control units only record data in the time domain, it has been necessary to properly filter the HCD.
Firstly, the HCD close to the measurement stations have been identified and selected; then, as shown in Figure 4, the speed data extracted from the two different sources have been overlaid in the same graph as a function of time. In Figure 4, the speed-time graph (on the left side) is shown alongside the Gaussian distributions of the two data samples (on the right side).
From the graphs in Figure 4, the following observations can be deduced:
  • Observing the Gaussian distributions, an overlap of the two trends can be noticed, and it demonstrates how the two samples can be considered statistically coincident, as their main parameters are almost equal;
  • The linear regression lines for the mean and the standard deviation of the two data samples in the speed-time diagrams approximate the mean and standard deviation of the Gaussian distributions displayed in the diagrams on the right;
  • The point cloud of the control units’ data completely encloses the HCD values. As has been said before, the probe vehicles account for a small percentage of the vehicle fleet, which is, instead, completely detected by the fixed sensors;
  • Fluctuations in the graphs, especially their peaks and troughs, are similar between the HCD and the point-based-sensor point cloud; therefore, a matching between the qualitative trends of the data samples can be observed;
  • The linear regressions on the data show almost constant values as the lines are slightly inclined, with angular coefficients close to zero and intercept equal to the average speed;
  • There is a minimal difference between the average speed values, evaluated, respectively, from the two data samples, which generally assumes a value of around 3 km/h. These minor differences are probably caused by systematic errors, linked to sensor calibration defects or to errors caused by the relative angle between the signals emitted and received by the radar sensors and the vehicles’ driving direction. The point-based sensors are in fact located on the sides of the carriageway and are not in line with the lanes;
  • The two linear regression lines have been shifted vertically up and down by a value equal to the corresponding standard deviation. In this way, it can be observed that the data dispersion is almost coincident between the two samples and that the only difference is due to the systematic error;
  • Most of the HCD fall into the range defined by the ± σ parallels to the linear regression line, thus demonstrating the strong reliability of the sample, which is located around the average speed values.
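The observation about the ± σ band can be checked numerically with a sketch like the following. The regression parameters m and q and the standard deviation σ are taken as given inputs, in the spirit of the values reported per section in Appendix A; the numbers used here are hypothetical.

```python
def share_in_band(times, speeds, m, q, sigma):
    """Fraction of points whose speed lies within +/- sigma of the
    regression line speed = m*t + q."""
    inside = sum(
        1 for t, v in zip(times, speeds)
        if abs(v - (m * t + q)) <= sigma
    )
    return inside / len(speeds)


# Hypothetical HCD observations: time in hours, speed in km/h.
hcd_times = [0.0, 1.0, 2.0, 3.0]
hcd_speeds = [60.0, 75.0, 66.0, 50.0]

# Hypothetical regression parameters: an almost flat line with intercept
# equal to the average speed, and the standard deviation of the speeds.
share = share_in_band(hcd_times, hcd_speeds, m=0.0, q=65.0, sigma=8.0)
print(f"{share:.0%} of the HCD fall inside the band")
```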
The two tables in Appendix A (Table A1 and Table A2) show the results from all the measurement sections that were analyzed. The results indicate the values of the angular coefficient m, the intercept q of the regression lines, the average speed μ referred to each monitored month and the standard deviation σ of the speeds for both the data recorded by the control units and for the HCD.
A second data filtering has been performed on the HCD to identify the information emitted by the probe vehicles in a time interval coincident with the measurements made by the point-based sensors. The results have been plotted and are shown in Figure 5: the graphs show the HCD speeds on the x-axis and the speeds recorded by the control units on the y-axis.
As shown in Figure 5, the reliability assessment was performed considering the linear regression between the speed data from the two different data sources, and the following observations can be made:
  • The very good correspondence between the data is proved by the points’ arrangement and concentration along the diagram bisector: speed data acquired by the HCD has been also recorded by the control units, with minimal deviations;
  • A regression line facilitates the graph readability: it immediately shows how much the dispersion of the recorded data approaches or deviates from the bisector, by means of its angular coefficient value;
  • Therefore, the data correspondence is also readable through the coefficient of determination R², which is almost always close to 1;
  • Minor deviations come from outlier points within the sample, generally related to very low speed values of the HCD. Probably these anomalies are related to vehicles, close to the control unit, performing maneuvers outside the carriageway. This is an intrinsic limit due to the characteristics of the GPS systems;
  • The problem observed in the previous point cannot be completely ignored, but looking at the central zone of the graph, most of the points are always found near the bisector, in the range that corresponds to the most plausible speeds along the examined roads.
The two diagrams shown in Figure 6 summarize the results obtained as cumulative curves, from which it is possible to read the percentage of cases with a specific value of R². The percentage distribution of the results shows that almost 80% of the analyzed comparisons return an R² greater than 0.8, demonstrating the reliability of the HCD sample and its suitability for more advanced monitoring studies.
The two tables in Appendix A (Table A3 and Table A4), one for each driving direction, summarize the angular coefficient m, the intercept q of the regression lines, and the coefficient of determination values obtained for all the measurement sections.
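The reading of a cumulative curve at a given threshold amounts to counting the comparisons whose R² exceeds that threshold. A minimal sketch follows, with hypothetical R² values that are not taken from the paper's tables.

```python
def share_above(r2_values, threshold):
    """Fraction of comparisons with R2 strictly greater than the threshold."""
    if not r2_values:
        raise ValueError("empty list of R2 values")
    return sum(1 for r2 in r2_values if r2 > threshold) / len(r2_values)


# Hypothetical R2 values, one per measurement section and direction.
r2_values = [0.95, 0.91, 0.85, 0.70, 0.82]

for threshold in (0.7, 0.8, 0.9):
    print(f"R2 > {threshold}: {share_above(r2_values, threshold):.0%}")
```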

4. Discussion

The proposed original method for evaluating the reliability of a sample of HCD obtains good results by comparing speed values from probe vehicles and measurement stations. However, as can be seen in Table A3 and Table A4, not all the results show a good correspondence between the two samples of data, since the angular coefficient m and the coefficient of determination R² are in some cases far from 1. Nevertheless, these results are acceptable considering the hypotheses and the aims of this study. First, it should be noted that the research carried out does not attempt to establish an instantaneous comparison in time and space between the GPS data and the traffic measurements performed by the control units, but a correspondence of data both in statistical terms and in terms of actual vehicular speeds. Therefore, thanks to the study, a reliable and representative data source of the driving population can be identified in the small sample of HCD, even if a very high match is not achieved in all cases.
The high sampling rate of GPS data has allowed us to carry out an accurate analysis through the different filtering phases. The most important filtering operation was to consider in the analysis only the HCD emitted within 10 m ahead of and 10 m behind the control unit. When HCD were recorded in mountainous environments, technical problems related to both poor satellite coverage and the quality of the GPS signal were found. The probe vehicles, in fact, send little or incorrect information in that territory. Another reason for anomalous information can be the location of some measurement sections along the infrastructure, for example near tunnels. The presence of a tunnel causes a GPS signal loss, resulting in low data quality; consequently, this kind of data might not be representative and has been excluded from the analysis. As can be observed in Figure 5, the presence of outliers affects the final outcome of the analysis, resulting in R² values not close to 1. By examining the proposed calculation method, it can be said that outliers may have been generated by different causes. It is likely that some vehicles had a lower-than-average emission frequency, and so did not register in the defined spatial and temporal interval around the control unit. As a result, the data recorded by the control unit did not find the corresponding vehicle in the filtered sample of the HCD. In relation to the frequency, it may happen that a given GPS signal is emitted inside the analyzed interval, but towards the beginning of it: in this case the driver could change the driving speed in the meantime, so that the control unit may have detected the same vehicle but at a different speed. Moreover, an inspection of geographical maps shows that, where another road runs parallel and close to the examined road section, the geolocated data may not belong to the studied road, due to errors related to the GPS detection system.
In general, although less valid results in terms of the coefficient of determination have sometimes been found, in most cases the reliability of the HCD sample should be considered acceptable. In Figure 5, ignoring the outliers, there is a notable concentration of the speed data along the bisector, especially in the speed range between 50 and 80 km/h, which is indeed the range of operating speeds on the examined roads. Therefore, the point cloud thus distributed shows that the data obtained from probe vehicles are actually reliable and can be considered representative of the entire vehicle fleet, except for some individual cases that relate to the availability or quality of HCD.

5. Conclusions

This paper presents a reliability study of Historical Car Data (HCD), evaluating their correlation in terms of actual vehicular speeds with the data recorded by sensors installed along existing roads. The sample of the analyzed data revealed that it is possible to obtain information from them to describe operational variables of the entire vehicular flow, in terms of speed profile and the typical behavior of user drivers.
The main objective of this study is to understand the reliability of this kind of data, with the aim of validating them, so the data can be used for all analyses that are needed in road safety studies and road network management. Regardless of the different traffic conditions across different time periods of the day, a daily analysis of the data has not been developed because that was not a specific aim of the study. Another limitation of the study is that, considering that the environmental and traffic conditions are different between rural and urban roads, the research wanted to observe the operational parameters of traffic flow only under the unconstrained conditions that are typical of rural roads. Therefore, the results of the study allow general conclusions to be obtained about the reliability of HCD which are valid only for this kind of context.
The achieved results show a high HCD reliability and representativeness of the traffic flow, achieving R² > 0.9 in most cases. The statistical comparison of the GPS data with a typically recognized source, such as that of the control units, supports extending the use of HCD samples to safety and management assessments on road networks, despite the fact that, to date, the instrumented vehicles account for a small percentage of the entire vehicle fleet.
The sample of the control units captures a large proportion of the traffic flow, with the exception of those vehicles which, for example, enter or exit the surveyed road right before the sensor location; therefore, it is a large sample that might almost represent the entire population. In contrast, the HCD sample is very small, and, on average, accounts for 1.4‰ of the traffic surveyed by the sensor; however, in terms of their statistical distributions and related variables, the two samples have a lot of similarities. As a result, both the control units and HCD can be used to represent key features of the whole population, like traffic flow characteristics and operating conditions. In particular, the control units provide complete information about the traffic stream and its sensitivity to certain variables, such as environmental conditions, differences between hours of night and day or the presence of heavy vehicles in the traffic flow. The HCD, instead, are useful to make data available about both the single driver’s behavior and the vehicular path and speed along the entire route. In other words, the control units are especially useful for data about traffic patterns, while GPS data are suitable for monitoring vehicular motion along the entire infrastructure. This conclusion agrees with the literature, where HCD have been proposed to study drivers’ travel route prediction, the proper perception by road users of the traffic signals, traffic conditions’ estimation and consistency evaluation between drivers’ behavior and theoretical design speeds in the various road elements.
More generally, the results of this research support the thesis that Big Data, and specifically HCD, can serve as an essential data source for analyzing the operating conditions of road traffic. Furthermore, the large amount of information associated with each type of data makes it possible to extend the survey and identify many factors that influence traffic conditions, although such analyses require specific methodologies and procedures.
In particular, Historical Car Data allow continuous operating speed profiles to be extracted along infrastructures and networks, thanks to the geolocation of the data along the entire road layout and the high sampling rate. This data source thus overcomes the main limitation of traditional traffic surveys, which provide information only for the sections where the sensors are installed.
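One possible way to build such a profile is sketched below: HCD points are grouped into fixed-length road segments and a mean and an upper percentile of speed are computed for each segment. The sketch assumes the points have already been map-matched to a distance along the road (chainage); the 100 m segment length, the use of the 85th percentile as the operating speed, and the function name `speed_profile` are illustrative assumptions rather than the authors' settings.

```python
import numpy as np

def speed_profile(chainage_m, speed_kmh, bin_length_m=100.0, percentile=85.0):
    """Return [(segment_start_m, mean_speed, percentile_speed, n_points), ...]."""
    s = np.asarray(chainage_m, dtype=float)
    v = np.asarray(speed_kmh, dtype=float)
    if s.shape != v.shape:
        raise ValueError("chainage and speed must have the same length")
    segment = np.floor(s / bin_length_m).astype(int)   # segment index per point
    profile = []
    for idx in np.unique(segment):                     # sorted segment indices
        in_segment = v[segment == idx]
        profile.append((
            float(idx * bin_length_m),
            float(in_segment.mean()),
            float(np.percentile(in_segment, percentile)),
            int(in_segment.size),
        ))
    return profile

# Hypothetical map-matched points: distance along the road (m) and speed (km/h).
profile = speed_profile([10, 50, 120, 180], [60, 70, 80, 90])
for start, mean_v, v85, n in profile:
    print(f"{start:6.0f} m  mean={mean_v:.1f}  V85={v85:.1f}  n={n}")
```

Segments with few points give unstable percentiles, so in practice a minimum sample size per segment would probably be needed before the profile is interpreted.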
In light of the statistical reliability and trustworthiness of the HCD sample, the authors intend to extend the research by implementing innovative mobility analysis processes that exploit the quantitative and qualitative advantages offered by the HCD. In this way, detailed and continuous monitoring of the actual operating conditions of road traffic can be performed.

Author Contributions

Conceptualization, G.C. and G.D.S.; methodology, G.D.S.; software, P.P.; validation, G.C., G.D.S. and P.P.; formal analysis, G.C.; investigation, P.P.; resources, G.C.; data curation, P.P.; writing—original draft preparation, P.P.; writing—review and editing, G.C. and G.D.S.; visualization, P.P.; supervision, G.C.; project administration, G.D.S.; funding acquisition, G.C. All authors have read and agreed to the published version of the manuscript.


Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to confidentiality issues and respect for privacy.


Acknowledgments

We thank the national public company ANAS for making its Traffic Observatory information available and for the technical support.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

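Table A1. HCD and sensors angular coefficients m, and intercepts q of the regression lines, average speeds μ, and the standard deviations σ referred to each monitored month in Dir AB.

| Sensor ID | Road Name | Month | HCD m | HCD q | HCD μ | HCD σ | Sensor m | Sensor q | Sensor μ | Sensor σ |
|---|---|---|---|---|---|---|---|---|---|---|
| 197 | SS12 | August 2018 | 0.21 | 59.29 | 62.11 | 13.75 | −0.09 | 64.81 | 63.37 | 11.3 |
|  |  | February 2019 | −0.15 | 75.15 | 73.08 | 13.8 | 0.03 | 65.81 | 66.32 | 12.11 |
|  |  | May 2019 | −0.04 | 72.7 | 72.07 | 12.26 | 0.01 | 65.69 | 65.88 | 12.42 |
| 208 | SS13 | August 2018 | 0.02 | 67.53 | 67.8 | 10.22 | −0.04 | 70.36 | 69.77 | 10.69 |
|  |  | February 2019 | 0 | 65.8 | 65.77 | 9.53 | 0.02 | 67.96 | 68.29 | 9.99 |
|  |  | May 2019 | −0.01 | 66.02 | 65.86 | 9.52 | 0.03 | 67.52 | 68.03 | 11.01 |
| 209 | SS13 | August 2018 | 0.21 | 57.08 | 60.03 | 10.82 | −0.03 | 61.92 | 61.45 | 12.17 |
|  |  | February 2019 | 0.02 | 56.48 | 56.73 | 10.92 | 0.02 | 58.27 | 58.57 | 11.52 |
|  |  | May 2019 | −0.12 | 61.92 | 60.25 | 11.02 | 0 | 58.48 | 58.53 | 11.63 |
| 920,074 | SS13 | August 2018 | 0.23 | 65.21 | 68.47 | 12.32 | −0.03 | 73.09 | 72.63 | 12.46 |
|  |  | February 2019 | 0.03 | 66.78 | 67.21 | 11 | 0.03 | 68.97 | 69.39 | 11.21 |
|  |  | May 2019 | 0.17 | 64.72 | 67.18 | 12.58 | 0 | 70.03 | 70.01 | 11.78 |
| 218 | SS14 | August 2018 | −0.14 | 82.31 | 80.35 | 11.1 | 0.02 | 77.95 | 78.32 | 13.56 |
|  |  | February 2019 | 0.07 | 76.77 | 77.77 | 12.32 | −0.03 | 78.98 | 78.48 | 13.75 |
|  |  | May 2019 | 0.03 | 77.77 | 78.12 | 10.67 | −0.03 | 77.39 | 76.93 | 14.56 |
| 219 | SS14 | August 2018 | −0.11 | 74.11 | 72.52 | 11.21 | 0.07 | 74.1 | 75.06 | 12.47 |
|  |  | February 2019 | −0.22 | 72.13 | 68.75 | 14.55 | −0.03 | 73.64 | 73.14 | 13.24 |
|  |  | May 2019 | 0.07 | 71.28 | 72.31 | 11.67 | −0.08 | 74.95 | 73.64 | 12.16 |
| 3191 | SS14 | August 2018 | −0.13 | 74.72 | 72.73 | 10.96 | 0.06 | 73.65 | 74.58 | 10.72 |
|  |  | February 2019 | −0.04 | 78.89 | 78.26 | 10.71 | 0.01 | 80.61 | 80.71 | 12.23 |
|  |  | May 2019 | −0.12 | 79.16 | 77.32 | 11.11 | 0 | 77.98 | 77.99 | 11.6 |
| 481 | SS50 | August 2018 | 0.2 | 68.25 | 71.24 | 11.52 | −0.02 | 64.85 | 64.61 | 10.59 |
|  |  | February 2019 | −0.15 | 74.1 | 72.13 | 11.22 | 0.22 | 59.2 | 63.23 | 10.73 |
|  |  | May 2019 | 0.26 | 68.56 | 72.43 | 11.96 | −0.08 | 65.54 | 64.34 | 10.32 |
| 482 | SS50 | August 2018 | 0.22 | 77.26 | 80.53 | 11.34 | −0.39 | 76.22 | 71.16 | 19.38 |
|  |  | February 2019 | −0.21 | 87.01 | 84.08 | 15.44 | 0.14 | 77.68 | 77.8 | 19.61 |
|  |  | May 2019 | 0.12 | 79.62 | 81.07 | 10.05 | −1.07 | 71.39 | 55.14 | 29.78 |
| 2404 | SS50 | May 2019 | −0.12 | 71.26 | 69.21 | 10.13 | −0.12 | 83.44 | 81.93 | 13.38 |
| 487 | SS51 | August 2018 | −0.46 | 64.88 | 60.6 | 12.4 | −0.06 | 61.14 | 60.28 | 13.62 |
|  |  | February 2019 | −0.15 | 67.59 | 66.23 | 12.7 | 0.05 | 65.27 | 66.01 | 12.54 |
|  |  | May 2019 | −0.48 | 78.11 | 71.4 | 14.59 | −0.03 | 65.78 | 65.25 | 13.35 |
| 489 | SS51 | August 2018 | 0.01 | 65.45 | 65.63 | 8.15 | −0.03 | 67.49 | 67.09 | 12.53 |
|  |  | February 2019 | 0.11 | 63.31 | 64.82 | 10.65 | 0.07 | 67.3 | 68.37 | 12.55 |
|  |  | May 2019 | 0.13 | 61.94 | 63.86 | 10.52 | −0.04 | 68.65 | 68.08 | 12.53 |
| 490 | SS51 | August 2018 | 0.25 | 51.39 | 54.8 | 16.66 | 0.15 | 56.58 | 58.84 | 14.89 |
|  |  | February 2019 | −0.09 | 65.89 | 64.59 | 10.29 | 0.02 | 65.69 | 66.02 | 10.15 |
|  |  | May 2019 | −0.04 | 66.92 | 66.29 | 10.92 | −0.07 | 68.38 | 67.2 | 10.15 |
| 491 | SS51 | August 2018 | 0.09 | 43.35 | 44.66 | 5.2 | −0.03 | 41.93 | 41.46 | 5.67 |
|  |  | February 2019 | 0.06 | 45.26 | 46.19 | 6.8 | 0.16 | 40.4 | 42.8 | 6.49 |
|  |  | May 2019 | −0.09 | 48.34 | 47.3 | 5.86 | −0.04 | 44.84 | 44.2 | 6.54 |
| 492 | SS51 | August 2018 | 0.13 | 66.24 | 68.23 | 10.92 | −0.01 | 61.23 | 61.07 | 11.93 |
|  |  | February 2019 | −0.11 | 71.66 | 70.16 | 10.55 | 0.46 | 59.81 | 62.93 | 10.02 |
| 10,040 | SS51 | August 2018 | −1.76 | 72.99 | 50 | 30.13 | 0.02 | 84.03 | 84.28 | 13.8 |
|  |  | February 2019 | 3.54 | 20.84 | 61.2 | 30.88 | 0.42 | 81.94 | 87.67 | 18.58 |
|  |  | May 2019 | 0 | 79 | 79 | 0 | −0.02 | 93.22 | 93.04 | 15.08 |
| 920,075 | SS51 | August 2018 | 0.14 | 49.5 | 51.58 | 6.04 | 0.01 | 54.36 | 54.54 | 9.36 |
| 494 | SS51-bis | August 2018 | 0.07 | 54.65 | 55.55 | 7.08 | 0.01 | 50.82 | 50.91 | 7.31 |
|  |  | February 2019 | 0.07 | 54.94 | 56.02 | 8.17 | 0.34 | 46.93 | 53.19 | 8.14 |
|  |  | May 2019 | −0.17 | 58.87 | 56.16 | 6.36 | −0.01 | 54.01 | 53.85 | 8.08 |
| 498 | SS52 | August 2018 | 6.42 | 17.96 | 51 | 23.45 | −0.08 | 72.59 | 71.44 | 15.58 |
|  |  | February 2019 | \ | \ | \ | \ | \ | \ | \ | \ |
|  |  | May 2019 | \ | \ | \ | \ | \ | \ | \ | \ |
| 499 | SS52 | August 2018 | −0.04 | 51.57 | 51.06 | 5.44 | 0.04 | 52.43 | 52.99 | 9.15 |
|  |  | February 2019 | 0.14 | 50.49 | 52.32 | 8.14 | 0.66 | 43.03 | 54.49 | 11.04 |
|  |  | May 2019 | 0 | 49 | 49 | 0 | 0.01 | 57.01 | 57.17 | 11.27 |
| 3193 | SS52 | August 2018 | 0 | 59.36 | 59.4 | 8.62 | −0.01 | 55.2 | 55.01 | 9.49 |
|  |  | February 2019 | −0.07 | 65.9 | 65 | 9.73 | 0.27 | 57.41 | 61.36 | 10.58 |
|  |  | May 2019 | −0.21 | 71.76 | 68.97 | 9.29 | −0.08 | 65.07 | 63.81 | 11.27 |
| 920,076 | SS52 | August 2018 | 0 | 37 | 37 | 3 | 0.07 | 38.62 | 39.72 | 10.05 |
|  |  | February 2019 | 0.5 | 42 | 43.5 | 2.12 | 0.13 | 40.78 | 43.29 | 9.69 |
|  |  | May 2019 | 86,400.03 | 35 | 36 | 1.41 | 0.11 | 42.34 | 44.19 | 10.92 |
| 503 | SS53 | August 2018 | 0.02 | 63.57 | 63.83 | 14.03 | 0.19 | 65.1 | 68.09 | 11.9 |
|  |  | February 2019 | 0.05 | 65.22 | 65.97 | 8.18 | 0.01 | 69.22 | 69.35 | 9.05 |
|  |  | May 2019 | −0.07 | 65.74 | 64.73 | 8.79 | −0.05 | 68.54 | 67.7 | 9.62 |
| 1332 | SS309 | August 2018 | 0.36 | 64.9 | 70.7 | 12.68 | −0.02 | 73.77 | 73.45 | 12.39 |
| 1333 | SS309 | August 2018 | 0.01 | 62.22 | 62.31 | 12.27 | −0.11 | 66.75 | 65.01 | 12.37 |
| 1563 | SS434 | August 2018 | −0.19 | 94.47 | 91.47 | 16.48 | −0.05 | 96.73 | 95.98 | 14.6 |
| 1703 | SS516 | August 2018 | 0.02 | 69.08 | 69.36 | 10.11 | −0.1 | 74.45 | 72.89 | 15.94 |
|  |  | February 2019 | −0.15 | 74.76 | 72.57 | 11.08 | −0.05 | 77.02 | 76.69 | 14.44 |
|  |  | May 2019 | −0.05 | 72.52 | 71.93 | 11.66 | 0.08 | 74.54 | 75.78 | 13.88 |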
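For readers who want to reproduce the four quantities listed in Table A1 for one sensor and one month, a minimal sketch is given below. It assumes that m and q come from a least-squares fit of speed against time, that time is expressed in days, and that σ is the sample standard deviation; the time unit, the choice of `ddof=1` and the function name `monthly_summary` are assumptions, since this section does not state them.

```python
import numpy as np

def monthly_summary(time_days, speed_kmh):
    """Return slope m, intercept q, mean mu and standard deviation sigma of one sample."""
    t = np.asarray(time_days, dtype=float)
    v = np.asarray(speed_kmh, dtype=float)
    if t.shape != v.shape or t.size < 2:
        raise ValueError("need at least two paired observations")
    if t.max() == t.min():
        raise ValueError("all observations share one timestamp; slope is undefined")
    m, q = np.polyfit(t, v, 1)                 # speed = m * time + q
    return {
        "m": float(m),
        "q": float(q),
        "mu": float(v.mean()),
        "sigma": float(v.std(ddof=1)),         # sample standard deviation (assumed)
    }

# Hypothetical observations: day of the month and speed in km/h.
summary = monthly_summary([1, 2, 3, 4], [60, 62, 64, 66])
print(summary)
```

When a month contains only a few observations recorded almost at the same time, the fitted slope can become very large, so very small samples are best read with caution.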
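Table A2. HCD and sensors angular coefficients m, and intercepts q of the regression lines, average speeds μ, and the standard deviations σ referred to each monitored month in Dir BA.

| Sensor ID | Road Name | Month | HCD m | HCD q | HCD μ | HCD σ | Sensor m | Sensor q | Sensor μ | Sensor σ |
|---|---|---|---|---|---|---|---|---|---|---|
| 197 | SS12 | August 2018 | −0.45 | 76.14 | 69.15 | 12.08 | 0.11 | 64.29 | 65.88 | 13.59 |
|  |  | February 2019 | −0.42 | 81.27 | 75.55 | 15.48 | 0.06 | 70.81 | 71.54 | 14.44 |
|  |  | May 2019 | −0.72 | 83.66 | 73.14 | 14.46 | 0.01 | 69.97 | 70.06 | 14.19 |
| 208 | SS13 | August 2018 | −0.1 | 69.09 | 67.5 | 9.92 | −0.02 | 64.75 | 64.37 | 9.92 |
|  |  | February 2019 | 0.05 | 65.82 | 66.54 | 8.72 | 0.02 | 62.42 | 62.76 | 9.29 |
|  |  | May 2019 | 0 | 67.44 | 67.48 | 9.71 | 0.01 | 62.66 | 62.81 | 9.39 |
| 209 | SS13 | August 2018 | 0.14 | 56.83 | 59.11 | 10.52 | −0.07 | 64.5 | 63.4 | 12.46 |
|  |  | February 2019 | 0.13 | 55.77 | 57.58 | 9.82 | 0.02 | 60.01 | 60.26 | 11.69 |
|  |  | May 2019 | 0.21 | 54.55 | 57.57 | 10.74 | 0.03 | 60.9 | 61.3 | 11.27 |
| 920,074 | SS13 | August 2018 | 0.04 | 66.74 | 67.44 | 7.69 | −0.02 | 69.29 | 69 | 12.03 |
|  |  | February 2019 | 0.15 | 65.22 | 67.45 | 9.91 | −0.01 | 66.57 | 66.48 | 11.37 |
|  |  | May 2019 | 0.13 | 65.63 | 67.52 | 9.96 | 0 | 67.06 | 67.02 | 11.9 |
| 218 | SS14 | August 2018 | −0.14 | 83.57 | 81.39 | 12.9 | 0.02 | 79.51 | 79.78 | 13.53 |
|  |  | February 2019 | −0.43 | 85.02 | 79.49 | 13.04 | 0 | 79.16 | 79.09 | 14.34 |
|  |  | May 2019 | −0.17 | 79.21 | 76.77 | 12.26 | 0.04 | 77.87 | 78.45 | 13.8 |
| 219 | SS14 | August 2018 | −0.14 | 74.08 | 71.9 | 10.59 | −0.02 | 62.52 | 62.2 | 20.04 |
|  |  | February 2019 | 0.04 | 70.23 | 70.78 | 9.39 | 0.14 | 56.08 | 58.42 | 26.71 |
|  |  | May 2019 | −0.16 | 72.88 | 70.61 | 10.98 | −0.13 | 67.82 | 65.72 | 21.32 |
| 3191 | SS14 | August 2018 | 0 | 70.67 | 70.62 | 9.94 | −0.01 | 70.3 | 70.2 | 12.19 |
|  |  | February 2019 | −0.01 | 76.89 | 76.74 | 10.53 | 0 | 75.66 | 75.7 | 13.23 |
|  |  | May 2019 | 0.02 | 74.73 | 75 | 9.97 | 0.03 | 73.77 | 74.2 | 12.43 |
| 481 | SS50 | August 2018 | 0.03 | 71.2 | 71.74 | 11.73 | 0 | 77.29 | 77.3 | 13.07 |
|  |  | February 2019 | 0 | 70.85 | 70.82 | 12.42 | 0.22 | 73.28 | 77.21 | 13.12 |
|  |  | May 2019 | 0.01 | 71.83 | 71.98 | 11.63 | −0.05 | 78.43 | 77.7 | 12.87 |
| 482 | SS50 | August 2018 | 0.15 | 73.37 | 75.67 | 11.85 | 0.05 | 75.79 | 76.56 | 12.03 |
|  |  | February 2019 | 0 | 76.5 | 76.54 | 13.69 | 0.2 | 74.85 | 78.43 | 13.44 |
|  |  | May 2019 | −0.05 | 79.63 | 78.95 | 16.12 | −0.03 | 78.7 | 78.26 | 12.69 |
| 2404 | SS50 | May 2019 | 0.58 | 67.27 | 75.6 | 18.87 | −0.11 | 83.08 | 81.79 | 13.69 |
| 487 | SS51 | August 2018 | 0.48 | 51.91 | 57.85 | 11.21 | 0.05 | 60.49 | 61.29 | 14.39 |
|  |  | February 2019 | −0.28 | 71.91 | 67.58 | 13.23 | 0.03 | 68.09 | 68.55 | 14.59 |
|  |  | May 2019 | −0.04 | 60.44 | 60.03 | 11.88 | −0.11 | 69.87 | 68.21 | 15.79 |
| 489 | SS51 | August 2018 | 0.15 | 62.82 | 65.13 | 8.9 | −0.03 | 66.01 | 65.55 | 11.89 |
|  |  | February 2019 | −0.03 | 69.08 | 68.68 | 13.92 | 0.09 | 63.63 | 64.9 | 11.74 |
|  |  | May 2019 | −0.12 | 69.61 | 67.75 | 12.97 | −0.04 | 65.95 | 65.26 | 11.92 |
| 490 | SS51 | August 2018 | −0.03 | 58.92 | 58.45 | 8.76 | −0.01 | 58.96 | 58.75 | 9.98 |
|  |  | February 2019 | 0.12 | 59.76 | 61.6 | 9.98 | −0.03 | 62.51 | 62.13 | 11.1 |
|  |  | May 2019 | 0.09 | 65.02 | 66.37 | 9.65 | −0.07 | 65.68 | 64.54 | 10.88 |
| 491 | SS51 | August 2018 | −0.03 | 44.59 | 44.08 | 4.43 | 0 | 50.79 | 50.84 | 6.9 |
|  |  | February 2019 | −0.08 | 46.55 | 45.38 | 5.58 | 0.21 | 49.62 | 52.71 | 7.51 |
|  |  | May 2019 | −0.18 | 49.05 | 46.6 | 5.8 | −0.05 | 55.3 | 54.44 | 7.9 |
| 492 | SS51 | August 2018 | −0.05 | 62.69 | 62 | 9.9 | 0 | 72.14 | 72.16 | 12.09 |
|  |  | February 2019 | −0.09 | 67.26 | 66.02 | 9.42 | 0.49 | 70.7 | 73.81 | 11.19 |
| 10,040 | SS51 | August 2018 | 0.52 | 55.08 | 63.59 | 22.03 | −0.07 | 73.13 | 72.13 | 13.43 |
|  |  | February 2019 | 0.65 | 59.92 | 69.72 | 21.84 | 0.06 | 73.26 | 74.16 | 19.28 |
|  |  | May 2019 | −0.17 | 78.04 | 75.9 | 13.15 | −0.2 | 82.02 | 78.98 | 14.42 |
| 920,075 | SS51 | August 2018 | 0.1 | 51.73 | 53.29 | 6.86 | −0.03 | 55.1 | 54.66 | 8.89 |
| 494 | SS51-bis | August 2018 | 0.05 | 57.18 | 57.96 | 7.35 | 0.02 | 60.87 | 61.14 | 9.38 |
|  |  | February 2019 | −0.01 | 60.49 | 60.4 | 6.47 | 0.35 | 58.28 | 64.53 | 10.49 |
|  |  | May 2019 | −0.14 | 64.64 | 62.89 | 6.9 | −0.01 | 65.91 | 65.75 | 10.64 |
| 498 | SS52 | August 2018 | 0.14 | 60.25 | 62.26 | 8.31 | −0.06 | 71.43 | 70.6 | 13.13 |
|  |  | February 2019 | −0.39 | 72.58 | 67.18 | 12.02 | 0.33 | 72.97 | 77.79 | 14.3 |
|  |  | May 2019 | 0.02 | 70.1 | 70.44 | 7.05 | −0.04 | 82.31 | 81.84 | 13.98 |
| 499 | SS52 | August 2018 | 0.11 | 53.28 | 54.88 | 7.81 | 0 | 53.84 | 53.91 | 8.35 |
|  |  | February 2019 | 0.41 | 49.32 | 54.53 | 9.58 | 0.85 | 40.79 | 55.6 | 11.28 |
|  |  | May 2019 | −0.33 | 61.16 | 56.29 | 6.34 | −0.01 | 60.58 | 60.49 | 10.66 |
| 3193 | SS52 | August 2018 | −0.1 | 56.44 | 54.96 | 8.7 | 0.02 | 62.19 | 62.39 | 9.42 |
|  |  | February 2019 | 0.02 | 60.74 | 60.96 | 8.85 | 0.3 | 62.94 | 67.3 | 10.48 |
|  |  | May 2019 | 0 | 64.7 | 64.67 | 10.14 | −0.05 | 69.79 | 68.95 | 11.15 |
| 920,076 | SS52 | August 2018 | 0.03 | 41.98 | 42.36 | 6.07 | 0.02 | 43.09 | 43.38 | 12.28 |
|  |  | February 2019 | 0 | 51 | 51 | 0 | 0.2 | 41.66 | 45.62 | 11.03 |
|  |  | May 2019 | 0.39 | 37.15 | 38.5 | 5.74 | 0.09 | 45.49 | 46.94 | 12.13 |
| 503 | SS53 | August 2018 | −0.05 | 65.06 | 64.24 | 9.93 | 0.22 | 61.48 | 64.89 | 10.49 |
|  |  | February 2019 | −0.03 | 64.95 | 64.51 | 7.31 | 0.01 | 65.52 | 65.69 | 8.27 |
|  |  | May 2019 | −0.02 | 64.99 | 64.65 | 8.17 | −0.03 | 65.07 | 64.67 | 8.66 |
| 1332 | SS309 | August 2018 | 0 | 68 | 67.92 | 10.56 | 0 | 71.84 | 71.83 | 12.67 |
| 1333 | SS309 | August 2018 | 0.25 | 58.07 | 62.08 | 12.88 | −0.14 | 68.29 | 66.14 | 14.67 |
| 1563 | SS434 | August 2018 | 0.27 | 79.77 | 83.91 | 14.5 | −0.05 | 83.05 | 82.23 | 14.25 |
| 1703 | SS516 | August 2018 | 0.04 | 67.67 | 68.21 | 8.54 | −0.18 | 75.78 | 73.08 | 14.13 |
|  |  | February 2019 | −0.16 | 76.76 | 74.41 | 12.11 | −0.14 | 77.69 | 76.64 | 13.86 |
|  |  | May 2019 | 0.19 | 71.6 | 74.27 | 14.31 | 0.02 | 75.02 | 75.27 | 13.44 |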
Table A3. Comparison between HCD and sensors speeds through angular coefficients m, and intercepts q of the regression lines, and coefficient of determination R² in Dir AB.

| Sensor ID | Road Name | m | q | R² |
|---|---|---|---|---|
Table A4. Comparison between HCD and sensors speeds through angular coefficients m, and intercepts q of the regression lines, and coefficient of determination R² in Dir BA.

| Sensor ID | Road Name | m | q | R² |
|---|---|---|---|---|


  1. Esposito, T.; Mauro, R.; Russo, F.; Dell’Acqua, G. Operating speed prediction models for sustainable road safety management. In International Conference on Sustainable Design and Construction (ICSDC) 2011; ASCE: Reston, VA, USA, 2012; pp. 712–721. [Google Scholar] [CrossRef]
  2. Misaghi, P.; Hassan, Y. Modeling operating speed and speed differential on two-lane rural roads. J. Transp. Eng. 2005, 131, 408–418. [Google Scholar] [CrossRef]
  3. De Luca, M.; Lamberti, R.; Dell’Acqua, G. Freeway Free Flow Speed: A Case Study in Italy. Procedia-Soc. Behav. Sci. 2012, 54, 628–636. [Google Scholar] [CrossRef] [Green Version]
  4. Dell’Acqua, G. European Speed Environment Model for Highway Design-Consistency. Mod. Appl. Sci. 2012, 6, 1–10. [Google Scholar] [CrossRef]
  5. Hashim, I.H. Analysis of speed characteristics for rural two-lane roads: A field study from Minoufiya Governorate, Egypt. Ain Shams Eng. J. 2011, 2, 43–52. [Google Scholar] [CrossRef] [Green Version]
  6. Lobo, A.; Rodrigues, C.; Couto, A. Free-Flow Speed Model Based on Portuguese Roadway Design Features for Two-Lane Highways. Transp. Res. Rec. J. Transp. Res. Board 2013, 2348, 12–18. [Google Scholar] [CrossRef] [Green Version]
  7. Ottesen, J.L.; Krammes, R.A. Speed-Profile Model for a Design-Consistency Evaluation Procedure in the United States. Transp. Res. Rec. J. Transp. Res. Board 2000, 1701, 76–85. [Google Scholar] [CrossRef]
  8. Bassani, M.; Cirillo, C.; Molinari, S.; Tremblay, J.-M. Random Effect Models to Predict Operating Speed Distribution on Rural Two-Lane Highways. J. Transp. Eng. 2016, 142, 04016019. [Google Scholar] [CrossRef] [Green Version]
  9. Cantisani, G.; Del Serrone, G.; Di Biagio, G. Calibration and validation of and results from a micro-simulation model to explore drivers’ actual use of acceleration lanes. Simul. Model. Pract. Theory 2018, 89, 82–99. [Google Scholar] [CrossRef]
  10. Cantisani, G. Results of Micro-Simulation Model for Exploring Drivers’ Behavior on Acceleration Lanes. Eur. Transp. Eur. 2020, 77, 1–10. [Google Scholar] [CrossRef]
  11. Eboli, L.; Guido, G.; Mazzulla, G.; Pungillo, G. Experimental Relationships between operating speeds of successive road design elements in two-lane rural highways. Transport 2015, 32, 138–145. [Google Scholar] [CrossRef] [Green Version]
  12. Castro, M. Automated GIS-Based System for Speed Estimation. J. Comput. Civ. Eng. 2008, 22, 325–331. [Google Scholar] [CrossRef]
  13. Astarita, V.; Giofrè, V.P.; Guido, G.; Vitale, A. A review of traffic signal control methods and experiments based on Floating Car Data (FCD). Procedia Comput. Sci. 2020, 175, 745–751. [Google Scholar] [CrossRef]
  14. Gitahi, J.; Hahn, M.; Storz, M.; Bernhard, C.; Feldges, M.; Nordentoft, R. Multi-sensor traffic data fusion for congestion detection and tracking. ISPRS-Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2020, XLIII-B1-2, 173–180. [Google Scholar] [CrossRef]
  15. Ajmar, A.; Arco, E.; Boccardo, P.; Perez, F. Floating car data (FCD) for mobility applications. ISPRS-Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2019, XLII-2/W13, 1517–1523. [Google Scholar] [CrossRef] [Green Version]
  16. Ma, W.; Qian, S. High-Resolution Traffic Sensing with Probe Autonomous Vehicles: A data-driven approach. Sensors 2021, 21, 464. [Google Scholar] [CrossRef]
  17. Del Serrone, G. Analisi Di Floating Car Data (FCD). 2020. Available online: (accessed on 13 January 2022).
  18. Talebpour, A.; Mahmassani, H.S. Influence of connected and autonomous vehicles on traffic flow stability and throughput. Transp. Res. Part C Emerg. Technol. 2016, 71, 143–163. [Google Scholar] [CrossRef]
  19. Verma, D.; Varghese, V.; Jana, A. Applicability of Big Data for transportation planning and management. In Advances in Urban Planning in Developing Nations; Routledge: England, UK, 2021; pp. 99–110. [Google Scholar]
  20. Joe Grengs, L.K.; Wang, X. Using GPS Data to Understand Driving Behavior. J. Urban Technol. 2008, 15, 33–53. [Google Scholar] [CrossRef]
  21. Leduc, G. Road Traffic Data: Collection Methods and Applications. EUR Number Tech. 2008, 47967, 55. Available online: (accessed on 15 December 2021).
  22. Fusco, G.; Colombaroni, C.; Isaenko, N. Short-term speed predictions exploiting big data on large urban road networks. Transp. Res. Part C Emerg. Technol. 2016, 73, 183–201. [Google Scholar] [CrossRef]
  23. Valenti, G.; Liberto, C.; Mastroianni, P. L’importanza dei big data sulla mobilità urbana. Energ. Ambiente Innov. 2016, 42–47. [Google Scholar] [CrossRef]
  24. Wu, X.; Zhu, X.; Wu, G.-Q.; Ding, W. Data mining with big data. IEEE Trans. Knowl. Data Eng. 2013, 26, 97–107. [Google Scholar] [CrossRef]
  25. Dabbas, H.; Fourati, W.; Friedrich, B. Using Floating Car Data in Route Choice Modelling-Field Study. Transp. Res. Procedia 2021, 52, 700–707. [Google Scholar] [CrossRef]
  26. Andersen, C.S.; Reinau, K.H.; Agerholm, N. The Relationship between Road Characteristics and Speed Collected from Floating Car Data. J. Traffic Transp. Eng. 2016, 4, 1–10. [Google Scholar] [CrossRef] [Green Version]
  27. Colombaroni, C.; Fusco, G.; Isaenko, N. Analysis of Road Safety Speed from Floating Car Data. Transp. Res. Procedia 2020, 45, 898–905. [Google Scholar] [CrossRef]
  28. Seo, T.; Kusakabe, T. Probe Vehicle-based Traffic Flow Estimation Method without Fundamental Diagram. Transp. Res. Procedia 2015, 9, 149–163. [Google Scholar] [CrossRef] [Green Version]
  29. Wagner, P. How many Floating Car Data (FCD) are needed for Traffic Management? In Proceedings of the 4th International Symposium Networks for Mobility, Stuttgart, Germany, 25 September 2008; pp. 1–29. Available online: (accessed on 8 January 2022).
  30. Fourati, W.; Dabbas, H.; Friedrich, B. Estimation of Penetration Rates of Floating Car Data at Signalized Intersections. Transp. Res. Procedia 2021, 52, 228–235. [Google Scholar] [CrossRef]
  31. Chase, R.T.; Williams, B.M.; Rouphail, N.M.; Kim, S. Comparative Evaluation of Reported Speeds from Corresponding Fixed-Point and Probe-Based Detection Systems. Transp. Res. Rec. J. Transp. Res. Board 2012, 2308, 110–119. [Google Scholar] [CrossRef]
  32. Karlsson, T. An Observational Study of the Characteristics of Taxi Floating Car Data Compared to Radar Sensor Data. No. 2012:051. 2012. Available online: (accessed on 27 December 2021).
  33. Brockfeld, E.; Lorkowski, S.; Mieth, P.; Wagner, P. Benefits and Limits of Recent Floating Car Data Technology—An Evaluation Study. In Proceedings of the 11th WCTR Conference, Berkeley, CA, USA, 24 June 2007; pp. 24–28. Available online: (accessed on 3 January 2022).
  34. Zhao, N.; Yu, L.; Zhao, H.; Guo, J.; Wen, H. Analysis of traffic flow characteristics on ring expressways in Beijing: Using floating car data and remote traffic microwave sensor data. Transp. Res. Rec. 2009, 2124, 178–185. [Google Scholar] [CrossRef]
  35. Altintasi, O.; Tuydes-Yaman, H.; Tuncay, K. Quality of floating car data (FCD) as a surrogate measure for urban arterial speed. Can. J. Civ. Eng. 2019, 46, 1187–1198. [Google Scholar] [CrossRef]
  36. Kim, S.; Coifman, B. Assessing the Performance of SpeedInfo Radar Traffic Sensors. J. Intell. Transp. Syst. 2016, 21, 179–189. [Google Scholar] [CrossRef]
  37. Croce, A.; Musolino, G.; Rindone, C.; Vitetta, A. Estimation of Travel Demand Models with Limited Information: Floating Car Data for Parameters’ Calibration. Sustainability 2021, 13, 8838. [Google Scholar] [CrossRef]
  38. Hossain, F.; Medina, J.C. Effects of Operating Speed and Traffic Flow on Severe and Fatal Crashes using the U.S. Road Assessment Program Methodology and Field Data Verification. Transp. Res. Rec. J. Transp. Res. Board 2020, 2674, 30–41. [Google Scholar] [CrossRef]
  39. Park, E.S.; Fitzpatrick, K.; Das, S.; Avelar, R. Exploration of the relationship among roadway characteristics, operating speed, and crashes for city streets using path analysis. Accid. Anal. Prev. 2020, 150, 105896. [Google Scholar] [CrossRef]
  40. Transportation Officials. A Policy on Geometric Design of Highways and Streets; AASHTO: Washington, DC, USA, 2011. [Google Scholar]
  41. Tottadi, K.K.; Mehar, A. Operating speed: Review and recommendations for future research. Innov. Infrastruct. Solut. 2021, 7, 67. [Google Scholar] [CrossRef]
  42. Cafiso, S.; Di Graziano, A.; La Cava, G. Actual Driving Data Analysis for Design Consistency Evaluation. Transp. Res. Rec. J. Transp. Res. Board 2005, 1912, 19–30. [Google Scholar] [CrossRef]
  43. Cafiso, S.; Cerni, G. New Approach to Defining Continuous Speed Profile Models for Two-Lane Rural Roads. Transp. Res. Rec. J. Transp. Res. Board 2012, 2309, 157–167. [Google Scholar] [CrossRef]
  44. Fabrizi, V.; Ragona, R. A pattern matching approach to speed forecasting of traffic networks. Eur. Transp. Res. Rev. 2014, 6, 333–342. [Google Scholar] [CrossRef] [Green Version]
  45. Silva, A.B.; Almeida, R.; Vasconcelos, L. A speed model for curves of two-lane rural highways based on continuous speed data. In Transport Infrastructure and Systems, Proceedings of the AIIT International Congress on Transport Infrastructure and Systems; TIS: Rome, Italy; CRC Press: Boca Raton, FL, USA, 2017; pp. 177–183. [Google Scholar]
  46. Javier, F.; Torregrosa, C. New Geometric Design Consistency Model Based on. Accid. Anal. Prev. 2013, 61, 33–42. [Google Scholar]
  47. Zuriaga, A.M.P.; García, A.G.; Torregrosa, F.J.C.; D’Attoma, P. Modeling Operating Speed and Deceleration on Two-Lane Rural Roads with Global Positioning System Data. Transp. Res. Rec. J. Transp. Res. Board 2010, 2171, 11–20. [Google Scholar] [CrossRef]
  48. Hashim, I.H.; Abdel-Wahed, T.A.; Moustafa, Y. Toward an operating speed profile model for rural two-lane roads in Egypt. J. Traffic Transp. Eng. 2016, 3, 82–88. [Google Scholar] [CrossRef] [Green Version]
  49. Cvitanić, D.; Maljković, B. Operating speed models of two-lane rural state roads developed on continuous speed data. Teh. Vjesn.-Tech. Gaz. 2017, 24, 1915–1921. [Google Scholar] [CrossRef]
  50. Osservatorio-Del-Traffico @ Available online: (accessed on 12 January 2022).
  51. Index @ Www.Infoblu.It. Available online: (accessed on 29 November 2021).
  52. Jabari, S.E.; Liu, H. A stochastic model of traffic flow: Gaussian approximation and estimation. Transp. Res. Part B Methodol. 2013, 47, 15–41. [Google Scholar] [CrossRef]
  53. Cantisani, G.; Del Serrone, G. Procedure for the Identification of Existing Roads Alignment from Georeferenced Points Database. Infrastructures 2021, 6, 2. [Google Scholar] [CrossRef]
  54. Cho, W.; Choi, E. A GPS Trajectory Map-Matching Mechanism with DTG Big Data on the HBase System. ACM Int. Conf. Proceed. Ser. 2015, 22–23, 22–29. [Google Scholar] [CrossRef]
  55. Mclaughlin, S.B.; Hankey, J.M. Matching GPS Records to Digital Map Data: Algorithm Overview and Application. NSTSCE; 15-UT-033. 2015. Available online: (accessed on 12 January 2022).
  56. Chen, F.; Shen, M.; Tang, Y. Local Path Searching Based Map Matching Algorithm for Floating Car Data. Procedia Environ. Sci. 2011, 10, 576–582. [Google Scholar] [CrossRef] [Green Version]
  57. Miwa, T.; Kiuchi, D.; Yamamoto, T.; Morikawa, T. Development of map matching algorithm for low frequency probe data. Transp. Res. Part C Emerg. Technol. 2012, 22, 132–145. [Google Scholar] [CrossRef]
  58. Quddus, M.; Ochieng, W.Y.; Noland, R. Current map-matching algorithms for transport applications: State-of-the art and future research directions. Transp. Res. Part C Emerg. Technol. 2007, 15, 312–328. [Google Scholar] [CrossRef] [Green Version]
  59. Xi, L.; Liu, Q.; Li, M.; Liu, Z. Map Matching Algorithm and Its Application. In Proceedings of the Intelligent Systems and Knowledge Engineering (ISKE2007), Chengdu, China, 15–16 October 2007. [Google Scholar] [CrossRef] [Green Version]
  60. Chen, B.Y.; Yuan, H.; Li, Q.; Lam, W.H.K.; Shaw, S.-L.; Yan, K. Map-matching algorithm for large-scale low-frequency floating car data. Int. J. Geogr. Inf. Sci. 2013, 28, 22–38. [Google Scholar] [CrossRef]
Figure 1. Location of the measuring sections along the state roads of the Veneto Region, reported in the legend.
Figure 2. Control units’ speed samples’ distributions: (a) Control unit 1333, AB direction, August 2018; (b) Control unit 490, AB direction, February 2019.
Figure 3. HCD speed samples’ distributions: (a) close to the control unit 1333, AB direction, August 2018; (b) close to the control unit 490, AB direction, February 2019.
Figure 4. Control units vs. HCD speed samples and linear regressions as a function of time: (a) Control unit 1333, AB direction, August 2018; (b) Control unit 490, AB direction, February 2019; (c) Control unit 218, BA direction, February 2019; (d) Control unit 1563, BA direction, August 2018.
Figure 5. HCD and sensors’ speeds’ samples correlation: (a) Control unit 1332, BA direction; (b) Control unit 1333, AB direction; (c) Control unit 490, BA direction; (d) Control unit 218, BA direction.
Figure 6. Cumulative percentage diagram of the obtained values of the coefficients of determination R²: (a) Percentage distribution of the coefficients of determination R² for the whole sample of analyzed data, evaluated for the driving direction AB; (b) Percentage distribution of the coefficients of determination R² for the whole sample of analyzed data, evaluated for the driving direction BA.
Table 1. Example of Control unit traffic survey.

| Date and Time | Lane | Direction | Vehicle Speed |  |
|---|---|---|---|---|
| 1 May 2019 00:00:04 | 1 | A | 62 | 2 |
| 1 May 2019 00:00:06 | 1 | A | 60 | 2 |
| 1 May 2019 00:00:08 | 1 | A | 61 | 2 |
| 1 May 2019 00:00:09 | 1 | A | 66 | 2 |
| 1 May 2019 00:00:15 | 1 | A | 70 | 4 |
| 1 May 2019 00:00:21 | 1 | A | 72 | 8 |
| 1 May 2019 00:00:22 | 2 | D | 66 | 2 |
Table 2. Example of Historical Car Data information.

| ID | Long | Lat | Dir | Speed (km/h) | Date and Time | Signal Quality | Vehicle ID | Vehicle Type |
|---|---|---|---|---|---|---|---|---|
| 1 | 12.0383 | 45.4160 | 326 | 17 | 1 February 2019 06:39 | 1 | 1.23137 × 10¹⁸ | A |
| 2 | 11.6912 | 45.6199 | 191 | 2 | 1 February 2019 09:35 | 1 | 1.23139 × 10¹⁸ | C |
| 3 | 11.8713 | 45.3922 | 166 | 0 | 1 February 2019 12:08 | 1 | 1.23145 × 10¹⁸ | A |
| 4 | 11.8713 | 45.3922 | 166 | 0 | 1 February 2019 12:09 | 1 | 1.23145 × 10¹⁸ | A |
| 5 | 12.3133 | 45.6682 | 185 | 45 | 1 February 2019 18:20 | 1 | 1.23155 × 10¹⁸ | C |
| 6 | 12.3045 | 45.6644 | 199 | 51 | 1 February 2019 18:21 | 1 | 1.23155 × 10¹⁸ | C |
| 7 | 12.1344 | 46.5387 | 311 | 0 | 1 February 2019 09:58 | 1 | 1.23141 × 10¹⁸ | C |
Table 3. HCD sample within the Veneto Region in the three analyzed months.

| Total HCD | 927,733,936 |
|---|---|
| August 2018 | 309,060,029 |
| February 2019 | 315,978,826 |
| May 2019 | 302,695,081 |
Table 4. Values of the representativeness rate of the HCD sample evaluated for each control unit.

| Sensor ID | Road Name | Ratio (‰) |
|---|---|---|
| Average rate |  | 1.4 |

Cantisani, G.; Del Serrone, G.; Peluso, P. Reliability of Historical Car Data for Operating Speed Analysis along Road Networks. Sci 2022, 4, 18.
