A Study of Traffic Emissions Based on Floating Car Data for Urban Scale Air Quality Applications

Urban air quality in cities is strongly influenced by road traffic emissions. Micro-scale models have often been used to evaluate the pollutant concentrations at the scale of the order of meters for estimating citizen exposure. Nonetheless, retrieving emissions information with the required spatial and temporal details is still not an easy task. In this work, we use our modelling system PMSS (Parallel Micro Swift Spray) with an emission dataset based on Floating Car Data (FCD), containing hourly data for a large number of road links within a 1 × 1 km² domain in the city of Rome for the month of May 2013. The procedures to obtain both the emission database and the PMSS simulations are hosted on CRESCO (Computational Centre for Research on Complex Systems)/ENEAGRID HPC facilities managed by ENEA. The possibility of using such detailed emissions, coupled with HPC performance, represents a desirable goal for microscale modeling that can allow such modeling systems to be employed in quasi-real time and nowcasting applications. We compute NOx concentrations obtained by: (i) emissions coming from prescribed hourly modulations of three types of roads, based on vehicle flux data in the FCD dataset, and (ii) emissions from the FCD dataset integrated into our modelling chain. The results of the simulations are then compared to concentrations measured at an urban traffic station.


Introduction
Urban air quality is determined by complex atmospheric patterns, influenced by local emissions, and shaped by the three-dimensional structure of the built environment. So-called "street canyons" (a term frequently used for urban streets flanked by buildings on both sides) tend to trap pollution near the ground, while in more open spaces (parks, squares, residential areas) the pollution levels take the form of an urban background, with an increasing impact of more distant sources [1]. Among the sources of local emission, road traffic is usually the main contributor [2,3]. A study from the European Topic Centre on Air and Climate Change (ETC/ACC) [4], based on a survey among European cities, reported average percentage contributions from road traffic ranging from 40% (21%) to 49% (28%) of nitrogen dioxide (particulate matter) concentrations measured at background stations.
Microscale dispersion models have been used to evaluate pollutant concentrations at high spatial detail, featuring grids with meter-scale resolution in order to estimate citizen exposure more accurately than air quality models operating at larger scales [5,6]. For species mainly driven by local emissions, such as nitrogen oxides (NOx), such a detailed model description of dispersion dynamics requires a coherent emission input, one that reproduces the spatial variability at the single street/stack level and the temporal variability at hourly or even sub-hourly levels. For air quality applications, the calculation of street-level and hourly-level emissions covering large urban areas is not straightforward. Traffic flows and speeds on urban road networks at the street level are usually obtained as long-term averages by traffic assignment models calibrated from observations; as shown in the review by [7], these models are the typical input for Static Emission Models, which are commonly used for transportation planning purposes due to their relative simplicity but are adequate mostly when a high resolution is not required. Therefore, in order to obtain the hourly variations, modulation profiles are needed; these generally come from dedicated measurements or the literature and are not street-specific [8,9]. Although it can provide comprehensive coverage of urban areas for long time periods, this approach has shown limitations in reproducing measured street-level concentrations, namely in their hourly variability, and it can therefore lead to biased estimations of human exposure to pollutant concentrations. More recently, some studies have considered new types of measurements. Video-based systems allow both vehicle flows to be counted and the fleet composition to be retrieved [10], but at a fixed location and therefore with limited spatial coverage.
Remote sensing of actual vehicle positions (FCD, Floating Car Data) can deliver very detailed vehicle passages over wider areas by tagging each single vehicle at very short time steps, e.g., 30 s [11,12]. This allows very detailed vehicle flows on individual streets and at hourly intervals to be derived, improving emission calculations. Jing et al. reported a large part (60 km × 60 km) of the Beijing road network divided into segments according to the traffic speed, with an attribution of the traffic flow and speed for each segment [13]. On the other hand, depending on the area and the period considered, the storage and processing of FCD can require substantial resources, limiting their use on comprehensive urban road networks (e.g., Gately et al. used FCD only for vehicle speeds and in-road sensors for vehicle flows, while Jiang et al. studied the link between traffic speed and volume on a single ring expressway in Beijing) [14,15]. Therefore, there is at the moment no universal solution to evaluate microscale traffic emissions over an extended part of a city at a reasonable cost [8].
In this work, we describe a microscale simulation of NOx concentrations, conducted with the Parallel Micro Swift Spray (PMSS) model and fed by an emission dataset based on FCD, containing hourly data for a large number of road links within the city of Rome for the month of May 2013. All the simulations were performed on the computational resources of the CRESCO (Computational Centre for Research on Complex Systems)/ENEAGRID High Performance Computing infrastructure [16].
The main objective was to evaluate NOx concentrations simulated by PMSS using FCD-based emissions in comparison to NOx concentrations measured at the Magna Grecia urban air quality station. We computed NOx concentrations obtained by: (i) emissions coming from prescribed hourly modulations of three types of roads, based on vehicle flux data in the Floating Car dataset, and (ii) emissions from the Floating Car dataset integrated into our modelling chain. We also present an exploration of the FCD-emissions database, evaluating the feasibility of automatically integrating it into our modelling chain and possible alternative strategies for a more straightforward use as a model input.
Section 2 presents a detailed description of the emission database, the other input data, the modelling chain, and the simulations setup. Section 3 presents the results and discussions with an evaluation of the modelled NOx concentrations against the reference measurements (Section 3.1), and a detailed comparison of the traffic emissions processors used to retrieve the mass emitted starting from the vehicle flux information (Section 3.2). Section 4 presents our conclusions. Some useful additional graphics and definitions are included in Appendices A and B.

Road Traffic STMS Data
Nowadays, the use of massive FCD to extract traffic patterns and travel behaviors occurring in urban areas is extremely appealing [17–19]. It represents a reliable and cost-effective way to gather accurate traffic data over a wide-area road network and thus to improve many applications, such as location-based services [20], urban planning [21], and traffic management [22,23]. Despite this remarkable potential, FCD exploitation in urban transport is still at an early stage compared to other approaches [24,25], particularly due to the fragmented availability of transport/mobility data, institutional barriers, and data privacy/security issues. In this study, we use the FCD held by ENEA's STMS (Systems and Technologies for Sustainable Mobility) laboratory and collected by Octo Telematics to obtain insights into the travel patterns of private cars and to estimate the traffic emissions in the case study. Octo Telematics is a company that offers data analytics for the auto insurance industry and provides other connected user services including vehicle diagnostics, fleet management, road tolling, and real-time monitoring of traffic and environmental conditions [26]. The FCD used in this study represents about 5 percent of the circulating passenger cars in the study area. Similar datasets have been used in the past to calculate vehicle home locations to predict energy-oriented land use [27]. The residential population calculated with FCD was compared to a census population, showing an R-squared value of 0.74; this indicates that the initial FCD dataset was adequately representative of the mobility patterns of the entire road network, though it covers only a fraction of all vehicle movements. Cars were equipped with an OBU (On-Board Unit) that stores GPS measurements (position, heading, speed, quality) and periodically transmits them to the Data Processing Center.
The OBU consists of a GPS receiver, a GPRS transmitter, a 3-axis accelerometer sensor, a battery pack, a mass memory, a processor, and RAM. The OBU stores GPS measurements every 2 km travelled or, alternatively, every 30 s when the vehicle is running along a motorway or some main urban arterials. For each equipped vehicle, we then extracted the list of trips performed and the most likely routes in the network by matching sequences of the positioning data to a digital street map. We reconstructed the route between each OD (origin-destination) pair by applying a map-matching algorithm that incorporates the street network topology, including information on prohibited maneuvers and turn restrictions. The resulting database was then used to evaluate the emitted mass per link per hour using the traffic emission processor ECOTRIP [28].
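As a minimal sketch of the aggregation that such a database implies, map-matched passages can be counted per road link and per hour. The record layout (a link identifier plus a timestamp) and the function name are assumptions made for illustration; they do not describe the actual STMS database schema, and no expansion from the equipped sample to the full fleet is attempted.

```python
from collections import Counter
from datetime import datetime


def hourly_link_counts(passages):
    """Count vehicle passages per (link_id, hour) pair.

    `passages` is an iterable of (link_id, timestamp) tuples, one per
    map-matched passage on a road link (hypothetical record layout).
    """
    counts = Counter()
    for link_id, timestamp in passages:
        # Truncate the timestamp to the start of its hour.
        hour_start = timestamp.replace(minute=0, second=0, microsecond=0)
        counts[(link_id, hour_start)] += 1
    return counts


# Toy passages on two made-up links during the morning of 2 May 2013.
passages = [
    ("link_A", datetime(2013, 5, 2, 8, 5)),
    ("link_A", datetime(2013, 5, 2, 8, 40)),
    ("link_A", datetime(2013, 5, 2, 9, 10)),
    ("link_B", datetime(2013, 5, 2, 8, 59)),
]
counts = hourly_link_counts(passages)
```

Keying the counter on the truncated hour keeps the link and time dimensions together, which matches the per-link, per-hour granularity of the emitted mass described above.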

Emission Processors
In this work, we performed simulations using emission data obtained with two different emission processors: TREFIC [29], which is the native traffic emissions pre-processor for the model PMSS, and ECOTRIP. TREFIC is based on COPERT 4 [30] methodology for the calculation of the road vehicle emission factors. In order to calculate emissions, TREFIC takes into account vehicle type, fuel consumption, average travelling speed, and road type. TREFIC performs a reading and processing cycle for each road link. The input consists of 4 groups of files, related to the road network (geometry, speed, and volume of traffic flows, for each link of the network), vehicle fleet (split into COPERT 4 categories, for each of the road types or driving cycles), time modulations (tables of values which allow the time profiles of emissions to be quantified) and COPERT 4 methodology emission factors. Starting from the input information and for each road link, TREFIC calculates the emission factors (EFs) for each road type. These emission factors depend on fuel type, vehicle type, age and maintenance, road average speed, and driving cycle. If specific information is available, EFs can take into account the ambient average temperature (cold start and evaporative emissions), the average slope of the road, and the actual average load (for freight vehicles). Time modulation files, in text format, contain coefficients representing the time modulation factors of flow, speed, and temperature. These files allow a modulated input to be generated to the dispersion model and they support the user in generating time emission profiles for traffic flows, speed, and temperature. At each run, TREFIC generates at least three standard output files, containing aggregated emissions according to the temporal step in the input file, respectively, for traditional pollutants, particulate matter (PM) species, and evaporative losses.
ECOTRIP (Emission and Consumption Calculation Software Based on Trip Data Measured by Vehicle On-Board Unit) [28,31,32] is software developed by STMS to estimate atmospheric pollutant emissions (Carbon Monoxide, NOx, Non-Methane Hydrocarbons, and PM), greenhouse gas emissions (Carbon Dioxide), and fuel consumption produced by vehicle fleets. ECOTRIP is capable of carrying out a precise and georeferenced estimate of fuel consumption and polluting emissions produced by any type of vehicle in circulation equipped with on-board units. The innovative nature and originality of the ECOTRIP software derives from the ability to use data on actual routes, on the driving cycle, and on the characteristics of the vehicle, as well as the ability to operate on different levels of aggregation and detail and potentially in real time. ECOTRIP can be a valid tool to support vehicle traffic monitoring and mobility management activities.
ECOTRIP has been widely used and updated within the Electric System Research Programme supported by the Italian Ministry of Economic Development [28].
The evaluation procedures consider the speed-dependent hot emission factors described in the COPERT 4 guidebook [33] which were obtained from several experimental measurements collected in different European countries. These factors vary according to the fuel supply, the European Emission Standards, the engine size for passenger cars, and the weight for commercial vehicles and buses. ECOTRIP has been updated in order to include recent European Emission Standards (Euro 5 and Euro 6), as well as hybrid, electric, light-commercial and heavy-duty vehicles, buses, mopeds, and motorcycles. In addition to hot running emissions, ECOTRIP accounts for the "cold start" emissions, which occur when engines and catalysts are not (fully) warmed up and operate in a non-optimal condition. The estimation of the extra cold emissions refers to the methodology developed by INRETS (Institut national de recherche sur les transports et leur sécurité), which is based on several experimental tests performed in different European laboratories, as described in the ARTEMIS European Project [34]. In this study, ECOTRIP estimated the pollutant emissions from equipped cars. Estimates were carried out for each segment located between two consecutive GPS traces of a journey, combining the vehicle features together with the geographical information of the routes, then map-matched to the static road network.
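The per-segment estimate can be illustrated with a short sketch: the mean speed between two consecutive GPS traces selects a hot emission factor, which is then multiplied by the distance travelled. The emission-factor curve below is a made-up placeholder, not a COPERT 4 function, and the function names are hypothetical; cold-start extra emissions and vehicle categories are left out.

```python
def placeholder_hot_ef_g_per_km(speed_kmh):
    """Hypothetical speed-dependent hot emission factor in g/km.

    The coefficients are invented for illustration only and are NOT
    taken from COPERT 4; speed is clamped to an assumed validity range.
    """
    v = min(max(speed_kmh, 10.0), 130.0)
    return 0.9 - 0.012 * v + 0.0001 * v * v


def segment_emission_g(distance_km, duration_s, ef=placeholder_hot_ef_g_per_km):
    """Hot emission (g) on the segment between two consecutive GPS traces."""
    if distance_km <= 0 or duration_s <= 0:
        return 0.0
    mean_speed_kmh = distance_km / (duration_s / 3600.0)
    return ef(mean_speed_kmh) * distance_km


# A 2 km segment covered in 240 s, i.e. a mean speed of 30 km/h.
example_emission = segment_emission_g(2.0, 240.0)
```

Passing the emission-factor function as an argument keeps the segment logic separate from the factor tables, so a real speed-dependent curve could be substituted without touching the rest.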

Dispersion Model
Parallel Micro-SWIFT-SPRAY (PMSS) is a modelling system which reproduces primary pollutant transport and dispersion at the microscale (i.e., at a resolution of meters), and calculates the dry and wet deposition of airborne chemical species. PMSS is the parallelized version of the MSS model suite, which is fully described in several papers [35–37]. Here we provide a summary of its main characteristics, schematically represented in Figure 1.
The system has two pre-processing phases for the meteorological and the emission data, respectively. These modules prepare the input for the main processing models, the meteorological driver PSWIFT, an analytically modified mass consistent interpolator over complex terrain [38], and the three-dimensional Lagrangian particle dispersion model, PSPRAY. In the meteorological pre-processing phase, the meteorological data and the turbulence parameters provided at a local or regional scale are elaborated by using SURFPRO, the surface-atmosphere interface processor [39,40], to generate the input files required to run PSWIFT at a much higher spatial resolution (generally of the order of a few meters). In the emission pre-processing phase, the emission data and their spatial and temporal variations are used by the emission manager TREFIC to produce the emission input for PSPRAY. Different types of emission sources, such as a point, area, or line, can be simulated.
Here we considered only line sources to study the primary pollutant dispersion at the urban scale caused by traffic. Obstacles, such as buildings, are directly considered in the model and are represented as filled cells in the meteorological field [35,36]. PSWIFT produces mass-consistent wind fields using data from a dispersed meteorological network or from simulated meteorological data at a lower resolution. PSPRAY calculates the pollutant concentration by means of "virtual" particles that carry a portion of the pollutant mass emitted by the sources. The velocity of the particles is calculated from a mean velocity component, defined by the local wind computed by PSWIFT, and a stochastic velocity component, representing atmospheric turbulence. PSPRAY can compute mean and instantaneous concentrations on a three-dimensional grid defined by the user, differentiating the calculation by both "chemical species" and "source". In its standard form, PSPRAY simulates only the dispersion of atmospheric compounds in the urban environment and does not account for transformations due to chemical reactions. Recent developments have reported the implementation of several chemical models into PSPRAY to consider chemical reactions occurring at the urban scale [36]. However, in the present work, we considered only the dispersion characteristics of the PMSS system, neglecting chemical transformations. The modelling system PMSS is commercially available software developed by ARIANET [41]. The codes PSWIFT and PSPRAY used here, in their versions PSWIFT-2.1.1 and PSPRAY-3.7.3, were compiled with an Intel16 compiler, using the Open MPI library.
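The split of the particle velocity into a mean and a stochastic component can be sketched with a textbook Langevin-type update for homogeneous turbulence. This is not the PSPRAY scheme: the exponential memory term, the Lagrangian time scale `t_lagr`, the velocity standard deviations `sigma`, and all function names are assumptions for illustration, and buildings, ground reflection, and inhomogeneous turbulence are ignored.

```python
import math
import random


def advance_particle(pos, fluct, mean_wind, sigma, t_lagr, dt, rng):
    """Advance one particle by dt seconds.

    Velocity = mean wind + turbulent fluctuation, where the fluctuation
    keeps a memory exp(-dt / t_lagr) of its previous value and receives
    a Gaussian kick scaled by sigma (one value per axis).
    """
    a = math.exp(-dt / t_lagr)
    b = math.sqrt(1.0 - a * a)
    new_fluct = tuple(
        a * f + b * s * rng.gauss(0.0, 1.0) for f, s in zip(fluct, sigma)
    )
    new_pos = tuple(
        p + (u + f) * dt for p, u, f in zip(pos, mean_wind, new_fluct)
    )
    return new_pos, new_fluct


def track_particle(start, mean_wind, sigma, t_lagr, dt, n_steps, seed=0):
    """Follow a single particle for n_steps and return its final position."""
    rng = random.Random(seed)
    pos, fluct = start, (0.0, 0.0, 0.0)
    for _ in range(n_steps):
        pos, fluct = advance_particle(pos, fluct, mean_wind, sigma, t_lagr, dt, rng)
    return pos


# With zero turbulence the particle is simply advected by the mean wind.
final_pos = track_particle((0.0, 0.0, 1.5), (2.0, 0.0, 0.0), (0.0, 0.0, 0.0), 30.0, 1.0, 10)
```

In a full model, the mean wind would be interpolated from the meteorological driver at the particle position at every step, and concentrations would be obtained by counting the mass carried by the particles found in each grid cell.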

Simulation Setup
We performed simulations for a period of 29 days, from 2 May to 30 May 2013. We used the hourly fluxes of passenger cars provided by the STMS database described in Section 2.1. We set a 2 × 2 km² horizontal domain centered on the urban traffic air quality (AQ) station Magna Grecia. The domain is indicated by the red square in Figure 2; it covers an adequate fraction of the emissions of the city while ensuring that the AQ station is far enough from the domain border, where the model uncertainty is generally higher. The spatial resolution of 3 m was chosen to ensure the highest resolution possible while keeping the run time within acceptable values, and the domain was composed of 667 × 667 grid points. For the vertical grid, we chose the following 25 levels above the ground: 0, 1. We conducted the simulations on the CRESCO/ENEAGRID High Performance Computing infrastructure funded by ENEA [25]. As the model system was structured, each 29-day simulation consisted of 29 single model runs, each simulating 24 h. The restart option was applied: for each simulated day, the pollutant concentrations calculated for the last hour were saved and used as initial conditions for the following run.
The simulation duration depended mainly on the number of emitted particles, which in turn depended on the concentration resolution required, that is, the concentration contribution given by a single particle in a concentration cell. In addition, the meteorological conditions can also play a key role, since the mean flow and turbulence determine how many particles remain inside the computational domain. In our simulations, using a concentration resolution of 0.5 µg/m³ for the NOx species and 528 cores for PSPRAY, which represents the most CPU-demanding part of the system, the CPU time per core per simulated day was 8148 s.
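To put these figures in perspective, the following back-of-the-envelope sketch converts them into core-hours and checks the domain size implied by the grid. It assumes that the reported CPU time per core applies equally to all 528 cores and to every simulated day, which the text does not state explicitly; the variable names are illustrative.

```python
CPU_SECONDS_PER_CORE_PER_DAY = 8148  # reported above
PSPRAY_CORES = 528                   # reported above
SIMULATED_DAYS = 29                  # 2 May to 30 May 2013
GRID_POINTS_PER_SIDE = 667           # horizontal grid
RESOLUTION_M = 3                     # horizontal resolution


def core_hours(seconds_per_core, cores):
    """Total core-hours if every core is busy for the same time (assumption)."""
    return seconds_per_core * cores / 3600.0


core_hours_per_day = core_hours(CPU_SECONDS_PER_CORE_PER_DAY, PSPRAY_CORES)
core_hours_total = core_hours_per_day * SIMULATED_DAYS
domain_side_m = GRID_POINTS_PER_SIDE * RESOLUTION_M

print(f"{core_hours_per_day:.2f} core-hours per simulated day")
print(f"{core_hours_total:.2f} core-hours for {SIMULATED_DAYS} days")
print(f"{domain_side_m} m domain side")
```

Under these assumptions, one simulated day costs roughly 1195 core-hours, and the 667 grid points at 3 m spacing span about 2 km per side.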

Input Meteorological Data
Meteorological data used to feed the diagnostic model PSWIFT were provided by the Weather Research and Forecasting (WRF) mesoscale model [42], using ERA5 reanalysis [43] as boundary conditions at a 28 km resolution and a 3-hourly time-step. WRF simulations, performed using model version 3.9.1.1, were based on two-way nesting over 3 grids: the coarsest one covering the whole of Italy at a 9 km horizontal resolution, an intermediate domain at a 3 km resolution over the Lazio region, and finally the target domain at a 1 km resolution over the city of Rome.
WRF parameterizations adopted for the simulation are summarized in Table 1. Hourly data of the meteorological fields were used by PSWIFT to reconstruct the three-dimensional wind, temperature, and turbulent flow at a 3 m resolution.

Emissions
The emission database described earlier includes the vehicle hourly fluxes determined from FCD, so the vehicle fluxes in the database refer only to the circulating passenger cars. However, it was possible to estimate the emissions of the remaining part of the circulating fleet by considering the vehicle population in the city of Rome for the year 2013. Data on the circulating fleet needed for the emission estimate, such as vehicle type and fuel technology distribution, were retrieved from public registers of vehicle licenses [51]. Using this information, we calculated a scaling factor for each of the remaining vehicle categories not included in the FCD database (motorcycles, light duty vehicles, heavy duty vehicles), in order to extrapolate their fluxes starting from the hourly fluxes of the passenger cars. These factors led to the vehicle fleet composition shown in Table 2. To calculate the emissions in terms of mass per hour, it was necessary to define at least one daily modulation profile for each vehicle category. TREFIC in fact requires as input the total flux per vehicle category and a separate time modulation profile. This allows for a variety of approaches regarding the time modulations (i.e., top-down emission definitions). Given the limitation on the number of input hourly profiles, a detailed inspection of the FCD was necessary to estimate the variability and the statistical significance of the modulation profiles computed from the emission database, with the aim of identifying a small set of representative profiles.
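The extrapolation from passenger cars to the other vehicle categories can be sketched as a ratio of fleet shares. The shares below are invented placeholders, not the values of Table 2, and the assumption that each category's flux is proportional to its registered share relative to passenger cars is one possible reading of the scaling described above.

```python
# Hypothetical fleet shares for illustration only (NOT the Table 2 values).
HYPOTHETICAL_FLEET_SHARES = {
    "passenger_cars": 0.70,
    "motorcycles": 0.20,
    "light_duty": 0.07,
    "heavy_duty": 0.03,
}


def scaling_factors(shares, reference="passenger_cars"):
    """Ratio of each category's share to the reference category's share."""
    ref_share = shares[reference]
    return {
        category: share / ref_share
        for category, share in shares.items()
        if category != reference
    }


def extrapolate_fluxes(car_flux, factors):
    """Estimate the flux of the other categories from the passenger-car flux."""
    return {category: factor * car_flux for category, factor in factors.items()}


factors = scaling_factors(HYPOTHETICAL_FLEET_SHARES)
# 700 passenger cars per hour on a link (toy value).
other_fluxes = extrapolate_fluxes(700.0, factors)
```

A registry-based ratio of this kind ignores differences in how much each vehicle category is actually driven, so it should be read as a first-order estimate.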
The study of the most representative modulation profiles within the database was limited to the road links in close proximity to the monitoring station, shown in Figure 3, which were most likely to influence the hourly variability of the simulated NOx concentrations at the station.
The area examined covered about 300 × 300 m² and included 24 street links. 2 May was chosen as the reference day, and each hourly profile was scaled to its total daily flux. From the database inspection, 3 average modulation profiles were determined, depending on the range of total flux, shown in Table 3. The profiles are shown in Figure 4. They had similar overall behaviors, indicating a decrease in traffic during the night and an increase during the day, but they differed in the hours at which the peaks were located:
- Profile 1 (prof1): shows a steady increase in traffic from the very early morning to midday, followed by a decrease that was slower than the morning increase;
- Profile 2 (prof2): shows 3 peaks, one in the morning at 10:00, one at 12:00-13:00, and the other at 18:00;
- Profile 3 (prof3): shows two relevant peaks, one at 13:00 and the other at 18:00.
These profiles refer to 2 May. They were averaged over a few road links around the station and are therefore representative of the traffic modulation in that particular location. Applying these profiles for the modulation of the whole dataset was certainly a significant approximation, the degree of which was tested through dedicated model runs as follows.
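The construction of the modulation profiles can be sketched in three steps: normalise each link's hourly fluxes by its daily total, average the normalised profiles within a flux class, and assign a class to each link from its daily total. The class boundaries below are invented placeholders (Table 3 holds the actual ranges), as is the mapping of class names to flux ranges.

```python
# Hypothetical upper bounds of daily flux per class (NOT the Table 3 values).
HYPOTHETICAL_CLASSES = [
    (1000.0, "prof1"),
    (5000.0, "prof2"),
    (float("inf"), "prof3"),
]


def normalise_profile(hourly_flux):
    """Scale hourly fluxes so that they sum to 1 over the day."""
    total = sum(hourly_flux)
    if total == 0:
        return [0.0] * len(hourly_flux)
    return [value / total for value in hourly_flux]


def mean_profile(profiles):
    """Element-wise mean of several normalised profiles of equal length."""
    count = len(profiles)
    return [sum(hour_values) / count for hour_values in zip(*profiles)]


def assign_profile(daily_flux, classes=HYPOTHETICAL_CLASSES):
    """Return the profile name of the first class whose bound exceeds the flux."""
    for upper_bound, name in classes:
        if daily_flux < upper_bound:
            return name
    raise ValueError("no class matches the daily flux")


# Toy example with 4 "hours" instead of 24 to keep it short.
link_a = normalise_profile([10.0, 30.0, 40.0, 20.0])
link_b = normalise_profile([20.0, 20.0, 40.0, 20.0])
average = mean_profile([link_a, link_b])
```

Because each normalised profile sums to 1, multiplying it by a link's total daily flux recovers an hourly flux series, which matches the total-plus-modulation form of input that TREFIC is described as requiring.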
Validation against the observed concentrations was not the only goal of this study. A very interesting feature of the ENEA STMS emission database is its potential to deliver FCD-based emissions in near real-time. The possibility of incorporating this emission input in our PMSS modeling chain represents a foreseeable development of our modeling tools and needs to be investigated.
For these reasons, we performed different simulations with the different emission configurations that are listed in Table 4 and described hereinafter:
- Sim 1: we used the passenger car traffic fluxes from the FCD database, the modulation profiles shown in Figure 4, and the relative percentages listed in Table 2 to calculate with TREFIC the traffic emissions due to passenger cars, motorcycles, heavy duty vehicles, and light duty vehicles to be used as input to PMSS;
- Sim 2: this simulation was similar to Sim 1, but the calculation was performed only for passenger cars;
- Sim 3: we used as the input for PMSS the emissions already present in the FCD database in terms of emitted mass per unit time. These emissions were calculated using the emission processor ECOTRIP considering only passenger car traffic fluxes.
Among these simulations, only Sim 1 could be compared to the observations, having a complete coverage of the different types of emitting vehicles. However, Sim 2 and Sim 3 offered an interesting opportunity to compare the calculations of two different emission processors and to test the time modulation assumption described earlier. In fact, Sim 2 and Sim 3 differed for two reasons:

• The operation of translating fluxes into emissions was completed using two different traffic processors, which despite being both based on COPERT 4 methodologies, still showed some differences (not shown here);
• Different hourly time modulations were used: in Sim 2 they followed Figure 4 profiles and were assigned following Table 3, while in Sim 3 they pertained to the variations of emissions in the database, and were therefore potentially different for each road link.
The comparison between Sim 2 and Sim 3 allowed the influence of the approximation of the time modulations to be understood.

NOx Background
Since our simulation only took into account primary NOx emitted in the study domain, to analyze how well our model reproduced the total NOx concentrations, we needed to estimate the background contribution, i.e., the NOx entering the domain from outside. One possible way to do this was to use measurements of the urban background NOx concentrations outside the domain, assuming that they were not influenced by sources internal to the domain (to avoid a double counting of emissions) [52,53]. In particular, in this study, the urban NOx background was calculated as the average of the NOx concentrations measured by the background air quality reference stations located around the simulation area, which are shown in Figure 5. For these urban background AQ stations, we considered the data provided by the European Environment Agency Air Quality portal [54] for May 2013, focusing on the pollutant "NOx as NO2". By using R-cran base packages [55–58], we calculated the May 2013 time series of the main statistical parameters (mean and percentile values of the distribution) from the concentrations of all the background stations. Figure 6 reports the average and median background concentrations (panel a) and the average, minimum, and maximum values of the background NOx concentrations (panel b). We also included the NOx concentration measured at the urban traffic AQ station of Magna Grecia, as shown in panel b of Figure 6.
The NOx concentration measured at Magna Grecia is always higher than the background, with an average percentage difference of 39% and an absolute difference of 34 µg/m³ between the monthly average values. Therefore, it is reasonable to assume that the concentrations observed at the Magna Grecia station are the sum of a background component and an additional component generated by local emissions in the immediate surroundings of the station. This background component was added, hour by hour, to the simulated NOx concentration to compare the results of Sim 1 with the observations. For Sim 2 and Sim 3, since the simulated NOx was not comparable with the NOx observed at the urban traffic station, the NOx background concentration was not added.
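The hour-by-hour addition of the background can be sketched as follows: the background for each hour is the mean over the available background stations, and it is added to the simulated local contribution. Treating missing station values as `None` and skipping them in the mean is an assumption of this sketch, as are the function names.

```python
def mean_ignoring_missing(values):
    """Mean of the values that are not None, or None if all are missing."""
    valid = [value for value in values if value is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)


def add_background(simulated, station_series):
    """Add the station-averaged background to the simulated series.

    `simulated` holds one simulated concentration per hour and
    `station_series` holds one equally long list per background station.
    """
    totals = []
    for hour, local in enumerate(simulated):
        background = mean_ignoring_missing(
            [series[hour] for series in station_series]
        )
        totals.append(None if background is None else local + background)
    return totals


# Two hours, two background stations, one missing value (toy numbers).
total_nox = add_background([30.0, 50.0], [[40.0, None], [60.0, 20.0]])
```

Averaging only over the stations reporting in a given hour avoids dropping that hour from the comparison, at the cost of a background estimate based on fewer stations.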

Results and Discussion
In the following paragraphs, detailed comparisons between the observations and Sim 1 and between Sim 2 and Sim 3 are reported in Sections 3.1 and 3.2, respectively. Figure 7 shows the monthly average of the NOx concentration at ground level (1.5 m height). In this figure we can see that the domain includes several busy roads, characterized by an average NOx concentration between 90 and 120 µg/m³, and that the road next to the monitoring station is characterized by a significant traffic intensity with average NOx concentrations around 90 µg/m³. Since the dispersion model has a very high spatial resolution (3 m), the high spatial variability of NOx concentrations near the road traffic emission sources can be appreciated. Concentrations are usually higher near the street central axes and lower towards the borders, with sharp spatial gradients in line with similar modelling studies in urban street canyons [5]. Concentrations vary over a wide range in different parts of the domain, reflecting local traffic flows. The comparison between the hourly observed and modelled NOx concentrations at the Magna Grecia AQ station is shown in Figure 8, while Figure 9 shows the absolute value of the daily fractional bias and the R-squared linear coefficient, whose definitions are reported in Appendix B. Here, we used the absolute value of the daily fractional bias as we were interested in discussing its magnitude rather than its sign. In general, Sim 1 showed a good agreement with the hourly NOx concentration variation (Figure 8), with an average fractional bias of 0.2 across the period. This was confirmed by the daily fractional bias, which showed values below 20% for half of the days (Figure 9) and above 30% for only 7 days. Higher discrepancies are indicated by low values of R-squared. Generally, for almost half of the simulated days, R-squared values larger than 0.4 were found.
The low correlations found on days with a low fractional bias can be ascribed to the three chosen modulations, which were not always the most representative of the vehicle flow variability across all 29 days. These comparisons nevertheless show that the daily fractional bias was generally low, indicating that the background concentrations were fairly well represented in the modelled totals and that the reduced set of modulations had little influence on most of the daily averages. Figure 10a shows a regression plot of the modelled vs. observed NOx concentrations for the whole simulated period, with an R-squared coefficient of 0.35, indicating a significant influence of the few days with very low correlations. Daily plots for selected days with a mean fractional bias lower than 20%, in Figure 10b-d, show that the daily correlation could reach significant values.
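A day-by-day correlation of this kind can be sketched by splitting the hourly series into 24 h blocks and computing a squared Pearson correlation for each block. Taking R-squared to be the squared Pearson coefficient is an assumption here; the definition given in Appendix B remains the reference, and the function names are illustrative.

```python
def r_squared(observed, modelled):
    """Squared Pearson correlation between two equally long series."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_mod = sum(modelled) / n
    cov = sum((o - mean_obs) * (m - mean_mod) for o, m in zip(observed, modelled))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_mod = sum((m - mean_mod) ** 2 for m in modelled)
    if var_obs == 0 or var_mod == 0:
        return float("nan")  # undefined for a constant series
    return cov * cov / (var_obs * var_mod)


def daily_r_squared(observed, modelled, hours_per_day=24):
    """R-squared for each complete day of two hourly series."""
    scores = []
    for start in range(0, len(observed) - hours_per_day + 1, hours_per_day):
        stop = start + hours_per_day
        scores.append(r_squared(observed[start:stop], modelled[start:stop]))
    return scores


# Two toy days in which the model is an exact linear function of the observations.
toy_obs = [float(hour) for hour in range(1, 49)]
toy_mod = [2.0 * value + 5.0 for value in toy_obs]
toy_scores = daily_r_squared(toy_obs, toy_mod)
```

Since R-squared is insensitive to a constant offset or scale factor, a day can combine a high R-squared with a large bias, or the reverse, which is why the two indicators are discussed together above.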

PMSS Traffic Emission Simulation (Sim 1)
In general, there are no universally accepted rules to evaluate the performance of a model in the field of air quality. The metrics to use and the acceptance levels to consider for such a task are usually a matter of debate [58]. Recently, a growing number of studies have used the metrics involved in the general acceptance criteria for dispersion model evaluation proposed by the authors of [59] for urban dispersion modelling. Among these studies, a recent paper [60] applied these criteria to evaluate WRF-Chem NOx concentration simulations to be used as background for a Micro Swift Spray simulation.
The acceptance criteria for dispersion model evaluation are based on the following metrics: Fractional Bias (FB); Normalized Mean Squared Error (NMSE); Fraction of simulations within a factor of two of the measurements (FAC2); and Normalized Absolute Difference (NAD). Their definitions, following the work in [59], are listed in Appendix B. A model meets these acceptance criteria if the aforementioned indicators satisfy the values reported in Table A1 for urban and rural air quality stations.
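For readers who wish to reproduce such indicators, the sketch below implements the four metrics in the form commonly used in the dispersion-model evaluation literature. The definitions in Appendix B remain the reference; in particular, the sign convention chosen here for the fractional bias (observed minus modelled) and the exclusion of non-positive observations from the factor-of-two count are assumptions.

```python
def _mean(values):
    return sum(values) / len(values)


def fractional_bias(observed, modelled):
    """FB = (mean_obs - mean_mod) / (0.5 * (mean_obs + mean_mod))."""
    mean_obs, mean_mod = _mean(observed), _mean(modelled)
    return (mean_obs - mean_mod) / (0.5 * (mean_obs + mean_mod))


def normalised_mean_squared_error(observed, modelled):
    """NMSE = mean((obs - mod)^2) / (mean_obs * mean_mod)."""
    squared = [(o - m) ** 2 for o, m in zip(observed, modelled)]
    return _mean(squared) / (_mean(observed) * _mean(modelled))


def fraction_within_factor_two(observed, modelled):
    """Fraction of pairs with 0.5 <= mod / obs <= 2 (obs > 0 only)."""
    pairs = [(o, m) for o, m in zip(observed, modelled) if o > 0]
    hits = sum(1 for o, m in pairs if 0.5 <= m / o <= 2.0)
    return hits / len(pairs)


def normalised_absolute_difference(observed, modelled):
    """NAD = mean(|obs - mod|) / (mean_obs + mean_mod)."""
    absolute = [abs(o - m) for o, m in zip(observed, modelled)]
    return _mean(absolute) / (_mean(observed) + _mean(modelled))


# Toy series in which the model doubles every observation.
obs = [10.0, 20.0, 30.0]
mod = [20.0, 40.0, 60.0]
fb = fractional_bias(obs, mod)
nmse = normalised_mean_squared_error(obs, mod)
within_two = fraction_within_factor_two(obs, mod)
nad = normalised_absolute_difference(obs, mod)
```

In this toy case every modelled value lies exactly on the factor-of-two boundary and is counted as a hit, which illustrates that the factor-of-two fraction alone can hide a systematic bias that FB and NAD reveal.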
The performance of our model is reported in Table 5. The comparison between these values and the acceptance criteria indicated a good agreement between our simulation and the observations.
We also point out that Oldrini et al. 2017 [34] considered satisfactory a simulation in which 68% of the predicted concentrations were within a factor of two, based on one of the acceptance criteria cited previously, whereas we obtained 87%, as shown in Table 5. Figure 11 shows a quantile-quantile plot of the modelled vs. observed NOx concentrations, which indicates that both the measured and modelled values follow the same statistical distribution except for values higher than 110 µg/m³, where the model seemed to overpredict the NOx concentration. This feature was already noted in Villani et al., 2021 [61], who presented a comparison between the NOx concentrations simulated by PMSS and the concentrations observed in a street canyon in Modena during a field campaign. In that study, the Q-Q plot was very similar and the value at which the simulated and observed distributions started to significantly differ was above 50 µg/m³. At the moment, no clear explanation for these differences has been identified. The distribution of the differences between the observed and simulated NOx concentrations was generally normal, and slightly skewed towards positive values. Figure 12 shows the histogram of the hourly percentage difference combined with the cumulative probability of occurrence, indicating that more than 50% of the differences were between −10% and 20%. The R-squared values corresponding to some of these cases are shown in more detail in Figure 10. The results we show here are comparable to those obtained with dedicated vehicle flow measurements during a field campaign [61].

Comparison between Emission Processors (Sim 2 and Sim 3)
In this section, we present the comparison between the two emission processors TREFIC and ECOTRIP. Figure 13 shows generally good agreement between the two simulations, except for a general overestimation in Sim 3 at high NOx concentrations. The percentage differences between Sim 2 and Sim 3 were mostly between −30% and 20%, as shown in Figure A2 of Appendix A. On average, the percentage difference between the NOx concentrations of Sim 2 and Sim 3 changed with the time of day. In Figure 14 we mapped the period average of hourly concentrations at 19:00, which represented typical high traffic conditions. We can observe that the percentage differences were often limited to values between −10% and 10% (Figure 14, panel b). On the other hand, and as expected, these differences could be significant at times when the NOx concentrations were lower. Therefore, depending on the aim of the study, the choice of the time modulation used as input can have a significant impact on the simulated concentrations, and the availability of an emission database with highly detailed hourly modulations possibly represents a significant advantage.
Once again, we find it necessary to point out that in these comparisons between Sim 2 and Sim 3, the NOx concentrations were generally lower than those shown in Section 3.1 for two reasons: fewer vehicle types were accounted for and no background evaluation was added to the final concentrations that were compared.

Conclusions
In this work, we applied FCD data for modelling NOx concentrations at a microscale level in Rome. We first used the hourly vehicle fluxes to generate our best estimate of the NOx concentration at the location of the Magna Grecia air quality station (Sim 1) from 2 to 30 May 2013. We made the assumption that three averaged time modulation profiles were sufficient to describe the variability within the database, with the purpose of comparing the simulated concentrations to those measured at the air quality station. The comparison against the observations showed an acceptable agreement, with the daily mean fractional bias below 20% for almost half of the days and above 30% for only 7 out of 29 days. This is an encouraging result that shows that we can successfully incorporate high resolution vehicle circulation data into our microscale modelling suite. We calculated the FB, NMSE, FAC2, and NAD statistical indicators, the values of which satisfied the acceptance criteria [59] for both rural and urban air quality stations.
The innovative aspect of this work is the use of FCD computed emissions to feed our urban scale model. Although extremely appealing, the use of massive FCD to extract traffic patterns and travel behaviors in urban areas is still at an early stage compared to other sectors, particularly due to the fragmented availability of transport/mobility data, institutional barriers, and data privacy/security issues. As a consequence, the dataset used in this study was based on data collected by only 5% of the circulating passenger cars. Extrapolating this dataset to the entire fleet could represent an important approximation. For these reasons, having 87% of the simulated values within a factor of two of the measurements represents a remarkable success, even if the linear correlations were poor, with only a few days above 0.6.
Finally, the kind of agreement we found in this study was very similar to what we found in a recently published study using PMSS in the city of Modena, where traffic flows and NOx observations were available during a measurement campaign [61].
To have an indication of the effect of the simplified hourly profiles of vehicle flows on the simulated NOx concentrations, a test was made in which we used the FCD flows of passenger cars to compute the NOx emissions with TREFIC in the way described in Section 2.4.2 (Sim 2), and we compared the resulting NOx concentrations with the ones obtained using ECOTRIP emissions as input to PSPRAY (Sim 3). The added value of comparing Sim 2 (few modulation profiles for all the road links) vs. Sim 3 (each road link with its own modulation) lies in the study of the influence of the modulation profiles on the simulated NOx concentrations. These two simulations showed very similar concentrations at Magna Grecia, and their differences were often between −30% and 20%. This further study allowed the direct implementation of this database into our microscale modelling suite, avoiding the assumption on the time modulations and providing a valid tool to potentially enable the use of PMSS with emission inputs that can be provided in quasi-real time.
Future developments of this work could involve the use of more FCD data in our PMSS modelling chain. Moreover, if the availability of FCD data increases and is less fragmented, other pollutants and longer time scales could be explored.

its staff [16]. CRESCO/ENEAGRID High Performance Computing infrastructure is funded by ENEA, the Italian National Agency for New Technologies, Energy and Sustainable Economic Development and by Italian and European research programs, see http://www.cresco.enea.it/english for information.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A
This section includes some material supplemental to the discussion in the main sections.

Appendix A.1. From ECOTRIP Data to PMSS Emission Input
To use the emissions calculated by ECOTRIP, it was necessary to adapt them to the PMSS input format. The information on the geographical domain and the position of each road segment included in the ENEA-SMTS database was coupled with the segments present in the file usually used as PMSS emission input, which contains the road segments with the associated hourly emissions. For each day of the simulations, a database containing details on the road segments and hourly emissions was compiled in text format and then processed to generate the emission input files for PSPRAY (see scheme in Figure A1). The procedure was defined using R packages (base [55], lubridate [56], rgdal [57]) and Fortran compiled libraries coded by ARIANET. The most significant difference between the emission input generated with this procedure and the usual input generated by TREFIC, described in Section 2.4.2, was the possibility to use hourly emission estimates for each road segment based on the traffic counts provided by Octo Telematics, without introducing any additional simplification on the modulation profiles.
Figure A1. Scheme illustrating the steps to create the files used as input to PSPRAY.
Figure A2 shows the percentage differences between Sim 3 and Sim 2 (i.e., (Sim 2 − Sim 3)/Sim 2 × 100), with most of the values between −30% and 20%, in agreement with what was found in Figure 14.
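The percentage difference defined above, (Sim 2 − Sim 3)/Sim 2 × 100, and the share of values falling in a given interval can be sketched as follows. This is a minimal illustration assuming the two simulated concentration series are available as arrays of equal length with non-zero Sim 2 values; the function names `percentage_difference` and `fraction_within` are hypothetical and do not refer to the processing tools used in this work.

```python
import numpy as np


def percentage_difference(sim2, sim3):
    """Percentage difference (Sim 2 - Sim 3) / Sim 2 * 100, element by element."""
    sim2 = np.asarray(sim2, dtype=float)
    sim3 = np.asarray(sim3, dtype=float)
    return (sim2 - sim3) / sim2 * 100.0


def fraction_within(values, low=-30.0, high=20.0):
    """Fraction of values lying in the closed interval [low, high]."""
    values = np.asarray(values, dtype=float)
    return float(np.mean((values >= low) & (values <= high)))


if __name__ == "__main__":
    # Illustrative concentrations only, not results of the simulations.
    diff = percentage_difference([100.0, 50.0, 40.0], [110.0, 45.0, 80.0])
    print(diff, fraction_within(diff))
```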

Appendix B
In this work, to test the acceptance criteria of [59], we used the average Fractional Bias (FB), the Normalized Mean Square Error (NMSE), the Fraction of simulated values within a factor of two of the observed values (FAC2), and the Normalized Absolute Difference (NAD). FAC2 is the fraction of data where $0.5 < C_s / C_o < 2$, where $C_s$ and $C_o$ are, respectively, the modelled and the observed concentrations. The acceptance criteria for dispersion model evaluation were defined for two categories of measurement stations based on their location, rural and urban, and are reported here in Table A1. A model meets the acceptance criteria if these statistical indicators satisfy the values reported in Table A1 for urban and rural types of monitoring stations.
Furthermore, as a metric for the linear regressions, we used the R-squared, which in an x-y linear regression is calculated as the square of the linear correlation coefficient (R) and indicates the fraction of the variability of one variable that is explained by the variability of the other. The value of R is calculated by the cor.test function in R [55], which refers to [62,63].
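For reference, the four indicators are commonly written in the dispersion-model evaluation literature in the form sketched below. This is given under stated assumptions: $C_o$ denotes the observed and $C_s$ the simulated concentration, an overbar denotes the average over all paired values, and the sign convention of FB (observed minus simulated) is assumed rather than taken from this paper.

```latex
\begin{align}
\mathrm{FB}   &= \frac{\overline{C_o} - \overline{C_s}}{0.5\,\left(\overline{C_o} + \overline{C_s}\right)} \\
\mathrm{NMSE} &= \frac{\overline{\left(C_o - C_s\right)^2}}{\overline{C_o}\;\overline{C_s}} \\
\mathrm{NAD}  &= \frac{\overline{\left|C_o - C_s\right|}}{\overline{C_o} + \overline{C_s}} \\
\mathrm{FAC2} &= \text{fraction of data where } 0.5 < \frac{C_s}{C_o} < 2
\end{align}
```

With these forms, a perfect simulation gives FB = 0, NMSE = 0, NAD = 0, and FAC2 = 1.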
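The relation between R and R-squared described above can be illustrated with a short sketch. It uses the Pearson correlation coefficient from NumPy rather than the cor.test function cited in the text, so it is an equivalent illustration under that assumption and not the procedure used in this work; the function name `r_squared` is hypothetical.

```python
import numpy as np


def r_squared(observed, simulated):
    """Square of the Pearson linear correlation coefficient between two series."""
    x = np.asarray(observed, dtype=float)
    y = np.asarray(simulated, dtype=float)
    r = np.corrcoef(x, y)[0, 1]  # off-diagonal element is R between x and y
    return float(r ** 2)


if __name__ == "__main__":
    # Illustrative values only.
    print(r_squared([1.0, 2.0, 3.0], [1.0, 3.0, 2.0]))
```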