Managing Uncertainty in Urban Road Traffic Emissions Associated with Vehicle Fleet Composition: From the Perspective of Spatiotemporal Sampling Coverage

Abstract: With pronounced differences in emission factors among vehicle types and marked spatiotemporal heterogeneity of vehicle fleet composition, extrapolating fleet composition from an insufficient number of sampled hours and road segments introduces significant uncertainty into the calculation of regional daily road traffic emissions. We propose a framework to manage uncertainty in urban road traffic emissions associated with vehicle fleet composition from the perspective of spatiotemporal sampling coverage. First, the respective relationships of the temporal and spatial sampling coverages of fleet composition with the resulting regional daily road traffic emission uncertainties were determined, using the core area of a typical small and medium-sized city in China and the widely used International Vehicle Emissions (IVE) model as an example. Subsequently, function models were developed to explore the determination of the spatiotemporal sampling coverage of fleet composition. The emission uncertainties and function models implied that, under the same uncertainty target, gases with larger emission factor discrepancies between vehicle types, such as NOx, require greater spatiotemporal sampling coverage than gases with smaller discrepancies, such as CO2. Therefore, sampling efforts should be prioritized for gases with larger emission factor discrepancies. Additionally, increasing sampling coverage in one dimension (either spatial or temporal) can reduce the minimum required coverage in the other dimension. To further reduce uncertainty, enhancing both the spatial and temporal sampling coverage of fleet composition is more effective than enhancing one type of coverage alone. The framework and results of this work can reduce the uncertainty of emission calculations caused by insufficient sampling coverage and contribute to more accurate formulation of transport emission reduction policies.


Introduction
As a major source of urban air pollution and greenhouse gases, road traffic emissions are a significant consideration when implementing reduction policies for urban emissions [1][2][3]. Quantification of urban road traffic emissions with low uncertainty is fundamental to effective emission reduction policies [4,5]. Due to the high cost of the large-scale direct monitoring of vehicle exhaust, current research on urban road traffic emissions primarily relies on emission models. The emissions per unit distance of vehicles vary significantly, by several or even hundreds of times, with parameters such as fuel type, emission standards, weight, and displacement [9,10]. Moreover, the spatiotemporal dynamic characteristics of vehicle travel mean that each type of vehicle exhibits varying travel distance proportions across different hour periods and road segments [11]. Thus, the differences in emissions per unit distance and the dynamics of vehicle travel may bring significant uncertainty to the quantification of road traffic emissions [12]. To reduce uncertainty in quantifying urban road traffic emissions, it is ideal to determine the proportion of the distance travelled by each vehicle type on the road (i.e., the vehicle fleet composition) using the highest possible spatiotemporal sampling coverage within the specified spatiotemporal domain, and then to input this composition into emission models for emissions calculations.
Due to the substantial expense of gathering detailed information on the fleet composition on roads, previous studies often simplified the process of obtaining such data, which may bring significant uncertainty to the emission quantification results. Li et al. [13] assumed that all vehicles in the Nanning urban area were light-duty vehicles when quantifying traffic emissions. Additionally, many studies have estimated the fleet composition based on static information from government statistical yearbooks or local vehicle registration databases [14][15][16]. However, the fleet composition can neither be represented by a single vehicle type nor assumed to remain static. Because these assumptions are unrealistic, other studies have started to use sampling methods to determine the fleet composition. For instance, Sun et al. [17] adopted a sampling survey approach, using the fleet composition surveyed on 40 roads as samples to represent the fleet composition of a total of 536 roads, and quantified the NOx emissions for these roads. Meng et al. [18] used video cameras at 14 intersections to collect 8-h data on fleet composition, extrapolated the fleet composition across additional roads in the urban area of Chengdu, and quantified the HC, NOx, and CO emissions in the urban area. Li et al. [19] collected data on the fleet composition for one hour (8:00 am-9:00 am) at 15 road monitoring sites evenly distributed across Macau and estimated the annual emissions of CO, CO2, PM, NOx, and VOC from road traffic in Macau. However, the studies above lacked a process for determining the spatiotemporal coverage of fleet composition sampling, which directly affects the uncertainty of the emission quantification results. To determine the spatiotemporal sampling coverage needed, it is necessary to first understand the relationship between the spatiotemporal sampling coverage of fleet composition and the resulting uncertainty. Subsequently, according to this relationship and the uncertainty management target, reasonable spatial and temporal sampling coverages can be determined. Nevertheless, current research on this relationship is lacking; thus, it is difficult to determine reasonable spatial and temporal sampling coverages of fleet composition.
In response to these research gaps, this study proposed a framework to manage uncertainty in urban road traffic emissions associated with vehicle fleet composition from the perspective of spatiotemporal sampling coverage. Initially, the respective relationships of the temporal and spatial sampling coverages of fleet composition with the resulting regional daily road traffic emission uncertainties were determined. Subsequently, based on these relationships, function models were developed to explore the determination of the spatiotemporal sampling coverage of fleet composition to manage the emissions uncertainty associated with fleet composition.
The analysis in this study used the IVE model, a traditional emission quantification model, and Xuancheng city, a representative small and medium-sized city, as an example. The IVE model has been widely used in road traffic emission quantification research in developing countries [20][21][22]. Small and medium-sized cities, which constitute a significant proportion of Chinese cities and often experience rapid economic and emissions growth, will play an essential role in the construction of green cities in the future [23].

Materials and Methods
The overall research framework of this article is shown in Figure 1. (1) Using road network data, automatic licence plate recognition (ALPR) data, and vehicle registration data, this study acquired the parameters necessary for emissions calculations, such as traffic flows, fleet composition, and driving conditions, for each road segment and hour. (2) Using the IVE model and Monte Carlo simulation, this study calculated the uncertainties associated with regional daily road traffic emissions for different spatial or temporal coverage levels and then analysed the variation patterns in uncertainties. (3) This study calculated the uncertainties of regional daily road traffic emissions associated with different spatiotemporal coverage combinations, and constructed a requirement model for the spatiotemporal sampling coverage of fleet composition for the quantification of regional daily road traffic emissions.

Research Area and Data Sources
In this study, the core urban area of Xuancheng city was selected as the research area for the following reasons. First, Xuancheng is one of the 27 central cities in the Yangtze River Delta, a typical small and medium-sized city in China experiencing rapid development (see Figure S1). Its population in 2018 was 2.64 million. From 2005 to 2018, the per capita gross regional product (GDP) of Xuancheng grew at an annual rate of 10.5%. Cities of this size will play an important role in the construction of green cities in the future [24]. Second, the car ownership per capita in Xuancheng (0.15 vehicles/person) was close to the national average in China (0.166 vehicles/person) [25]. The fleet composition in the core urban area of Xuancheng was similar to that in other small and medium-sized Chinese cities, such as Langfang [26], Yangquan [27], and Foshan [28], with light passenger cars accounting for more than 85% of the total traffic mileage, while the proportions of heavy passenger cars, light-duty trucks, and heavy-duty trucks were relatively low at approximately 2%, 5%, and 2%, respectively. Therefore, the fleet composition in Xuancheng is highly representative of that of other small and medium-sized Chinese cities. Third, the urban core area is often congested and densely populated, thus increasing the health risks associated with traffic pollution. Fourth, the central area of Xuancheng was densely equipped with ALPR detectors, providing data samples with high spatiotemporal sampling coverage for this study. Through the ALPR detectors, traffic flow and licence plate information for vehicles on the corresponding roads can be obtained, thereby supporting accurate emissions and uncertainty quantification.
The road segments covered by ALPR detectors in the study area are shown in Figure 2, and include 113 monitored road segments categorized into expressways, arterial roads, and local roads [28], with each category accounting for approximately one-third of the total road length. The coverage of ALPR detectors on expressways and arterial roads within the area was nearly 100%. The urban structure of the study area exhibits a typical radial pattern [29]. Based on the ALPR data, the hourly traffic volume on each road segment, the driving time of each vehicle in each segment, the average driving speed, etc., can be calculated. An example of the ALPR data is shown in Table 1. By matching the ALPR data with licence plate numbers in the Xuancheng vehicle registration database, we can obtain detailed technical attribute information for the vehicles travelling on each road segment in each hour (further explained in Section 2.2). Consequently, the fleet composition can be determined for each road segment and hour. Because the registration database lacks technical information on unregistered vehicles, and because 90% of the traffic volume in the study area involves registered vehicles, this study primarily focuses on registered vehicles.

Emission Quantification Method
This study used the IVE model to calculate hourly segment-level emissions [30]. The IVE model, developed by the International Sustainable Systems Research Center (ISSRC) and the University of California, Riverside (UCR), is known as an efficient tool for developing traffic emission inventories in developing countries [31] and has been widely used in China [29], India [21], and Iran [22]. The fleet composition of each hour and road segment consists of the proportion of the mileage travelled by each vehicle type to the total mileage [17], as calculated by Equation (1).
The emissions for each hour and road segment are obtained by summing the emissions of each vehicle type, as shown in Equation (2). Regional emissions are calculated by summing over all road segments, and daily emissions by summing the hourly emissions.
where $E_{h,l,j}$ is the emissions of gas j on road segment l during hour h, g/h; $\bar{v}_{LA4}$ is the average speed of the LA4 driving cycle, which is set to 8.7 m/s; $B_{t,j}$ is the base emission factor of gas j for vehicle type t, g/veh/km; $K_{h,t,l,j}$ represents the other correction factors (speed correction, oil correction, etc.) for gas j of vehicle type t, dimensionless; and $\bar{v}_{h,l}$ is the average vehicle driving speed on road segment l during hour h, m/s. Vehicle types in the IVE model are categorized into over 1300 types based on various technical attributes, including description, weight, fuel type, exhaust characteristics, and age (defined by cumulative mileage). In this study, the method for matching each vehicle's base emission factors and correction factors, based on vehicle registration data, follows the approach described by Yu et al. [29]. The comparison and examples of vehicle technical information used in the IVE model and from the Xuancheng registration database are shown in Table 2. This study focused on the uncertainty of NOx and CO2 emissions because of their importance for air quality and climate change [33,34]. Furthermore, among common vehicle pollutants and greenhouse gases (VOC, CO, NOx, SO2, and CO2), the variability in emission factors among vehicle types was the largest for NOx and the smallest for CO2. Therefore, analysing NOx and CO2 helps in understanding the potential ranges and trends of uncertainties for common vehicle pollutants and greenhouse gases as sampling coverage changes.
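As a rough illustration of how Equations (1) and (2) fit together, the following Python sketch computes the fleet composition and the emissions of one gas for a single road segment and hour. It assumes the usual IVE arrangement, in which the base emission factor is scaled by the correction factor and by the ratio of the LA4 cycle speed to the observed segment speed; the function names, dictionary layout, and input numbers are hypothetical and are not taken from this study.

```python
V_LA4 = 8.7  # average speed of the LA4 driving cycle, m/s (value stated in the text)


def fleet_composition(vkt_by_type):
    """Share of total VKT contributed by each vehicle type (Equation (1) as described)."""
    total = sum(vkt_by_type.values())
    return {t: vkt / total for t, vkt in vkt_by_type.items()}


def segment_hour_emission(vkt_by_type, base_ef, correction, v_segment):
    """Emissions (g/h) of one gas on one segment during one hour.

    Assumed form: sum over types of VKT * share * B * K * (V_LA4 / v_segment).
    """
    total_vkt = sum(vkt_by_type.values())
    shares = fleet_composition(vkt_by_type)
    speed_ratio = V_LA4 / v_segment
    return sum(
        total_vkt * shares[t] * base_ef[t] * correction[t] * speed_ratio
        for t in shares
    )


# Hypothetical inputs: two vehicle types on one segment during one hour.
vkt = {"light_gasoline": 900.0, "heavy_diesel": 100.0}    # veh*km travelled in the hour
base_ef = {"light_gasoline": 1.0, "heavy_diesel": 10.0}   # g/veh/km, illustrative only
correction = {"light_gasoline": 1.0, "heavy_diesel": 1.0}  # no correction applied here

shares = fleet_composition(vkt)
emission = segment_hour_emission(vkt, base_ef, correction, v_segment=8.7)
print(shares, emission)
```

In this toy case the heavy diesel type carries 10% of the VKT but contributes more than half of the emissions, which is why errors in its estimated share matter disproportionately.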

Uncertainty Quantification Method
As regional daily total emissions are a key indicator in current emissions management policies, this research mainly focuses on the uncertainty of daily road traffic emissions at the regional level within the study area [35,36]. Uncertainty is a lack of knowledge of the true value of a variable and can be described with a confidence interval characterizing the range of possible values [37]. Current studies commonly use Monte Carlo simulations to calculate emission uncertainties [38][39][40]. In this study, the emission uncertainties associated with differences in spatiotemporal sampling coverages were calculated via Monte Carlo simulation.
To provide general guidance for existing studies, the spatiotemporal sampling coverage was defined based on current research [17][18][19]. Specific examples are shown in Table S1. For temporal sampling, one hour was selected as the smallest sampling unit, and stratified sampling was conducted based on daytime and nighttime. For spatial sampling, given that fleet composition samples are typically obtained from a certain number of road segments or intersections, a single road segment between two adjacent intersections was selected as the smallest sampling unit. Stratified spatial sampling was conducted based on road type (the road types are detailed in Section 2.1). According to these settings, temporal sampling coverage refers to the proportion of sampled hours (e.g., 2 h) to the total number of hours (24 h in this study) representing the full period for emissions quantification. Spatial sampling coverage refers to the proportion of the number of sampled road segments to the total number of road segments in the study area.
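The coverage definitions and the stratified draw described above can be sketched in a few lines of Python. The daytime/nighttime boundary (06:00 to 22:00), the proportional allocation across strata, and the rounding up within each stratum are assumptions made for illustration; the text does not specify them.

```python
import math
import random


def coverage(n_sampled, n_total):
    """Sampling coverage: sampled units divided by all units in the domain."""
    return n_sampled / n_total


def stratified_sample(strata, cov, seed=0):
    """Draw roughly `cov` of the units from every stratum (assumed proportional allocation)."""
    rng = random.Random(seed)
    picked = {}
    for name, units in strata.items():
        k = max(1, math.ceil(cov * len(units)))  # assumption: at least one unit per stratum
        picked[name] = sorted(rng.sample(units, k))
    return picked


# Hypothetical day/night split of the 24 one-hour sampling units.
hours = {
    "daytime": list(range(6, 22)),
    "nighttime": list(range(0, 6)) + [22, 23],
}

sample = stratified_sample(hours, cov=0.25)
n_sampled = sum(len(v) for v in sample.values())
print(sample, coverage(n_sampled, 24))
```

The same function applies to spatial sampling if the strata are lists of road-segment identifiers grouped by road type.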
The Monte Carlo method for calculating the uncertainty of a specified spatiotemporal sampling coverage combination (i.e., sampling with a specific spatial coverage and temporal coverage) involves the following steps. First, T random sampling simulations were performed. In this study, T was set to 1000 [41,42]. The sample size of each sampling simulation was determined based on the spatiotemporal sampling coverage and the sampling population. Second, the fleet composition for the hours and road segments included in the sample was calculated via Equation (1). Third, in each simulation, the fleet composition of the hours and road segments not included in the sample was extrapolated from the samples already collected; the calculation method is shown in Equation (3).
where $P_{t,h^*,l^*}$ is the proportion of VKT by vehicle type t on road segment l* during hour h*, inferred from the sample; l* refers to the road segment to be inferred; and h* refers to the hour period to be inferred. Fourth, the regional daily road traffic emissions were calculated for this simulation via Equation (2). Finally, the emission results from the 1000 simulations for a single spatiotemporal sampling coverage combination were used to compute a 95% confidence interval representing the uncertainty.
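The steps above can be sketched with NumPy on synthetic data. This is a simplified illustration rather than the procedure used in the study: it covers the spatial dimension only, treats traffic volumes as known on every segment, folds all correction factors into a single emission factor per vehicle type, and assumes that unsampled segments receive the mean composition of the sampled ones. All names and numbers are hypothetical.

```python
import numpy as np


def simulate_uncertainty(vkt, ef, n_sampled, n_sims=1000, seed=0):
    """Return the 2.5th and 97.5th percentiles of the relative error in total emissions.

    vkt: array (segments, vehicle types) of distance travelled.
    ef:  array (vehicle types,) of emission factors in g/km.
    """
    rng = np.random.default_rng(seed)
    n_seg = vkt.shape[0]
    total_vkt = vkt.sum(axis=1)
    share = vkt / total_vkt[:, None]          # true fleet composition per segment
    true_total = float((vkt @ ef).sum())
    rel_err = np.empty(n_sims)
    for i in range(n_sims):
        idx = rng.choice(n_seg, size=n_sampled, replace=False)
        mask = np.zeros(n_seg, dtype=bool)
        mask[idx] = True
        est_share = share.copy()
        # Assumed extrapolation rule: unsampled segments take the sample mean.
        est_share[~mask] = share[mask].mean(axis=0)
        est_total = float(((est_share * total_vkt[:, None]) @ ef).sum())
        rel_err[i] = est_total / true_total - 1.0
    lo, hi = np.percentile(rel_err, [2.5, 97.5])
    return float(lo), float(hi)


# Synthetic network: 100 segments, 4 vehicle types with very different emission factors.
rng = np.random.default_rng(42)
n_seg = 100
share_true = rng.dirichlet([40, 2, 3, 1], size=n_seg)
vkt = share_true * rng.uniform(500, 5000, size=(n_seg, 1))
ef = np.array([0.2, 4.0, 1.5, 9.0])

lo5, hi5 = simulate_uncertainty(vkt, ef, n_sampled=5)
lo70, hi70 = simulate_uncertainty(vkt, ef, n_sampled=70)
print(f"5% coverage:  {lo5:+.1%} to {hi5:+.1%}")
print(f"70% coverage: {lo70:+.1%} to {hi70:+.1%}")
```

Using percentiles of the simulated relative errors, rather than a symmetric interval around the mean, allows the upper and lower bounds to differ in magnitude.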

Method for Constructing the Requirement Model for the Spatiotemporal Sampling Coverage of Fleet Composition
The goal of this section was to develop a method that provides guidance for sampling fleet composition data when quantifying daily road traffic emissions in the core urban area. First, it was essential to determine the respective relationships of the temporal and spatial sampling coverages of fleet composition with the resulting uncertainties in emission quantification. Calculating uncertainties for all possible combinations of spatiotemporal coverage using Monte Carlo simulation requires substantial computational effort. Therefore, it is necessary to construct function models that approximate the relationship between spatiotemporal coverage and regional daily road traffic emission uncertainties based on a series of experiments. In each of these experiments, the uncertainties were measured for specified spatiotemporal coverage settings.
To construct the function models effectively, we first determined the patterns of uncertainty variation for different typical spatial sampling coverages (with the temporal sampling coverage held at 100%) and then for different temporal sampling coverages (with the spatial sampling coverage held at 100%). The sample collection scheme was designed based on methods found in current research [17][18][19], as presented in Table S1. Second, to reflect the patterns of variation intuitively, we fitted functions with the emission uncertainties as the dependent variable and the spatial or temporal sampling coverage as the independent variable. Third, for intervals of temporal or spatial sampling coverage in which the change in uncertainties was significant, the number of spatiotemporal sampling coverage combinations was increased to reflect these changes more effectively. Finally, based on the uncertainties associated with different combinations of spatiotemporal sampling coverage, we constructed a model describing the relationship between these combinations and their resultant uncertainties. The selection of an appropriate fitting method for this model depends on the variation patterns identified in the second step.
After the requirement model is constructed, the minimum spatiotemporal sampling coverage requirements can be determined by inputting the targeted uncertainties into the model. The required sample size is obtained by multiplying the minimum spatiotemporal sampling coverage given by the requirement model by the total population. If the result is not an integer, it should be rounded up to the next integer to adhere to the principle of conservatism.
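The conversion from a minimum coverage to a sample size can be illustrated as follows. The helper name is hypothetical, and coverage is passed as a whole-number percentage so that the rounding is done in exact integer arithmetic; the populations are the 113 monitored road segments and 24 hours used in this study.

```python
def required_sample_size(coverage_percent, population):
    """Smallest integer sample size whose coverage is at least `coverage_percent`."""
    # Ceiling division on integers avoids floating-point rounding surprises.
    return -(-coverage_percent * population // 100)


segments_needed = required_sample_size(22, 113)  # 22% of 113 road segments = 24.86
hours_needed = required_sample_size(31, 24)      # 31% of 24 hours = 7.44
print(segments_needed, hours_needed)             # prints: 25 8
```

A coverage that already corresponds to a whole number of units is left unchanged, for example 25% of 24 hours gives 6 hours.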
Given the similarity of the fleet composition in the study area to that of other small and medium-sized cities in China (as detailed in Section 2.1), the constructed model is, to a certain degree, generally applicable to such cities.

Variation Patterns in Regional Daily Road Traffic Emission Uncertainties with Changes in Spatial Sampling Coverage
The uncertainty quantification results (expressed as 95% confidence intervals) for NOx and CO2 emissions on weekdays and weekends under different spatial sampling coverage levels are shown in Figure 3. Considering the similarity in the distribution of uncertainties between weekdays and weekends, the discussion below is based on the corresponding average values. With a temporal sampling coverage of 100%, the uncertainties decreased gradually and monotonically towards zero as the spatial coverage of the samples increased. For NOx, at a spatial sampling coverage of 5%, the emission uncertainties were −36~52%. When the spatial sampling coverage increased to 35% and 70%, the uncertainties decreased to −14~15% and −5~5%, respectively. In contrast, the uncertainties of CO2 were smaller than those of NOx at the same spatial sampling coverage; notably, at a spatial sampling coverage of 5%, they were only −3~4%. This is because NOx emission factors vary more among vehicle types than CO2 emission factors do, and there are distinct differences in fleet composition among road segments. For example, for road Segments 1 and 2 in this study area (both expressways), the vehicle type with the highest NOx emission factor was heavy-duty old-age vehicles using China III diesel (referred to as Type A), and that with the lowest was light-duty new-age vehicles using China IV gasoline (referred to as Type B), with the former's base emission factor being 136 times that of the latter [43]. Vehicle age is defined by cumulative mileage: the cumulative mileage of new-age vehicles is <79 K km, that of young-age vehicles is 79-161 K km, and that of old-age vehicles is >161 K km [43]. Within Segment 1, the VKT proportions of Type A and Type B were 0.13% and 20%, respectively, while in Segment 2, they were 1.8% and 19%, respectively. Therefore, estimating the fleet composition of Segment 1 based on that of Segment 2 could result in an overestimation of the VKT proportion of Type A and an underestimation of the VKT proportion of Type B, thereby significantly overestimating the emissions for Segment 1. In contrast, the highest CO2 emission factor among the vehicle types on road Segments 1 and 2 was only five times the lowest. This means that variations in fleet composition among segments affect the emissions calculation less for gases with small differences in vehicle emission factors than for gases with large differences. Therefore, to achieve the same uncertainty goal, the spatial sampling coverage for gases with large emission factor differences between vehicle types needs to be larger than that for gases with small differences.
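The effect described above can be made concrete with a small calculation in Python. As a simplifying assumption, only Types A and B are considered and all other vehicle types are ignored, so the numbers indicate direction and rough magnitude only. The CO2 line further assumes that the same two types differ by the factor of five quoted in the text, which the text does not state.

```python
# VKT shares of Type A and Type B reported for the two expressway segments.
segment_1 = {"A": 0.0013, "B": 0.20}
segment_2 = {"A": 0.018, "B": 0.19}


def relative_emission(shares, ef_ratio):
    """Emissions per unit VKT in units of Type B's emission factor (A = ef_ratio * B)."""
    return shares["A"] * ef_ratio + shares["B"] * 1.0


# NOx: Type A's base emission factor is 136 times that of Type B.
nox_ratio = relative_emission(segment_2, 136.0) / relative_emission(segment_1, 136.0)

# CO2: assumed factor of 5 between the same two types (illustrative only).
co2_ratio = relative_emission(segment_2, 5.0) / relative_emission(segment_1, 5.0)

print(round(nox_ratio, 2), round(co2_ratio, 2))
```

Under these assumptions, applying the composition of Segment 2 to Segment 1 inflates the NOx contribution of these two types roughly sevenfold, whereas the corresponding CO2 contribution rises by only about a third.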
Additionally, as the spatial sampling coverage increased, the magnitude of change in the corresponding uncertainties gradually decreased. For instance, when the spatial sampling coverage increased from 5% to 20%, the upper bound of NOx uncertainty decreased by 27 percentage points and the lower bound increased by 12 percentage points. However, when the spatial sampling coverage increased from 20% to 50%, the upper bound decreased by only 13 percentage points, and the lower bound increased by only 10 percentage points. This shows that the benefit of reducing uncertainty diminishes as the spatial sampling coverage increases. Meanwhile, the uncertainties at low spatial sampling coverage showed significant asymmetry, with the upper bound greater than the absolute value of the lower bound. This phenomenon was caused by the non-negative nature of emissions and the strong influence of outliers in fleet composition under low spatial sampling coverage. This asymmetry decreased as the spatial sampling coverage increased, and from 35% onwards, the absolute values of the upper and lower bounds were almost the same, differing by less than 0.5 percentage points. Thus, with lower spatial sampling coverage, the degree of overestimation in emission quantification will be greater than that of underestimation. If the tolerance for overestimation of emission results is lower than that for underestimation, the spatial sampling coverage should be further increased.

Variation Patterns in Regional Daily Road Traffic Emission Uncertainties with Changes in Temporal Sampling Coverage
The uncertainty quantification results for NOx and CO2 emissions on weekdays and weekends at different temporal sampling coverage levels are shown in Figure 4. The uncertainty distributions for weekdays and weekends were similar; thus, the discussion below is based on the corresponding average values. With 100% spatial sampling coverage, the uncertainties decreased gradually and monotonically, approaching zero as the temporal sampling coverage increased. For NOx, at a temporal sampling coverage of 10%, the emission uncertainties were −22~25%. When the temporal sampling coverage increased to 25% and 60%, the uncertainties decreased to −13~15% and −5~5%, respectively. In contrast, the uncertainties of CO2 at the same temporal sampling coverage were smaller than those of NOx, at −2~2% for a temporal sampling coverage of 10%. The reason why NOx emission uncertainties significantly exceeded those of CO2 at the same temporal sampling coverage is similar to that given in Section 3.1. The fleet compositions from 08:00-09:00 and 13:00-14:00 are used as examples. During these periods, the vehicle type with the highest NOx emission factor was heavy-duty old-age vehicles using China II diesel (referred to as Type C), and the vehicle type with the lowest NOx emission factor was light-duty new-age vehicles using China IV gasoline (referred to as Type D), with the former's base emission factor being 170 times that of the latter. From 08:00-09:00, the VKT proportions of Type C and Type D were 0.01% and 17%, respectively, whereas from 13:00-14:00, they were 0.05% and 16%, respectively. Consequently, inferring the 08:00-09:00 fleet composition from the 13:00-14:00 fleet composition could lead to overestimation and underestimation of the traffic activity of Type C and Type D vehicles, respectively. This inference would lead to a significant overestimation of emissions from 08:00-09:00 compared to the actual values. In contrast, the highest CO2 emission factor among the vehicle types at 08:00-09:00 and 13:00-14:00 was only five times the lowest.
As the temporal sampling coverage increased, the magnitude of change in the corresponding uncertainties gradually decreased. For instance, as the temporal sampling coverage increased from 10% to 25%, the upper bound of NOx uncertainty decreased by 11 percentage points, and the lower bound increased by 8 percentage points. However, as the temporal sampling coverage increased from 25% to 50%, the reduction in the upper bound was only 7 percentage points, and the increase in the lower bound was only 6 percentage points. This reflects the fact that the benefit of reducing uncertainty diminishes as the temporal sampling coverage increases. Moreover, at low temporal sampling coverage, the uncertainties exhibited noticeable asymmetry. This is due to the non-negative nature of emissions and the significant impact of outliers in fleet composition when the temporal sampling coverage is low. The asymmetry diminished with increasing temporal sampling coverage, and from 50% onwards the upper and lower bounds were almost identical in absolute value, differing by only 0.5 percentage points. Therefore, with low temporal sampling coverage, the degree of overestimation in emission quantification will be greater than that of underestimation. If the tolerance for overestimation of emission results is lower than that for underestimation, the temporal sampling coverage should be further increased.
These findings provide guidance for the construction of a requirement model for the spatiotemporal sampling coverage of fleet composition. Specifically, more uncertainty data points should be used at lower sampling coverages and fewer at higher sampling coverages. This approach ensures that the model accurately represents the range of change and the asymmetry of the upper and lower bounds of the confidence intervals.

Construction of a Requirement Model for the Spatiotemporal Sampling Coverage of Fleet Composition
According to the analysis detailed in Sections 3.1 and 3.2, the difference in emission uncertainties between weekdays and weekends was small for identical temporal or spatial sampling coverages. Therefore, data from weekdays and weekends were used together to construct a requirement model for the spatiotemporal sampling coverage of fleet composition for regional daily road traffic emissions.
Due to the significant nonlinear relationships between uncertainties and temporal and spatial sampling coverages, this study employed multivariate nonlinear regression to construct a spatiotemporal sampling coverage requirement model. Multivariate nonlinear regression can flexibly model complex nonlinear relationships among variables and offers an objective mathematical framework [44]. Owing to the significant variation and asymmetry in emission uncertainties at low spatiotemporal sampling coverages, three coverages below 30% and two above 30% were selected in each dimension: spatial sampling coverages of 2%, 10%, 21%, 50%, and 79%, and temporal sampling coverages of 8%, 16%, 25%, 50%, and 80%. This selection creates 25 different spatiotemporal coverage combinations for uncertainty calculation. The results of these calculations are shown in Table S2.
Based on the uncertainty calculations for 25 spatiotemporal sampling coverage combinations, the requirement model was constructed using polynomial functions in this study.To determine the order of the polynomial model, significance test indicators (F value and p value) and the Bayesian information criterion (BIC) method were used.Although the third-order model displayed the best performance for all indicators, it was nonmonotonic within the value range, which does not comply with the data distribution pattern shown in the Sections 3.1 and 3.2.The significance test of the second-order model yielded an F value of 425, a p value < 0.0001, and a BIC of −344, yielding results with greater significance and a better fit than those of the first-order model.Therefore, a second-order polynomial was adopted as the regression model, as shown in Equation (4).The normalized mean squared error (MSE) and correlation coefficient (R 2 ) are adopted as criteria for evaluating the performance of the model.The MSE measures the average square difference between the experimental results and model predictions, with values closer to 0 indicating higher model precision.The R 2 measures the correlation between the experimental results and model predictions, with R 2 values greater than 0.9 generally indicating high accuracy.The results of the coefficients, R 2 values and MSE values as shown in Table 3.The R 2 values of the second-order regression models were all greater than 0.985, and the MSE values were less than 5 × 10 −4 .To validate the models reliability, an additional ten sets of random spatiotemporal sampling coverage combinations and corresponding emission uncertainties were used to test the model [45], with R 2 > 0.93 and MSE < 0.0001.The aforementioned indicators suggested that the models have strong explanatory power and can accurately fit actual data.The spatiotemporal sampling coverage requirements corresponding to different upper and lower bounds of NOx and CO2 uncertainty 
predicted by the requirement model are shown in Figure 5. Each point on the contour lines reflects a combination of spatiotemporal sampling coverage at the same level of uncertainty. The spatial and temporal sampling coverages interact in such a way that increasing one can reduce the minimum requirement for the other. For instance, to maintain the upper bound of NOx uncertainty within 15%, the recommended spatiotemporal sampling coverage should be no less than 30% for temporal and 22% for spatial aspects. As the temporal sampling coverage increases to 35%, 45%, and 55%, the corresponding minimum spatial sampling coverages should be no less than 42%, 34%, and 31%, respectively. The contour lines were almost parallel to the y-axis in the area where the spatial sampling coverage was low and the temporal sampling coverage was high. This phenomenon indicated that significant changes in temporal sampling coverage do not greatly affect uncertainties in this region. Taking the upper bound of NOx uncertainty as an example, when the spatial sampling coverage was 8% and the temporal sampling coverage was 33%, the upper bound was 50%. When only the temporal sampling coverage was increased to 100%, the upper bound decreased to 47%, a reduction of only 3 percentage points. A similar phenomenon was also observed in areas with low temporal sampling coverage. This pattern indicated that, compared to increasing only one type of coverage, the combined enhancement of both spatial and temporal sampling coverages can more effectively reduce uncertainty. The contour lines were densely distributed in areas of low spatial and temporal sampling coverages (coverage below 30%), indicating that the impact of spatiotemporal sampling coverage on uncertainties was significant in these areas. This phenomenon implied that enhancing sampling coverage in areas with initially low spatiotemporal sampling coverage has a more significant impact on reducing uncertainty than in areas with already high coverage.
Researchers can determine the spatiotemporal sampling coverage corresponding to the upper and lower bounds of acceptable uncertainty for various gases and then select the most stringent spatiotemporal sampling coverage for fleet composition data. For example, to achieve the NOx uncertainty target of ±15%, the temporal and spatial sampling coverages corresponding to the upper and lower uncertainty bounds of +15% and −15%, respectively, were intersected. The results suggested that the recommended minimum temporal and spatial sampling coverages should be no less than 31% and 22%, respectively. The possible combinations of the minimum spatial and temporal sampling coverages included 43% and 36%, 34% and 45%, etc. Similarly, to achieve the CO2 uncertainty target of ±3%, the recommended minimum temporal and spatial sampling coverages should be no less than 11% and 10%, respectively. The possible combinations of the minimum spatial and temporal sampling coverages included 17% and 16%, 15% and 18%, etc. These data showed that, despite a less stringent uncertainty target for NOx (±15%) than for CO2 (±3%), calculating NOx emissions necessitated greater spatiotemporal sampling coverage due to higher variability in NOx emission factors among vehicle types. Based on these findings, it can be generalized that under the same uncertainty target, gases with larger emission factor discrepancies among vehicle types, such as NOx, necessitate greater spatiotemporal sampling coverages than those required for gases with smaller discrepancies, such as CO2. Therefore, sampling efforts should be prioritized for gases with larger emission factor discrepancies.
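The selection procedure described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_bounds` returns hypothetical placeholder surfaces, not the fitted requirement models, and the function names and the 1% search grid are invented for the example. For each gas, the sketch finds the smallest spatial coverage at which both the upper and lower bounds stay within the target for a given temporal coverage, then keeps the most stringent value across gases.

```python
import math

def min_spatial_coverage(bound_fns, target, temporal, grid=None):
    """Smallest spatial coverage (%) at which every bound stays within +/-target.

    bound_fns: callables f(spatial, temporal) -> signed uncertainty bound (%),
               e.g. one for the upper bound and one for the lower bound.
    Returns None if no value on the grid satisfies all bounds.
    """
    grid = grid or range(1, 101)  # assumed 1% search resolution
    for spatial in grid:
        if all(abs(f(spatial, temporal)) <= target for f in bound_fns):
            return spatial
    return None

def toy_bounds(scale):
    """Hypothetical placeholder surfaces, NOT the paper's fitted models.

    They only mimic the qualitative behaviour: uncertainty shrinks as
    either coverage grows.
    """
    upper = lambda s, t: scale / math.sqrt(s * t)
    lower = lambda s, t: -0.8 * scale / math.sqrt(s * t)
    return [upper, lower]

# Illustrative targets taken from the text (+/-15% for NOx, +/-3% for CO2);
# the scale values are arbitrary.
gases = {"NOx": (toy_bounds(400.0), 15.0), "CO2": (toy_bounds(40.0), 3.0)}

temporal = 35
required = {gas: min_spatial_coverage(fns, target, temporal)
            for gas, (fns, target) in gases.items()}
most_stringent = max(required.values())
print(required, most_stringent)
```

Taking the maximum across gases mirrors the idea of selecting the most stringent coverage; with real fitted models, only the callables passed in `bound_fns` would change.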
The framework and conclusions proposed in this study provide insights into the minimum threshold of spatiotemporal sampling coverage of fleet composition for calculating urban road traffic emissions, which can guide fleet composition sampling, such as the development of manual sampling schemes and the installation of urban traffic cameras. Research conducted in urban areas similar to the case described in this work can employ the function models or the framework proposed here to obtain guidance on the spatiotemporal sampling coverage for fleet composition. These practices help avoid both insufficient sampling and sampling schemes that rely excessively on experience, thereby reducing uncertainty in emissions calculations. Moreover, emission data with reduced uncertainty can more accurately depict the severity and reduction potential of regional road traffic pollution and carbon emissions. These data thus support the selection of more precise and appropriate emission reduction measures.
The findings of this work are based on the use of the IVE model in a typical small and medium-sized Chinese city. Direct application of these results to studies in other types of urban areas or those employing different emission models may not yield accurate outcomes, especially in developed countries and large cities. Nevertheless, our generic framework lays a solid foundation for adaptation and application in diverse settings. Future research could apply this approach in various regions and with different models to test its relevance and adaptability, enhancing the robustness and utility of our findings in broader emission quantification efforts.

Conclusions
Given the absence of studies exploring the relationship between different spatiotemporal sampling coverages of vehicle fleet composition and emission uncertainties, it was difficult to determine a reasonable spatiotemporal sampling coverage to accurately quantify regional daily road traffic emissions. In response to this research gap, this study proposed a framework to manage uncertainty in urban road traffic emissions associated with vehicle fleet composition from the perspective of spatiotemporal sampling coverage. This study was conducted in the core urban area of Xuancheng, a typical small and medium-sized city in China, using the widely applied IVE model. Initially, the respective relationships of the temporal and spatial sampling coverages of fleet composition with the resulting regional daily road traffic emission uncertainties were determined. Subsequently, function models were developed to explore the determination of the spatiotemporal sampling coverage of fleet composition.
These results of emission uncertainties and function models implied that gases with larger emission factor discrepancies between vehicle types, such as NOx, required greater spatiotemporal sampling coverage than gases with smaller discrepancies, such as CO2, under the same uncertainty target. Therefore, sampling efforts should be prioritized for gases with larger emission factor discrepancies. Additionally, increasing sampling coverage in one dimension (either spatial or temporal) can reduce the minimum required coverage in the other dimension. To achieve the NOx uncertainty target of ±15%, the recommended minimum temporal and spatial sampling coverages should be no less than 31% and 22%, respectively. The possible combinations of the minimum spatial and temporal sampling coverages included 43% and 36%, 34% and 45%, etc. To achieve the CO2 uncertainty target of ±3%, the recommended minimum temporal and spatial sampling coverages should be no less than 11% and 10%, respectively. The possible combinations of the minimum spatial and temporal sampling coverages included 17% and 16%, 15% and 18%, etc. To further reduce uncertainty, enhancing both spatial and temporal sampling coverage of the fleet composition is more effective than enhancing one type of coverage alone.
This study provided a reference tool for determining the spatiotemporal sampling coverage of fleet composition for cities with similar fleet composition and vehicle scales. The tool aids in avoiding insufficient sampling, thereby reducing uncertainty in regional daily road traffic emission quantification. This reduction in uncertainty contributes to the development of more precise and effective emission reduction policies, ultimately leading to a reduction in greenhouse gas and air pollutant emissions from urban road traffic and fostering sustainable economic and social development.

Figure 2. The location and road network of the study area in this paper.

The ALPR dataset in this study comprises approximately 40 million records obtained from 10 May to 9 June 2018. The types of data collected include the detector location ID, licence plate number, and detection time of each vehicle passing by a specific detector. Based on the aforementioned data, the traffic volume on each road segment every hour, the driving time of each vehicle in each segment, the average driving speed, etc., can be calculated. An example of the ALPR data is shown in Table 1.

Figure 3. Uncertainties of NOx (a) and CO2 (b) emissions on weekdays and weekends at different spatial sampling coverage levels.

Figure 4. Uncertainties of NOx (a) and CO2 (b) emissions on weekdays and weekends at different temporal sampling coverage levels.
where $f(x_1, x_2)$ is the upper or lower bound for the 95% confidence interval of uncertainty; $x_1$ and $x_2$ are the spatial and temporal sampling coverages, respectively; and $a$, $b$, $c$, $d$, $e$, and $f$ are the coefficients of the terms in the polynomial. Since the relationships between $f(x_1, x_2)$ and $x_1$ and $x_2$ are nonlinear, this study performed a logarithm transformation on $x_1$ and $x_2$, thereby enabling the model to capture the relationships between $f(x_1, x_2)$ and $x_1$ and $x_2$ more accurately.
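As an illustration of how such a model could be fitted, the sketch below regresses an uncertainty bound on a second-order polynomial of the log-transformed coverages. It rests on several assumptions that are not confirmed by the text: the ordering of the six terms in `features`, the use of natural logarithms, and ordinary least squares solved through the normal equations. The data are synthetic, generated from arbitrary coefficients purely to check that the fit recovers them.

```python
import math

def features(x1, x2):
    """Second-order terms in the log-transformed coverages (assumed ordering)."""
    u, v = math.log(x1), math.log(x2)
    return [1.0, u, v, u * u, v * v, u * v]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z

def fit(samples):
    """Least-squares coefficients from (x1, x2, y) samples, coverages in % (> 0)."""
    rows = [(features(x1, x2), y) for x1, x2, y in samples]
    n = 6
    xtx = [[sum(r[i] * r[j] for r, _ in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in rows) for i in range(n)]
    return solve(xtx, xty)

def predict(coeffs, x1, x2):
    """Evaluate the fitted surface at spatial coverage x1 and temporal coverage x2."""
    return sum(c * t for c, t in zip(coeffs, features(x1, x2)))

# Synthetic check: arbitrary "true" coefficients, NOT values from Table 3.
true_coeffs = [120.0, -20.0, -15.0, 1.0, 0.5, 1.5]
samples = [(s, t, predict(true_coeffs, s, t))
           for s in range(5, 101, 5) for t in range(5, 101, 5)]
fitted = fit(samples)
print([round(c, 3) for c in fitted])
```

In practice a numerical library would normally replace the hand-written solver; the pure-Python version is used here only to keep the sketch dependency-free.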

Figure 5. Contour plots of spatial and temporal sampling coverages with upper and lower bounds of NOx (a) and CO2 (b) uncertainty.

Record Number | Detector Location ID | Licence Plate Number (Anonymized) | Detection Time
$P_{t,h,l}$ is the proportion of vehicle kilometres travelled (VKT) by vehicle type $t$ on road segment $l$ during hour $h$; $L_{t,h,l}$ is the VKT of vehicle type $t$ on road segment $l$ during hour $h$, m; $L_{h,l}$ is the VKT of all vehicles on road segment $l$ during hour $h$, m; $Q_{t,h,l}$ is the traffic volume of vehicle type $t$ on road segment $l$ during hour $h$, veh/h; $Q_{h,l}$ is the traffic volume of road segment $l$ during hour $h$, veh/h; and $D_l$ is the length of road segment $l$, m.
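A short sketch may help make these quantities concrete. It assumes, as the definitions suggest but do not state outright here, that the VKT of a vehicle type equals its traffic volume multiplied by the segment length, and that the proportion is the ratio of type-specific VKT to total VKT. The function names and the example volumes are hypothetical. The second function, which pools several segments, is an added illustration of why segment length matters once more than one segment is involved; it is not a step taken from the text.

```python
def vkt_proportions(volumes_by_type, segment_length_m):
    """VKT share of each vehicle type on one segment during one hour.

    volumes_by_type: {vehicle_type: traffic volume Q_{t,h,l} in veh/h}
    segment_length_m: segment length D_l in metres
    """
    # Assumed relation: L_{t,h,l} = Q_{t,h,l} * D_l
    vkt = {t: q * segment_length_m for t, q in volumes_by_type.items()}
    total = sum(vkt.values())  # L_{h,l}
    if total == 0:
        return {t: 0.0 for t in vkt}
    return {t: v / total for t, v in vkt.items()}  # P_{t,h,l}

def pooled_proportions(segments):
    """VKT share of each vehicle type pooled over several segments (illustrative).

    segments: iterable of (volumes_by_type, segment_length_m) pairs.
    """
    pooled = {}
    for volumes, length in segments:
        for t, q in volumes.items():
            pooled[t] = pooled.get(t, 0.0) + q * length
    total = sum(pooled.values())
    if total == 0:
        return {t: 0.0 for t in pooled}
    return {t: v / total for t, v in pooled.items()}

# Hypothetical volumes for demonstration only.
single = vkt_proportions({"car": 300, "bus": 100}, 500.0)
pooled = pooled_proportions([({"car": 100}, 1000.0), ({"bus": 100}, 3000.0)])
print(single, pooled)
```

On a single segment the length cancels out, so the VKT share equals the volume share; in the pooled case, longer segments carry more weight.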

Table 2. Comparison and examples of vehicle technical information used in the IVE model and from the Xuancheng registration database.

Table 3. Regression results and evaluation indicators of the second-order models.