1. Introduction
Road freight transport is a key target for pollution reduction, carbon mitigation, and refined environmental governance in the transport sector. Among freight vehicles, heavy-duty trucks make a disproportionate contribution to road transport emissions because of their large payloads, long travel distances, strong interregional mobility, and complex operating conditions. As a result, they have become a priority for identification and targeted intervention in transport-related environmental management. More importantly, heavy-duty truck emissions are not evenly distributed across time and space, but often show marked heterogeneity and spatial concentration. Previous studies have reported clear differences between local and transit trucks in activity patterns and road use [1,2], with transit trucks in some cases emerging as a dominant regional emission source. High-emission clusters are also frequently observed along port collection and distribution corridors, major expressway corridors, and road segments within key functional areas [3,4]. Accordingly, as transport governance shifts from aggregate control toward process-based regulation and targeted intervention, management needs have moved beyond annual emission inventories to the dynamic identification of emission sources, hotspot detection, and differentiated governance at the vehicle, road segment, and regional levels.
The scope of this review is deliberately limited to trucks, especially heavy-duty freight vehicles. This focus is justified because trucks differ from passenger cars and buses in payload variation, long-distance and interregional movement, diesel powertrain dominance, after-treatment operation, corridor concentration, and sensitivity to road grade and congestion. These characteristics make truck emissions a distinct problem for dynamic estimation and environmental governance. Passenger cars, buses, and mixed fleets are therefore not excluded as unimportant, but are treated as adjacent research objects that require separate comparison in future work.
Truck emission estimation has long relied on emission factor methods and conventional models such as COPERT, MOVES, and IVE [5,6,7,8]. These approaches remain important for regional inventory development, aggregate accounting, and policy scenario analysis, given their relatively mature frameworks, clearer parameter structures, and practical applicability. However, most of them take average velocity, typical operating conditions, or predefined emission factors as core inputs and are therefore better suited to macro-level estimation. Their limitations become more apparent when the focus shifts to dynamic regulation at the vehicle or road-segment level under high spatiotemporal resolution. For heavy-duty trucks, low-velocity congestion, frequent stop-and-go operation, road grade variation, payload changes, and fluctuations in after-treatment performance may all substantially affect emission formation. These complex responses are difficult to capture through average activity levels and static factors alone. Thus, conventional models remain valuable, but under refined environmental governance, they are no longer sufficient on their own for high-resolution representation of real-world operating processes and governance support.
In recent years, the growing availability of multi-source data, including GPS trajectories, on-board diagnostics (OBD), on-board monitoring (OBM), portable emissions measurement system (PEMS) measurements, traffic flow monitoring, and road network attributes, has provided a new technical basis for reconstructing truck emission processes and supporting dynamic estimation. Studies using real-world trajectories and vehicle activity data have begun to reveal the spatiotemporal patterns of heavy-duty truck emissions from the perspectives of local-transit differences, port-corridor concentration, and multi-scale spatial heterogeneity [9,10]. At the same time, research drawing on OBD/OBM data, remote monitoring, and interpretable machine learning is advancing high-emission vehicle identification, transient emission prediction, and fuel consumption estimation toward more continuous and scalable approaches [11,12,13]. Even so, most existing studies still focus on a single data source, a single model, or a specific local setting. Although progress has been made in observing or predicting emissions, a clearer integrative framework is still lacking for how multi-source data can be translated into consistent model inputs, how key mechanism-related variables should be incorporated, and how estimation results can better support specific governance tasks. In other words, the central challenge is no longer simply a lack of data or methods, but how to build a reliable linkage among multi-source sensing, mechanism-based representation, and governance-oriented application.
Against this background, this paper does not simply provide a parallel summary of existing studies. Instead, it examines dynamic truck emission estimation from the perspective of practical governance needs, with a focus on how multi-source data can support credible estimation, how key features can represent emission formation mechanisms, how physical constraints can be incorporated into truck emission models, and how estimation results can be translated into specific governance tasks. The review is organized around four aspects: multi-source data support, key feature extraction, physics-constrained truck emission modeling, and major governance-oriented application scenarios. It also discusses current challenges related to data consistency, model generalizability, mechanism representation, and practical application. On this basis, this paper seeks to clarify the research logic through which truck emission studies are moving from emission accounting toward governance support, and to provide a more systematic reference for high-emission vehicle regulation, hotspot unit identification, policy comparison, and transport optimization.
2. Literature Search and Analysis Methods
To systematically review research progress in dynamic truck emission estimation, this paper retrieved and screened the relevant domestic and international literature. The main databases included Web of Science Core Collection and ScienceDirect. In addition, backward reference checking and supplementary manual screening were conducted to identify highly relevant studies, methodological references, and official model documents that were not fully captured by the database search.
The database search was conducted on 11 May 2026. The search period was set to 2014–2026, while earlier foundational studies were retained when they were necessary for explaining conventional emission models and their technical evolution. The search strategy was designed around three groups of terms: vehicle type, emission object, and dynamic estimation method.
For the Web of Science Core Collection, the topic search combined the following three term groups: (vehicle type) truck* OR “heavy-duty vehicle*” OR “heavy duty vehicle*” OR “freight vehicle*” OR “heavy-duty diesel vehicle*”; (emission object) emission* OR CO2 OR NOX OR PM OR “fuel consumption” OR “energy consumption”; and (dynamic estimation method) “dynamic estimation” OR “emission inventory” OR “remote sensing” OR OBD OR OBM OR PEMS OR GPS OR “machine learning” OR “deep learning” OR “data fusion”. For ScienceDirect, because the platform is less suitable for long Boolean strings, a simplified search string was used: truck emissions, dynamic estimation, GPS, OBD, PEMS, and machine learning.
After applying year, document-type, and language filters, 488 records were obtained from Web of Science Core Collection, and 16 records were obtained from ScienceDirect. Duplicate records and records outside the scope of truck emission estimation, dynamic monitoring, modeling, or governance applications were removed during title/abstract and full-text screening. The final cited set contained 72 core and additional references. Because this article is an integrative review rather than a meta-analysis, the screening procedure was used to ensure transparency and thematic coverage rather than to calculate pooled effect sizes. The detailed literature search and screening process is summarized in Table 1.
To clarify the contribution of the present review, we further compared its focus with previous review streams. Existing reviews often emphasize emission inventory methods, conventional emission models, machine-learning prediction, or road-traffic CO2 modeling separately. In contrast, this review organizes the literature through a governance-oriented chain linking multi-source data, feature construction, physics-constrained modeling, and decision-support applications for heavy-duty trucks. The positioning of this review relative to previous review streams is summarized in Table 2.
In the screening process, this review primarily included three categories of studies: research on emission estimation for trucks or heavy-duty vehicles; studies involving dynamic inventories, real-world road monitoring, remote-sensing-based identification, or multi-source data fusion; and representative work on emission modeling, key feature extraction, and environmental management applications. Studies with limited relevance to the topic, focusing only on passenger cars, lacking methodological detail, or offering limited support for governance-related applications were not treated as core references. Based on the content of the retrieved literature, existing research can be broadly grouped into six themes: dynamic emission inventories and spatiotemporal distribution analysis based on GPS trajectories, traffic flow, and road network attributes; real-world emission identification using OBD/OBM data, PEMS measurements, remote sensing, and onboard telematics; applications and suitability assessment of conventional models such as COPERT, MOVES, IVE, and modal models; emission and fuel consumption prediction based on machine learning and deep learning; mechanism-oriented analyses incorporating factors such as payload, road grade, specific power, and engine operating conditions; and governance-oriented studies that use estimation results for high-emission vehicle identification, hotspot unit detection, policy comparison, and transport optimization.
Table 3 summarizes these categories and their analytical focus.
To avoid ambiguity, the main abbreviations used in this review are defined here. OBD denotes on-board diagnostics, which provide diagnostic and operating information from vehicle electronic systems. OBM denotes on-board monitoring, referring to continuous on-board or remote monitoring of emission-related and engine-operation signals. PEMS denotes a portable emissions measurement system for measuring emissions under real-world driving conditions. OD denotes origin-destination information. VSP denotes vehicle-specific power, which is commonly used to represent instantaneous power demand normalized by vehicle mass. EF denotes emission factor. COPERT refers to the European road transport emission inventory model; MOVES refers to the Motor Vehicle Emission Simulator; and IVE refers to the International Vehicle Emissions model. RF denotes random forest, XGBoost denotes extreme gradient boosting, LightGBM denotes light gradient boosting machine, and LSTM denotes long short-term memory neural network.
The final literature set was therefore used not only to summarize individual studies but also to support the four objectives of this review: identifying the data foundations of dynamic truck emission estimation, summarizing mechanism-related feature extraction, comparing emission modeling pathways, and clarifying governance-oriented applications.
Table 3 shows that existing research is not centered on a single method, but has gradually developed into an interconnected framework spanning activity reconstruction, monitoring-based identification, model development, and governance-oriented application. Different categories of studies place different emphasis on research objects, data conditions, analytical scales, and application goals. This suggests that the following review should move beyond simple classification and adopt a more integrated analytical framework.
Based on the above screening principles and thematic distribution, this paper does not simply summarize the literature by topic. Instead, the following analysis is organized around four closely related questions: how multi-source data can be translated into unified inputs for dynamic estimation; how key features can move from surface-level state variables to explanatory variables linked to emission formation mechanisms; how physical constraints can be incorporated into truck emission modeling to improve result credibility; and how estimation results can better support specific governance tasks, including high-emission vehicle regulation, hotspot unit identification, and comparison of emission control strategies. Accordingly, the remainder of the paper is structured around four aspects: multi-source data support, key feature extraction, physics-constrained truck emission modeling, and major governance-oriented application scenarios, forming a progressive analytical chain from data to features, models, and governance. On this basis, Figure 1 presents the overall analytical framework of the review.
As shown in Figure 1, dynamic truck emission estimation should not be viewed as a single-model problem, but rather as a progressive research process spanning data acquisition, feature construction, model development, and governance-oriented application. The following sections are therefore organized around these four core questions to provide a more problem-oriented, systematic, and coherent review framework. As a review article, this manuscript does not present experimental results in a separate Results section. Instead, Section 2 describes the literature search and analysis methods; Section 3, Section 4 and Section 5 present the synthesized review findings on data foundations, feature extraction, modeling pathways, and governance applications; and Section 6 discusses remaining challenges and future research directions.
3. Multi-Source Data Foundations
The foundation of dynamic truck emission estimation depends not only on the choice of emission model, but also on whether multi-source heterogeneous data can be used to reconstruct vehicle operations and the conditions under which emissions are generated with reasonable fidelity. Unlike conventional approaches based on average activity levels and static emission factors, governance-oriented dynamic estimation requires the integration of multiple types of information, including vehicle status, activity patterns, traffic operation, and road environment, in order to capture emission differences across time, space, and operating conditions. Existing studies suggest that the continued accumulation of data, such as GPS trajectories, OBD/OBM data, PEMS measurements, traffic flow monitoring, and road network attributes, has created a practical basis for extending truck emission research from aggregate estimation to dynamic estimation at higher spatiotemporal resolution.
The value of multi-source data lies not in simply increasing the amount of information, but in transforming raw observations from different sources, scales, and semantic meanings into unified inputs for emission estimation and management analysis through preprocessing, spatiotemporal alignment, data integration, and feature construction. On this basis, dynamic truck emission estimation can be understood as an integrated analytical framework that progresses from multi-source data foundations to data processing and feature construction, dynamic estimation models, and governance-oriented applications. Its overall logic is shown in Figure 2.
3.1. Data Sources
Dynamic truck emission estimation relies on the coordinated support of multi-source heterogeneous data. Existing studies generally draw on three main categories of data: vehicle status monitoring data, activity trajectory and traffic operation data, and environmental and vehicle attribute data. These data types play different roles in the estimation chain. The first is used to characterize real-world operating conditions and emission responses, the second to reconstruct truck activity patterns and their spatiotemporal distribution, and the third to explain and adjust emission differences under varying road environments, vehicle technologies, and operating contexts. This data foundation has enabled truck emission research to move beyond static accounting based on average activity levels toward dynamic estimation grounded in real-world operating processes.
Vehicle status monitoring data mainly include OBD/OBM data, PEMS measurements, and remote emission monitoring. These data can record vehicle velocity, engine operating conditions, fuel consumption, and emission responses at a relatively high frequency, providing an important basis for identifying transient emission characteristics and calibrating estimation models [14,15,16,17,18,19,20,21,22,23]. Among them, PEMS is better suited to obtaining high-accuracy real-world emission data, whereas OBD/OBM data, with broader coverage, are more suitable for long-term monitoring and high-emission vehicle screening. Activity trajectory and traffic operation data mainly include GPS trajectories, toll gantry records, OD information, traffic surveys, and traffic flow monitoring. Their main value lies not in directly measuring emissions, but in reconstructing truck activity processes across roads and time periods [24,25,26,27,28], thereby supporting dynamic inventory development, hotspot identification, and route analysis.
Environmental and vehicle attribute data mainly include road class, road grade, velocity limits, weather conditions, vehicle type, fuel type, emission standard, payload, and engine parameters. Although these data do not usually provide direct emission observations, they play an important role in explaining why similar vehicles may exhibit different emission levels under different conditions. They are therefore often used as correction inputs in emission estimation and as a basis for stratified analysis. Overall, the data system for dynamic truck emission estimation is not a simple accumulation of information, but an integrated structure composed of multiple information layers with different functions. To clarify the roles of these data sources, Table 4 summarizes the main data types and their functions in dynamic truck emission estimation.
Atmospheric and weather conditions also deserve explicit attention in dynamic estimation. Rain, fog, low temperature, high temperature, humidity, wind, and wet road surfaces may affect emissions indirectly by changing vehicle velocity profiles, acceleration behavior, rolling resistance, aerodynamic load, congestion patterns, and driver responses. For diesel trucks equipped with selective catalytic reduction systems, ambient temperature can also influence after-treatment effectiveness and NOX control, while adverse weather may increase uncertainty in GPS positioning and traffic-state recognition [29]. Therefore, weather variables should be treated not only as auxiliary descriptors, but also as potential correction factors or stratification variables when comparing emission levels across periods and regions.
3.2. Data Fusion
Multi-source data fusion is not a simple aggregation of heterogeneous data, but the construction of an observation framework with temporal consistency, spatial correspondence, and semantic alignment around the same emission process. In dynamic truck emission estimation, vehicle status monitoring data are used to characterize instantaneous operating conditions and emission responses; activity trajectory and traffic operation data are used to reconstruct vehicle movement within the road network; and environmental and vehicle attribute data provide contextual and technical information. Only when these data are integrated within a unified analytical framework can emission estimation move beyond static judgment based on partial states toward a dynamic representation of real-world operating processes. In this sense, the value of multi-source fusion lies not in adding more data sources, but in establishing a mutually verifiable chain of process information for the same vehicle, road segment, and time period.
Methodologically, multi-source data fusion usually involves three basic steps: data preprocessing, spatiotemporal alignment, and fusion-based representation. Data preprocessing addresses anomalies, missing values, noise, and temporal drift in order to improve the usability and comparability of different data sources. Spatiotemporal alignment focuses on consistent matching across road location, operating stage, and time sequence, and typically includes map matching, road segment linking, stop identification, and attribute association. Fusion-based representation then integrates trajectories, road networks, traffic states, monitoring signals, and vehicle attributes into a unified input for dynamic inventory development, high-emitter identification, hotspot analysis, and policy evaluation. Overall, the field has evolved from single-source correction to multi-source joint identification, and from fixed-scale aggregation to multi-resolution representation.
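The spatiotemporal alignment step can be illustrated with a minimal sketch of timestamp matching between two sources. The record layout (dictionaries with a `t` key in seconds), the field names `v_mps` and `fuel_gps`, and the 0.5 s tolerance are assumptions made for illustration and are not taken from any reviewed study.

```python
from bisect import bisect_left

def align_nearest(gps, obd, tolerance_s=0.5):
    """Attach to each GPS record the OBD record closest in time.

    gps, obd: lists of dicts with a 't' key (seconds), both sorted by 't'.
    GPS records with no OBD sample within tolerance_s are dropped, not filled.
    """
    obd_times = [r["t"] for r in obd]
    fused = []
    for g in gps:
        i = bisect_left(obd_times, g["t"])
        # Candidates: the OBD samples just before and just after the GPS timestamp.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(obd)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(obd_times[k] - g["t"]))
        if abs(obd_times[j] - g["t"]) <= tolerance_s:
            extra = {k: v for k, v in obd[j].items() if k != "t"}
            fused.append({**g, **extra})
    return fused

# Hypothetical records: GPS speed (m/s) and OBD fuel rate (g/s).
gps = [{"t": 0.0, "v_mps": 10.0}, {"t": 1.0, "v_mps": 11.0}, {"t": 2.0, "v_mps": 12.0}]
obd = [{"t": 0.1, "fuel_gps": 2.0}, {"t": 1.2, "fuel_gps": 2.5}, {"t": 3.4, "fuel_gps": 3.0}]
fused = align_nearest(gps, obd)
```

In this sketch the GPS record at t = 2.0 s is discarded because the nearest OBD sample lies 0.8 s away. Dropping unmatched records rather than interpolating is one possible design choice; it keeps the fused series honest about coverage at the cost of gaps.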
In this process, denoising refers to reducing random noise, drift, spikes, and discontinuities in the signals used for estimation. The most disturbed signals are usually GPS position and elevation, velocity and acceleration derived from trajectories, OBD/OBM sensor streams, NOX sensor outputs, fuel-consumption signals, and PEMS measurements. Common treatments include range checks, outlier removal, moving-window smoothing, low-pass filtering, Kalman filtering, Savitzky–Golay smoothing, and frequency-domain inspection when periodic or high-frequency noise is suspected. The choice of denoising method should be reported because excessive smoothing may suppress transient acceleration and stop-and-go patterns that are important for emission estimation.
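Two of the treatments listed above, a range check followed by moving-window smoothing, can be sketched as follows. The 130 km/h ceiling, the three-sample window, and the use of a median rather than a mean are illustrative assumptions, not recommended settings.

```python
from statistics import median

def clean_speed(v_kmh, v_max=130.0, window=3):
    """Range check followed by a centred moving-median filter."""
    # Range check: treat physically implausible samples as missing.
    checked = [v if 0.0 <= v <= v_max else None for v in v_kmh]
    half = window // 2
    smoothed = []
    for i in range(len(checked)):
        lo = max(0, i - half)
        neighbours = [x for x in checked[lo:i + half + 1] if x is not None]
        smoothed.append(median(neighbours) if neighbours else None)
    return smoothed

# Hypothetical 1 Hz speed trace with one spike (250) and one sharp dip (20).
raw = [50.0, 51.0, 250.0, 52.0, 53.0, 20.0, 54.0, 55.0]
smoothed = clean_speed(raw)
```

The spike at 250 km/h is removed as intended, but the single-sample dip to 20 km/h is also replaced by 53 km/h. If that dip were a genuine hard-braking event, the filter would have erased it, which is the over-smoothing risk noted above.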
It should be noted that multi-source fusion does not necessarily improve estimation quality. Different data sources often differ in sampling frequency, spatial accuracy, semantic definition, and observation scope. Without effective alignment and correction, fusion may instead introduce new pathways of error propagation. For example, trajectory drift may affect road segment assignment, temporal mismatch may weaken the correspondence between operating conditions and emission responses, and scale-conversion bias may distort information when linking microscopic observations to regional analysis. These issues are especially pronounced in port access corridors, congested road sections, graded networks, and complex intersections. This suggests that the key issue in multi-source fusion is no longer simply whether data can be integrated, but whether a stable, interpretable, and comparable fusion framework can be established to support subsequent model development and environmental management applications.
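Temporal mismatch between a driving signal and an emission response can be examined with a simple lag search. The sketch below uses a covariance-maximizing shift, which is a generic signal-processing technique chosen here for illustration and is not a method attributed to the reviewed studies; the function names and the two toy series are hypothetical, and both series are assumed to have equal length and sampling rate.

```python
def covariance(x, y):
    """Population covariance of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def estimate_lag(driver, response, max_lag=5):
    """Return the delay (in samples) of `response` behind `driver`
    that maximizes their covariance over the overlapping window."""
    def score(lag):
        n = len(driver) - lag
        return covariance(driver[:n], response[lag:lag + n])
    return max(range(max_lag + 1), key=score)

# Toy example: the response repeats the driver pattern two samples later.
driver = [0, 0, 1, 3, 6, 3, 1, 0, 0, 0, 0, 0]
response = [0, 0, 0, 0, 1, 3, 6, 3, 1, 0, 0, 0]
lag = estimate_lag(driver, response)
```

For these toy series the estimated lag is two samples. Real sensor delays may vary with exhaust flow and operating state, so a single constant shift should be read as a first-order correction only.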
GPS sampling frequency is another practical issue for dynamic emission estimation. A frequency of 1 Hz is commonly used as a basic resolution for synchronizing GPS, OBD, and PEMS data and for constructing second-by-second velocity, acceleration, and VSP features [30]. This resolution is usually acceptable for route-level activity reconstruction and many second-by-second emission analyses. However, when the research objective is to capture rapid acceleration, harsh braking, stop-and-go operation, grade-related power changes, or short congestion episodes, 5–10 Hz data can better preserve transient dynamics. In contrast, lower-frequency GPS records are more suitable for macro-level activity allocation than for fine-grained instantaneous emission estimation. Regardless of frequency, synchronization and denoising are necessary before deriving acceleration or VSP.
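The effect of sampling frequency on derived acceleration can be shown with a small numerical sketch. The speed trace below is invented for illustration, and the forward-difference scheme is only one of several ways to derive acceleration.

```python
def acceleration(v_mps, dt_s):
    """Forward-difference acceleration (m/s^2); returns len(v_mps) - 1 values."""
    return [(v_mps[i + 1] - v_mps[i]) / dt_s for i in range(len(v_mps) - 1)]

# Hypothetical stop-and-go episode sampled at 1 Hz (m/s).
v_1hz = [0.0, 0.0, 3.0, 5.0, 5.5, 5.5, 2.0, 0.0, 0.0, 0.0, 0.0]
# The same episode as seen by a logger that keeps every fifth sample (0.2 Hz).
v_02hz = v_1hz[::5]

peak_1hz = max(abs(a) for a in acceleration(v_1hz, 1.0))
peak_02hz = max(abs(a) for a in acceleration(v_02hz, 5.0))
```

At 1 Hz the peak absolute acceleration is 3.5 m/s², whereas the 0.2 Hz series yields about 1.1 m/s² for the same episode. Any power-based feature computed from the coarser series would therefore understate the transient demand.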
3.3. Feature Extraction
Key feature extraction is the core step linking multi-source observations to emission estimation models. Its purpose is not simply to include more variables, but to identify effective information that can represent the emission formation process from complex observations. In the existing literature, these features can generally be grouped into three categories. The first includes kinematic features, such as velocity, acceleration, idling time, and velocity fluctuation, which are mainly used to describe vehicle operating states. The second includes dynamic features, such as payload, road grade, engine operating conditions, exhaust temperature, specific power, and power demand, which are used to capture load variation and its emission response. The third includes spatiotemporal contextual features, such as road type, congestion level, functional zone attributes, and key corridors, which link emission results to specific road environments and governance scenarios. Thus, the value of a feature system lies not in the number of variables it contains, but in whether it can achieve an effective balance among observability, mechanism relevance, and scenario suitability.
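As a minimal sketch of the first feature category, the kinematic descriptors named above can be computed from a speed trace as follows. The 0.5 m/s idling threshold, the feature names, and the toy trip are assumptions made for illustration.

```python
from statistics import mean, pstdev

def kinematic_features(v_mps, dt_s=1.0, idle_threshold=0.5):
    """Summarize a speed trace (m/s) into basic kinematic features."""
    acc = [(v_mps[i + 1] - v_mps[i]) / dt_s for i in range(len(v_mps) - 1)]
    return {
        "mean_speed": mean(v_mps),
        "speed_std": pstdev(v_mps),  # velocity fluctuation
        "idle_share": sum(v < idle_threshold for v in v_mps) / len(v_mps),
        "max_accel": max(acc),
        "max_decel": min(acc),
    }

# Hypothetical short trip sampled at 1 Hz.
trip = [0.0, 0.0, 2.0, 4.0, 6.0, 6.0, 3.0, 0.0]
features = kinematic_features(trip)
```

These descriptors capture the operating state only. Dynamic features such as payload or road grade need additional inputs and cannot be recovered from speed alone.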
From the perspective of research development, the feature system for truck emission estimation is shifting from surface-level state variables toward mechanism-oriented explanatory variables. Basic features such as velocity alone are often insufficient to explain emission fluctuations under heavy loads, uphill driving, congestion, and frequent stop-and-go conditions [31,32,33,34,35,36,37,38,39]. As a result, increasing attention has been given to variables such as payload, road grade, engine parameters, exhaust temperature, and specific power. At the same time, the focus of research has moved beyond the effect of individual variables to the analysis of their interactions, such as the combined influence of velocity and payload, the linkage between road grade and throttle opening, and the response relationship between engine load and after-treatment status. This suggests that emission formation is inherently shaped by multiple interacting factors, and that no single variable is usually sufficient to explain emission differences under complex operating conditions.
In addition, contextual variables such as road class, functional zone attributes, port access corridors, and key freight corridors are receiving increasing attention. Although these factors do not directly determine the mechanism of emission formation, they can substantially improve the spatial interpretation of emission results and strengthen their relevance for governance. In this way, model outputs can move beyond prediction alone and become more useful for identifying and locating management targets. Accordingly, the purpose of key feature extraction is no longer limited to improving model accuracy, but increasingly extends to mechanism identification, abnormal vehicle detection, and governance unit identification. To further summarize the main steps and challenges involved in multi-source data fusion and key feature extraction for dynamic truck emission estimation, Table 5 presents an overview of the relevant content.
As shown in Table 5, multi-source data fusion and key feature extraction provide an input foundation that is closer to real-world operating processes, but also introduce new sources of error and challenges for interpretation. This implies that subsequent model development is not only a matter of predictive accuracy, but also closely tied to the stability of input information and its relevance to emission formation mechanisms.
4. Physics-Constrained Emission Modeling
In this review, truck emissions are understood in a broad sense, covering greenhouse gases such as CO2, air pollutants such as NOX and particulate matter, and closely related operating indicators such as fuel consumption and energy use. Accordingly, physics-constrained emission modeling is discussed not only for carbon emission estimation but also as a general modeling logic for improving the credibility of dynamic truck emission estimation.
Building on multi-source data support and mechanism-related feature extraction, physics-constrained emission modeling refers to modeling approaches in which emission prediction is not only learned from empirical data, but is also restricted by basic physical relationships among vehicle motion, tractive power demand, fuel or energy consumption, after-treatment behavior, and emission generation. Unlike conventional statistical correction factors, which usually adjust model outputs or emission factors after fitting, physics constraints can be introduced before, during, or after model training through input construction, model structure, loss functions, parameter bounds, or post-estimation feasibility checks. Unlike general hybrid models that simply combine mechanistic variables with data-driven predictors, physics-constrained models explicitly restrict the feasible solution space so that estimated emissions remain consistent with vehicle dynamics, energy balance, operating limits, and pollutant formation mechanisms.
Existing studies suggest that conventional statistical models and data-driven models have, respectively, improved engineering applicability and high-dimensional predictive capacity. However, both still rely, to varying degrees, on empirical parameters or sample-dependent correlations, and their representation of key constraints such as payload, road grade, power demand, energy consumption, after-treatment state, and operating limits remains relatively limited. Against this background, the value of physics-constrained modeling lies not in replacing data-driven learning with purely physical models but in incorporating fundamental mechanistic relationships into model development and result interpretation, thereby improving the credibility, stability, and interpretability of estimation results under complex operating conditions. To avoid treating this process as a simple linear shift in methodology, Figure 3 summarizes the logical relationship among unconstrained or weakly constrained modeling, the embedding of physical constraints, and credible estimation.
A simple way to illustrate the technical meaning of physical constraints is to start from the relationship between vehicle motion and tractive power demand. For a truck at time $t$, the tractive power demand can be expressed conceptually as follows:

$$P_{\mathrm{tr}}(t) = \left[ m\,a(t) + m g \sin\theta(t) + C_{r}\, m g \cos\theta(t) + \tfrac{1}{2}\,\rho\, C_{d} A\, v(t)^{2} \right] v(t)$$

where $P_{\mathrm{tr}}$ denotes tractive power demand, $m$ is vehicle mass including payload, $a$ is acceleration, $g$ is gravitational acceleration, $\theta$ is the road grade angle, $C_{r}$ is the rolling resistance coefficient, $\rho$ is air density, $C_{d}$ is the aerodynamic drag coefficient, $A$ is the frontal area, and $v$ is vehicle velocity. Vehicle-specific power can then be represented as follows:

$$\mathrm{VSP}(t) = \frac{P_{\mathrm{tr}}(t)}{m}$$

These relationships show why payload, acceleration, road grade, and velocity should not be treated only as empirical predictors, but also as mechanism-related variables that constrain physically plausible emission responses. In hybrid AI–physics modeling, such constraints can be further introduced through a composite objective function:

$$\mathcal{L} = \mathcal{L}_{\mathrm{data}} + \lambda_{1}\,\mathcal{L}_{\mathrm{phys}} + \lambda_{2}\,\mathcal{L}_{\mathrm{bound}}$$

where $\mathcal{L}_{\mathrm{data}}$ measures the prediction error against observed emissions or fuel/energy consumption, $\mathcal{L}_{\mathrm{phys}}$ penalizes violations of energy–emission consistency or power–demand relationships, $\mathcal{L}_{\mathrm{bound}}$ penalizes physically implausible outputs such as negative emissions, unrealistic fuel-use responses, or emission changes that contradict comparable power-demand conditions, and $\lambda_{1}$ and $\lambda_{2}$ are non-negative weights. These formulations are not intended to define a single universal model, but to show how physical relationships can be embedded into feature construction, model training, and output validation.
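The power-demand relationship described above can be turned into a short numerical sketch: tractive force is taken as the sum of inertial, grade, rolling, and aerodynamic terms, and power is force times velocity. All default parameter values below (rolling resistance 0.007, air density 1.2 kg/m³, drag coefficient 0.6, frontal area 8 m²) and the two vehicle masses are illustrative assumptions rather than values from the reviewed literature.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def tractive_power_w(m_kg, v_mps, a_mps2, grade_rad,
                     c_r=0.007, rho=1.2, c_d=0.6, area_m2=8.0):
    """Conceptual tractive power demand in watts (assumed parameter defaults)."""
    force_n = (
        m_kg * a_mps2                                  # inertial term
        + m_kg * G * math.sin(grade_rad)               # grade term
        + c_r * m_kg * G * math.cos(grade_rad)         # rolling resistance
        + 0.5 * rho * c_d * area_m2 * v_mps ** 2       # aerodynamic drag
    )
    return force_n * v_mps

def vsp_kw_per_t(m_kg, v_mps, a_mps2, grade_rad, **params):
    """Vehicle-specific power; W/kg is numerically equal to kW/t."""
    return tractive_power_w(m_kg, v_mps, a_mps2, grade_rad, **params) / m_kg

# Same speed (20 m/s) and 2 % uphill grade, two hypothetical gross masses.
grade = math.atan(0.02)
p_empty = tractive_power_w(15_000.0, 20.0, 0.0, grade)
p_loaded = tractive_power_w(40_000.0, 20.0, 0.0, grade)
```

Under these assumptions the loaded truck needs more than twice the tractive power of the empty one at an identical speed, which illustrates why velocity alone is a weak predictor for heavy-duty trucks.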
The integration between machine learning and physical constraints can be implemented through several pathways. The first is physics-informed feature construction, in which variables such as VSP, tractive power demand, payload-adjusted power demand, road grade, and engine operating state are used as mechanism-related inputs rather than purely statistical covariates. The second is hybrid mechanistic-data-driven modeling, in which a conventional physical or semi-empirical model provides baseline estimates, while machine learning is used to learn residual errors, nonlinear corrections, or context-specific deviations. The third is physics-constrained learning, in which physical consistency terms, monotonicity requirements, parameter bounds, or feasibility constraints are incorporated into the loss function or optimization process. In this sense, physics-informed neural networks and related constrained learning frameworks are not only used to improve prediction accuracy, but also to restrict model behavior under operating conditions that are poorly represented in the training data. For truck emission prediction, this is particularly relevant when models are transferred across payload levels, road grades, congestion states, vehicle technologies, or weather conditions.
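The third pathway, physics-constrained learning, can be sketched as a loss with a data term, a physical consistency term, and a feasibility term. In this sketch the consistency term is a monotonicity penalty (predicted fuel rate should not fall as power demand rises) and the feasibility term penalizes negative outputs; both choices, the function name, and the weights are assumptions for illustration, and real after-treatment behavior may not be monotone for every pollutant.

```python
def composite_loss(pred, obs, power_kw, lam_phys=1.0, lam_bound=1.0):
    """Data error + monotonicity penalty + non-negativity penalty (all mean squared)."""
    n = len(pred)
    # Data term: mean squared error against observations.
    data = sum((p - o) ** 2 for p, o in zip(pred, obs)) / n
    # Physics term: after sorting by power demand, penalize any drop in prediction.
    order = sorted(range(n), key=lambda i: power_kw[i])
    drops = [max(0.0, pred[order[k]] - pred[order[k + 1]]) for k in range(n - 1)]
    phys = sum(d ** 2 for d in drops) / max(n - 1, 1)
    # Bound term: penalize negative predictions.
    bound = sum(min(0.0, p) ** 2 for p in pred) / n
    return data + lam_phys * phys + lam_bound * bound

# Hypothetical fuel-rate observations (g/s) at three power levels (kW).
obs = [1.0, 2.0, 3.0]
power = [10.0, 20.0, 30.0]
loss_ok = composite_loss([1.0, 2.0, 3.0], obs, power)
loss_bad = composite_loss([1.0, 2.0, -1.0], obs, power)
loss_bad_data_only = composite_loss([1.0, 2.0, -1.0], obs, power, 0.0, 0.0)
```

A prediction that is both negative and non-monotone is penalized beyond its plain squared error, so an optimizer using this loss is pushed away from physically implausible regions even where training data are sparse.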
As shown in Figure 3, physics-constrained modeling should not be viewed as a simple replacement for conventional statistical or data-driven models. Rather, it builds on the recognition of the limitations of unconstrained or weakly constrained approaches and supports a shift in truck emission estimation from high-dimensional fitting toward more credible estimation. This shift can be understood through three levels of constraint integration. At the parameter level, variables such as payload, road grade, VSP, and engine operating state are introduced to correct the averaged representation used by conventional emission factors or average-velocity models, thereby reducing bias under heterogeneous operating conditions. At the structure level, relationships among tractive power demand, fuel or energy consumption, after-treatment state, and emission response are incorporated into the model architecture or feature-interaction structure, which improves process consistency and interpretability. At the output level, physical bounds and feasibility checks are used to avoid implausible results, such as negative emissions, unrealistic fuel-use responses, or emission changes that contradict comparable power-demand conditions. Therefore, the role of physical constraints is not only to improve local prediction accuracy, but also to restrict the solution space to physically credible regions and enhance model reliability under cross-scenario applications.
The practical value of physics-constrained modeling is particularly evident in real-world operating scenarios where emission responses are strongly affected by physical operating conditions. Under steep grades, road slope changes tractive power demand and may lead to emission changes that cannot be adequately represented by average velocity alone. Under heavy payloads, the same velocity profile may correspond to substantially different engine loads, fuel or energy use, and emission responses. Under congestion and stop-and-go operation, frequent acceleration, idling, and low-velocity driving increase the importance of transient power demand and after-treatment behavior. Under adverse or extreme weather conditions, changes in rolling resistance, aerodynamic load, traffic state, and after-treatment temperature further increase estimation uncertainty. In these scenarios, purely data-driven models may achieve good local fitting performance but can become unstable when applied to operating combinations that are poorly represented in the training data. By incorporating variables such as payload, road grade, VSP, tractive power demand, fuel or energy use, and physical output bounds, physics-constrained or hybrid models can provide more interpretable and transferable estimates for dynamic emission management, especially in cross-scenario policy comparison and high-emission vehicle screening.
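The effect of grade and payload on power demand at a fixed velocity can be illustrated with a standard road-load calculation. The following Python sketch assumes a generic longitudinal force balance (inertia, grade, rolling resistance, aerodynamic drag); the function name and the default coefficients `c_rr`, `cd_a`, and `rho` are illustrative placeholders, not calibrated truck parameters, and the form may differ in detail from the formulation used elsewhere in this review.

```python
import math


def tractive_power_kw(v, a, grade, mass_kg,
                      c_rr=0.007, cd_a=6.0, rho=1.2, g=9.81):
    """Tractive power (kW) from a generic road-load force balance.

    v       : velocity (m/s)
    a       : acceleration (m/s^2)
    grade   : road grade as rise over run (0.04 = 4 %)
    mass_kg : gross vehicle mass including payload (kg)
    c_rr, cd_a, rho are assumed illustrative values.
    """
    theta = math.atan(grade)
    force = (mass_kg * a                              # inertia
             + mass_kg * g * math.sin(theta)          # grade resistance
             + c_rr * mass_kg * g * math.cos(theta)   # rolling resistance
             + 0.5 * rho * cd_a * v ** 2)             # aerodynamic drag
    return force * v / 1000.0


# Same velocity profile point, different payload and grade.
for mass in (15000.0, 40000.0):
    for grade in (0.0, 0.04):
        print(mass, grade, round(tractive_power_kw(20.0, 0.0, grade, mass), 1))
```

Under these assumptions, the four combinations share the same velocity but differ widely in power demand, which is why a model driven by average velocity alone cannot separate them.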
4.1. Modeling Pathways
In truck emission estimation, conventional statistical models and data-driven models represent the two main baseline modeling approaches. Existing reviews suggest that road vehicle emission modeling has evolved from traditional methods, such as emission factor approaches, average-velocity models, and modal models, toward data-driven and hybrid approaches [40,41,42]. Although these two approaches differ in data sources, parameter organization, and application tasks, both rely mainly on empirical parameters or sample-based correlations to estimate emissions. Their representation of dynamic mechanisms, operating limits, and physical consistency remains relatively limited. They can therefore be regarded as unconstrained or weakly constrained modeling pathways.
Conventional statistical models have long provided the basic framework for truck emissions accounting. Studies have shown that COPERT, MOVES, and IVE, together with average-velocity functions and emission factor schemes, still play an important role in regional inventory development, aggregate accounting, and policy scenario analysis [43,44,45,46]. Their main strengths lie in consistent accounting rules and practical applicability. However, they essentially provide an averaged representation of a complex emission formation process. For heavy-duty trucks, low-velocity congestion, frequent stop-and-go operation, road grade variation, payload changes, and fluctuations in after-treatment performance may all substantially alter emission generation. Under such conditions, average velocity, typical operating modes, or static emission factors alone are often insufficient to capture transient responses and local variation [47,48]. The limitation of conventional statistical models, therefore, is not a lack of practical value, but that they are better suited to macro-level accounting than to fine-grained interpretation under complex operating conditions.
With the continued growth of onboard monitoring, trajectory, and traffic operation data, data-driven models have become an important extension of truck emission estimation. Recent studies have applied deep learning and machine learning methods to predict truck CO2 emissions, NOX emissions, and fuel consumption, showing strong capacity to capture complex nonlinear relationships [49,50,51]. Methods such as LightGBM, gradient boosting, and temporal deep learning based on PEMS and OBD data have further promoted a shift from empirical statistics to high-dimensional dynamic prediction [52,53,54]. More recent work has also extended the analytical scope from the vehicle level to network- and regional-level representation [55,56,57]. However, the main strength of these models lies in statistical fitting rather than mechanism representation. Without explicit constraints on key physical processes and operating limits, their performance may weaken under extreme conditions, data-sparse settings, cross-regional transfer, or scenario changes, leading to reduced generalizability, unstable interpretation, and results that may deviate from basic physical relationships [58]. This suggests that the key issue in truck emission modeling is no longer only predictive accuracy, but also whether physical consistency and interpretive stability can be maintained under complex and unseen conditions. This, in turn, motivates the development of physics-constrained modeling.
Table 6 shows that model quality should not be assessed by predictive error alone. For governance-oriented emission estimation, interpretability, data demand, spatial–temporal resolution, transferability, uncertainty, and physical consistency are equally important. Therefore, model selection should depend on whether the task is inventory accounting, transient prediction, high-emitter identification, hotspot detection, or policy comparison.
4.2. Physical Constraints
Physics-constrained truck emission modeling does not simply mean adding a few mechanism-related variables to a model. Rather, it involves embedding the basic relationships among vehicle dynamics, energy use, and emission generation into the modeling process, so that emission responses are not determined solely by empirical parameters or sample-based correlations. The distortion often observed in unconstrained or weakly constrained models under complex operating conditions stems not only from limited inputs but also from the lack of necessary constraints on payload, road grade, traction demand, and operating limits. Existing studies combining conventional emission models with machine learning suggest that mechanism-based structures can provide a more stable basis for estimation under complex conditions [59].
Physical constraints first appear as corrections to averaged representations. For heavy-duty trucks, emissions at the same velocity may differ substantially with changes in payload, grade, and driving state, which cannot be captured well by a single average parameter. More importantly, constraints can also be embedded directly into the model structure, so that both training and inference are shaped by data patterns and mechanistic relationships. Related studies suggest that when a model can jointly account for emissions, energy use, and control-related behavior, its results are often more stable and more consistent with real-world operating logic [60]. For clarity, Table 7 summarizes the main levels at which physical constraints can be incorporated into truck emission models and their respective roles.
Compared with parameter correction or structural constraints, output-level consistency more directly determines whether a model is useful in practice. The main challenge in truck emission modeling is often not whether an in-sample error is sufficiently small, but whether the model remains credible beyond the training sample. Under conditions such as heavy loads, long grades, congestion, low temperatures, or fluctuations in after-treatment performance, the model should at least preserve basic physical plausibility. When running resistance increases, emissions and energy use should not show responses that clearly contradict underlying mechanisms; when control conditions change, model outputs should also remain within realistic behavioral limits of the vehicle. Research on dynamic emission characteristics and control strategies suggests that emission variation is not merely a statistical pattern, but is jointly shaped by operating state, control behavior, and system boundaries [61]. In this sense, the value of physics-constrained modeling lies not in adding theoretical decoration but in preventing emission estimation from becoming numerical fitting without credible bounds under complex conditions.
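An output-level plausibility check of the kind described above can be sketched as a simple post-processing step in Python. The sketch assumes a quantity such as CO2 or fuel use that is expected not to decrease as power demand increases; this monotonicity assumption would not hold in the same way for pollutants such as NOX, whose response also depends on after-treatment temperature. The function name and the tolerance parameter are hypothetical.

```python
def plausibility_flags(records, tol=0.0):
    """Flag predictions that violate two basic output-level checks.

    records : list of (power_kw, predicted_value) tuples
    Returns (negative, non_monotone), each a list of record indices.
      negative     : predictions below zero
      non_monotone : predictions lower than the prediction at the next
                     lower power demand by more than `tol`
    """
    negative = [i for i, (_, value) in enumerate(records) if value < 0.0]
    order = sorted(range(len(records)), key=lambda i: records[i][0])
    non_monotone = [j for i, j in zip(order, order[1:])
                    if records[j][1] < records[i][1] - tol]
    return negative, non_monotone


predictions = [(10.0, 1.0), (20.0, 0.8), (30.0, 2.0), (40.0, -0.1)]
negative, non_monotone = plausibility_flags(predictions)
print(negative, non_monotone)
```

Flagged records are candidates for review rather than automatic rejection, since an apparent violation may also reflect differences in conditions that the comparison does not control for.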
4.3. Hybrid Modeling Frameworks
Compared with conventional statistical models and purely data-driven models, the main advantage of physics-constrained models lies less in improving local fit and more in providing a stronger basis for interpretation and more credible boundaries. For trucks, whose emissions are jointly affected by payload, road grade, velocity variation, engine operation, and after-treatment status, emission changes are not simply data fluctuations, but the result of interacting power demand and control processes. By introducing physical constraints, the model no longer aims only at the best in-sample fit, but also maintains consistency between emission responses and the underlying operating mechanisms. For this reason, such models are generally less prone to unstable interpretation or unrealistic responses under complex conditions, previously unobserved operating scenarios, or changing contexts.
This is particularly important for governance-oriented emission estimation. Environmental management requires not only predictions that are numerically close to observations, but also results that can explain why emissions change, under what conditions they change, and whether such changes have stable practical meaning. If model outputs lack basic mechanistic support, they may perform well on local samples but still be difficult to use for high-emission vehicle identification, hotspot unit detection, or comparison of emission control strategies. By contrast, physics-constrained models retain basic restrictions linking operating load, energy use, and emission generation, and are therefore generally more suitable for interpretation, comparison, and extrapolation.
Physics-constrained models, however, should not be viewed as a substitute for all other approaches. Their current limitations are mainly threefold. First, key mechanism-related variables are often costly to obtain, and some constraint relationships are difficult to observe fully in real-world settings. Second, parameter organization and model construction are more complex, and recalibration is often still needed across regions and fleet conditions. Third, a unified and mature modeling framework has not yet emerged, which limits comparability and transferability across methods. Their value, therefore, lies less in replacing other models than in offering clearer potential for tasks that place higher demands on result credibility, particularly under complex operating conditions, cross-scenario application, and governance-oriented analysis.
5. Governance Applications
The value of dynamic truck emission estimation lies less in serving environmental management in a broad sense than in supporting several key tasks in truck emission governance. Compared with static accounting based on aggregate statistics, its significance is not simply that it provides finer estimates, but that it transforms emission results into governance targets that can be identified, located, compared, and acted upon. Existing studies suggest that its more direct applications are concentrated in three areas closely related to truck emission governance: high-emission vehicle identification, hotspot unit detection, and comparison of governance strategies.
For clarity, a high-emission vehicle in this review refers to a truck with emission levels that exceed a regulatory threshold, an inspection threshold, or a data-driven fleet benchmark for a specific pollutant and operating condition. The criterion may therefore be based on absolute emission limits, remote-sensing thresholds, OBD/OBM-based abnormality indicators, or high-percentile values within a comparable fleet. Hotspot identification refers to the detection of road segments, corridors, time windows, logistics nodes, or regional units where emission intensity is persistently or significantly higher than the surrounding background. These definitions are important because dynamic estimation should not only predict emissions, but also identify actionable governance targets.
From the perspective of decision support, current research has gradually developed three levels of capability. The first is identification-oriented support, which relies on real-world monitoring and abnormality screening to detect high-emission vehicles. The second is location-oriented support, which uses dynamic inventories and spatiotemporal analysis to identify key road segments, critical time periods, and priority regional units. The third is comparison-oriented support, which evaluates the environmental effects and applicable conditions of different governance pathways through policy comparison and transport optimization. These three capabilities correspond to three core governance questions: which vehicles should be prioritized for regulation, where problems are concentrated, and which governance pathway is more effective. Together, they form the main application framework through which dynamic truck emission estimation can support governance decisions.
Figure 4 summarizes this application chain from estimation results to governance support.
As shown in Figure 4, the main applications of dynamic truck emission estimation extend from high-emission vehicle identification, dynamic inventories, and hotspot unit detection to the comparison of emission control policies and transport optimization. The following sections discuss these applications from three levels of support: identification-oriented, location-oriented, and comparison-oriented.
5.1. High-Emitter Identification
High-emission vehicle identification is one of the most direct regulatory applications of dynamic truck emission estimation and a clear example of its identification-oriented support function. Unlike conventional roadside inspection, which relies mainly on fixed-point sampling and manual checks, non-intrusive regulation emphasizes the continuous collection of emission and operating data under real-world driving conditions and uses these data to identify abnormal vehicles that contribute disproportionately to total emissions within large fleets. For heavy-duty trucks, given their strong interregional mobility, complex operating conditions, and the sensitivity of after-treatment performance to load and road conditions, the main challenge is no longer whether emissions can be monitored, but whether abnormal emissions can be reliably distinguished from normal variation under complex conditions.
The threshold used to define a high-emission vehicle should be interpreted in relation to the measurement method and governance purpose. For regulatory compliance, the threshold is usually linked to emission standards or inspection criteria for pollutants such as NOX, particulate matter, CO, hydrocarbons, or CO2. For fleet screening and non-intrusive regulation, relative thresholds such as abnormal deviations from comparable vehicles or upper-tail percentiles may be more practical, especially when operating conditions vary strongly across routes. This means that high-emission status should be treated as pollutant-specific, method-specific, and context-dependent rather than as a single universal label.
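A relative, fleet-based threshold of the kind mentioned above can be sketched in Python as an upper-tail percentile computed within groups of comparable vehicles. The grouping key, the choice of the 95th percentile, the NOX unit, and the function names are illustrative assumptions; an operational scheme would also need to control for operating conditions and measurement method.

```python
from collections import defaultdict


def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def flag_high_emitters(records, q=95.0):
    """Flag vehicles above the q-th percentile of their comparison group.

    records : list of (vehicle_id, group, nox_g_per_kwh) tuples, where
              `group` is a hypothetical key such as emission standard
              combined with weight class.
    """
    by_group = defaultdict(list)
    for _, group, value in records:
        by_group[group].append(value)
    cutoffs = {g: percentile(v, q) for g, v in by_group.items()}
    return [vid for vid, group, value in records if value > cutoffs[group]]


fleet = [("truck%d" % i, "groupA", float(i)) for i in range(1, 11)]
print(flag_high_emitters(fleet, q=90.0))
```

Because the cutoff is computed per group, a vehicle is compared only with vehicles that share its group label, which reflects the point that high-emission status is context-dependent rather than a single universal label.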
Existing research suggests three main pathways for non-intrusive identification of high-emission vehicles. The first relies on continuous monitoring based on OBD/OBM data or remote onboard systems, using NOX, fuel consumption, and related operating signals to detect vehicles that persistently deviate from normal levels. The second uses roadside remote sensing or remote emission detection for rapid screening. Its main advantage is broad coverage and flexible deployment, allowing potentially high-emission vehicles to be identified quickly from large traffic streams. The third combines multiple data sources, such as OBD/OBM data, remote sensing, PEMS, and roadside monitoring, to improve identification reliability through joint screening and result verification [62,63].
The governance value of this application lies not only in expanding regulatory coverage but also in improving the allocation efficiency of limited enforcement resources. Emissions from heavy-duty truck fleets are rarely distributed evenly across all vehicles; instead, a small number of high emitters often account for a disproportionate share of total NOX or particulate emissions. In this sense, high-emission vehicle identification reflects the direct regulatory value of dynamic estimation more clearly than average emission assessment. Even so, this approach remains affected by monitoring error, sensitivity to operating conditions, threshold setting, and regional differences in regulatory standards. Further work is therefore needed to improve consistency across monitoring methods and strengthen the reliability of dynamic estimation for non-intrusive regulation.
5.2. Hotspot and Inventory Development
Dynamic inventory development and hotspot identification are central to the location-oriented support function of dynamic truck emission estimation. Their value lies not simply in improving spatiotemporal resolution but in translating governance targets into more operational spatial units, such as road corridors, port access routes, urban fringe logistics nodes, and functional zones. Compared with static allocation methods based on fleet size and average mileage, dynamic inventories place greater emphasis on reconstructing truck activity levels and emission distributions across different time periods, road segments, and regional units using real-world trajectories, road networks, and vehicle attributes. In this way, emission governance can move beyond administrative statistical units toward the scale of actual transport operations.
Operationally, a hotspot may be identified by absolute emission density, relative concentration compared with surrounding areas, persistence across time windows, or the co-occurrence of high truck activity and high emission intensity. For truck governance, hotspots are often more useful when they are linked to manageable units such as expressway gateways, port drayage corridors, logistics parks, urban freight entrances, or uphill road segments. This unit-based interpretation helps avoid treating hotspot detection as a purely cartographic result.
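Two of the criteria above, relative concentration against the background and persistence across time windows, can be combined in a short Python sketch. It assumes that emission density has already been aggregated per road segment and time window, and it uses the network-wide median in each window as the background. The function name, the ratio of 2, and the persistence share of 0.5 are illustrative assumptions, not recommended thresholds.

```python
from statistics import median


def find_hotspots(density, ratio=2.0, min_share=0.5):
    """Return segments that persistently exceed the network background.

    density   : {segment_id: [emission density per time window]}, with
                the same number of windows for every segment
    ratio     : multiple of the window median that counts as elevated
    min_share : minimum share of windows in which a segment must be
                elevated to be reported as a hotspot
    """
    n_windows = len(next(iter(density.values())))
    background = [median(series[t] for series in density.values())
                  for t in range(n_windows)]
    hotspots = []
    for segment, series in density.items():
        hits = sum(1 for t, value in enumerate(series)
                   if value > ratio * background[t])
        if hits / n_windows >= min_share:
            hotspots.append(segment)
    return hotspots


density = {
    "s1": [1.0, 1.0, 1.0, 1.0],
    "s2": [1.0, 2.0, 1.0, 1.0],
    "s3": [5.0, 6.0, 5.0, 1.0],   # elevated in three of four windows
    "s4": [1.0, 1.0, 1.0, 1.0],
    "s5": [3.0, 1.0, 1.0, 1.0],   # elevated in one window only
}
print(find_hotspots(density))
```

The persistence condition separates a segment that is repeatedly elevated from one that spikes once, which is closer to the idea of a manageable governance unit than a single-window map of high values.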
Existing studies suggest that high-resolution emission inventories based on GPS trajectories, road network attributes, and vehicle information can clearly identify differences in the temporal distribution and spatial concentration of local and transit trucks, while also revealing emission hotspots along major expressway corridors, port access routes, and key functional zones [64,65]. These findings indicate that truck emissions are not evenly distributed, but tend to cluster along specific road segments, time periods, and freight corridors. Accordingly, the focus of governance is not uniform control across the entire region, but the identification of priority corridors, critical periods, and key spatial units for more targeted and differentiated intervention.
More broadly, the value of dynamic inventories also lies in their ability to provide spatial evidence for corridor control, zonal governance, and interregional coordination. On the one hand, they help identify key emission units such as port access routes, expressway gateways, and logistics hubs, thereby supporting time-specific restrictions, route adjustment, and coordinated regional control. On the other hand, they can capture changes in emission patterns before and after policy implementation, which is useful for a more targeted evaluation of governance effects. Even so, this line of research still faces several challenges, including incomplete trajectory data, map-matching error, unstable activity reconstruction, and inconsistent parameter systems across regions. Future work needs to improve comparability across regions, data sources, and policy scenarios while maintaining inventory accuracy.
5.3. Policy Comparison and Optimization
The comparison of emission control policies and transport optimization represents an important extension of dynamic truck emission estimation from the description of results to decision support, and a concentrated expression of its comparison-oriented support function. Unlike static accounting, which mainly provides aggregate judgments, dynamic estimation can compare the emission reduction effects of different control measures and transport organization strategies under conditions that are closer to real-world operation. This provides a more scenario-specific basis for time-based restrictions, low-emission zone design, fleet renewal, alternative fuel adoption, and transport reorganization. The key issue is not simply whether a measure is effective in general, but how its relative effect and applicability vary across specific operating conditions.
Existing research has mainly proceeded along two lines. One focuses on the comparison of emission control policies at the regional and road-network levels, including low-emission zones, zero-emission zones, restrictions on key road segments, and coordinated control in priority areas. Related studies suggest that dynamic inventories and model-based estimates can be used to compare the emission reduction effects of different measures and to identify the time windows and spatial units to which policy impacts are most sensitive [66,67]. This indicates that policy effectiveness depends less on the policy label itself than on how well it matches road conditions, temporal patterns, and vehicle operating behavior. The other line of research concerns the optimization of transport organization and technical pathways, including alternative fuel adoption, fleet upgrading, vehicle-velocity management, eco-driving, and route optimization. Existing studies further suggest that the emission reduction benefits of electrification, long-haul transport reorganization, and regional control measures are often highly sensitive to operating conditions, regional energy structure, road grade, and transport organization, meaning that no uniformly optimal solution exists independent of context [68,69,70].
Evidence on low-emission zones (LEZs) generally suggests that they can reduce traffic-related pollutants such as NO2/NOX and particulate matter, but the magnitude of improvement differs across cities, vehicle-restriction rules, enforcement intensity, fleet composition, baseline pollution, and spillover effects [71]. Therefore, dynamic emission estimation can contribute by comparing pre- and post-policy changes at consistent spatial and temporal units rather than by assuming that all LEZ designs have the same effect. This is also relevant for objective interregional comparison: regions should be compared using normalized indicators such as emission intensity per vehicle-kilometer, per tonne-kilometer, per corridor, or per functional zone, together with harmonized vehicle classes, weather strata, road attributes, and inventory boundaries.
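The role of normalized indicators in pre- and post-policy comparison can be illustrated with a small Python sketch. The figures are invented for illustration only and do not represent any real zone or study; the function names and dictionary keys are likewise hypothetical.

```python
def intensity_indicators(emission_kg, vehicle_km, tonne_km):
    """Emission intensity per vehicle-kilometer and per tonne-kilometer."""
    return {
        "kg_per_vkm": emission_kg / vehicle_km,
        "kg_per_tkm": emission_kg / tonne_km,
    }


def relative_change(before, after):
    """Relative change of each indicator between two periods."""
    return {key: (after[key] - before[key]) / before[key] for key in before}


# Invented example: total emissions fall by 25 %, but activity falls too.
before = intensity_indicators(emission_kg=1200.0, vehicle_km=1000.0,
                              tonne_km=15000.0)
after = intensity_indicators(emission_kg=900.0, vehicle_km=900.0,
                             tonne_km=12000.0)
print(relative_change(before, after))
```

In this constructed case the absolute reduction overstates the improvement in intensity, because part of the decline comes from reduced activity rather than cleaner operation. This is the kind of distinction that comparison at consistent, normalized units is meant to expose.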
The increasing deployment of electric trucks further changes the boundary of dynamic emission estimation. Electric trucks have no tailpipe exhaust during operation, but they do not eliminate all emission-related concerns. Non-exhaust particles from tire and road wear, brake wear, and resuspension may remain relevant, and greenhouse gas accounting depends on the electricity-generation mix. If electricity is produced mainly from coal or other high-carbon sources, part of the emission burden is shifted from the road segment to the power-generation stage. Consequently, future truck emission estimation should distinguish tailpipe, non-exhaust, well-to-wheel, and life-cycle boundaries and should incorporate regional grid emission factors when comparing diesel, hybrid, and electric freight systems [72].
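The dependence of the diesel versus electric comparison on the accounting boundary and the grid emission factor can be made explicit with a short Python sketch. All numerical values below (fuel and energy consumption, upstream fuel factor, charging efficiency, grid factors) are illustrative placeholders chosen for the example, not data from the reviewed studies, and the function names are hypothetical.

```python
def diesel_wtw_kg_per_km(litres_per_100km,
                         tailpipe_kg_per_l=2.68, upstream_kg_per_l=0.6):
    """Well-to-wheel CO2 per km for a diesel truck.

    Tailpipe and upstream factors are assumed illustrative values.
    """
    litres_per_km = litres_per_100km / 100.0
    return litres_per_km * (tailpipe_kg_per_l + upstream_kg_per_l)


def electric_wtw_kg_per_km(kwh_per_km, grid_kg_per_kwh, charging_eff=0.9):
    """Well-to-wheel CO2 per km for an electric truck.

    Tailpipe emissions are zero; the burden depends on the regional
    grid emission factor and on charging losses (assumed 10 %).
    """
    return kwh_per_km / charging_eff * grid_kg_per_kwh


diesel = diesel_wtw_kg_per_km(35.0)
for grid in (0.2, 0.5, 0.8):   # hypothetical regional grid factors
    print(grid, round(electric_wtw_kg_per_km(1.3, grid), 3), round(diesel, 3))
```

Under these placeholder assumptions the electric result scales linearly with the grid factor, so the ranking of the two options is a property of the region and boundary chosen rather than of the vehicle technology alone. Non-exhaust particles and life-cycle stages are outside this sketch.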
The value of this application lies in enabling dynamic estimation to move beyond describing current conditions and to explain why different control measures produce different effects across regions, time periods, and transport settings. For this reason, the comparison of emission control policies and transport optimization should not be viewed as a secondary use of estimation results, but as an important pathway through which dynamic estimation contributes to governance decisions. Even so, this line of work is still affected by differences in accounting boundaries, parameter localization, scenario design, and multi-objective constraints, which limit the comparability of results across regions. Future research needs to improve the consistency of scenario settings and the realism of operating constraints in order to strengthen the practical value of dynamic estimation for emission control policy design and transport optimization.
To present more clearly the main support functions of dynamic truck emission estimation across different governance tasks, Table 8 summarizes its key application scenarios, main methods, and decision-support roles.
As shown in Table 8, the application of dynamic truck emission estimation is no longer limited to presenting emission results, but is increasingly extending to governance-oriented tasks such as high-emission vehicle identification, priority unit detection, and comparison of emission control strategies. This suggests that the focus of research in this field is gradually shifting from emission accounting itself toward decision support for specific governance tasks.
6. Conclusions and Future Directions
6.1. Conclusions
This paper reviews recent progress in dynamic truck emission estimation from four perspectives: multi-source data support, key feature extraction, physics-constrained truck emission modeling, and key governance-oriented application scenarios. Existing studies suggest that truck emission estimation is shifting from macro-level accounting based on average activity levels and static emission factors toward dynamic estimation grounded in real-world operating processes. The continued accumulation of multi-source data provides an important basis for characterizing emission processes, while the feature system has expanded from general state variables to mechanism-related explanatory variables. On this basis, truck emission modeling is also moving beyond unconstrained or weakly constrained approaches toward frameworks that place greater emphasis on physical consistency and result credibility.
At the same time, the value of dynamic truck emission estimation is extending from emission accounting to governance support. Current studies have shown potential in high-emission vehicle identification, dynamic inventory development, and priority unit detection, as well as the comparison of emission control strategies and transport optimization. Even so, challenges remain across the full chain from data acquisition to result output and practical application, including limited data consistency, constrained model generalizability, and unstable translation into governance tasks. This suggests that the field is still moving from methodological exploration toward more standardized application.
6.2. Future Directions
Future research may be advanced in four main directions. First, greater attention should be given to standardized representation, quality control, and fusion consistency of multi-source data in order to improve the comparability and usability of inputs across different scenarios. Second, closer integration between data-driven methods and physical constraints is needed to enhance model interpretability, boundary plausibility, and stability of results under complex operating conditions, previously unobserved operating scenarios, and cross-regional applications. Here, previously unobserved operating scenarios refer to road, weather, payload, traffic, vehicle–technology, or policy combinations that are not represented in the model calibration or training data. Third, stronger links should be established between dynamic estimation results and specific governance tasks so that they can more effectively support high-emission vehicle regulation, priority unit identification, policy comparison, and transport optimization. Fourth, future studies should explicitly report uncertainty and error propagation across the full estimation chain.
Emission estimation errors may arise from OBD/OBM sensor bias, PEMS measurement uncertainty, GPS drift, map-matching errors, time-synchronization mismatches, missing payload information, emission-factor localization, model parameter uncertainty, and extrapolation beyond the observed data domain. These uncertainties can accumulate from data acquisition to feature construction, model estimation, inventory aggregation, and policy interpretation. For this reason, future work should report validation data, error metrics, uncertainty intervals, sensitivity analysis, and cross-region robustness tests whenever possible, rather than presenting dynamic emission estimates as deterministic outputs.
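One common way to report such accumulated uncertainty is Monte Carlo propagation, sketched below in Python for a single trip-level estimate. The three error sources, their distributions, and all numerical values are illustrative assumptions: distance error stands in for GPS drift and map-matching error, the emission-factor spread for localization uncertainty, and the payload factor for missing payload information. The function name is hypothetical.

```python
import random


def simulate_trip_emission(n_draws=5000, seed=0):
    """Propagate input uncertainty to a trip emission estimate (g).

    Returns (mean, lower, upper), where lower and upper bound an
    approximate 95 % interval taken from the sorted draws.
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    totals = []
    for _ in range(n_draws):
        distance_km = rng.gauss(100.0, 3.0)      # trajectory reconstruction
        ef_g_per_km = rng.gauss(5.0, 0.75)       # emission-factor uncertainty
        payload_factor = rng.uniform(0.9, 1.2)   # unknown payload correction
        totals.append(distance_km * ef_g_per_km * payload_factor)
    totals.sort()
    lower = totals[int(0.025 * n_draws)]
    upper = totals[int(0.975 * n_draws) - 1]
    mean = sum(totals) / n_draws
    return mean, lower, upper


mean, lower, upper = simulate_trip_emission()
print(round(mean, 1), round(lower, 1), round(upper, 1))
```

Reporting the interval alongside the mean makes explicit that the estimate is a distribution rather than a deterministic output. A fuller analysis would also need to account for correlated errors and for model-structure uncertainty, which this sketch ignores.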
Overall, dynamic truck emission estimation has moved beyond single-method exploration toward a stage characterized by the joint development of data, models, and governance applications. Continued progress in data credibility, model robustness, and application-oriented interfaces may further strengthen its role in truck emission governance.