A Review of OBD-II-Based Machine Learning Applications for Sustainable, Efficient, Secure, and Safe Vehicle Driving
Abstract
1. Introduction
1.1. Contributions
- Overview of OBD-II and ML: A foundational overview of OBD-II system architecture and operational principles is provided, followed by an introduction to ML paradigms suitable for vehicular applications. Emphasis is placed on the synergy between these technologies in enabling intelligent automotive systems.
- Systematic Exploration of Key Application Domains: Findings from diverse case studies are discussed to illustrate practical applications of ML-driven OBD-II systems. A detailed review is conducted across the critical domains where ML offers substantial advancements, organized along the following thematic axes: (i) fuel consumption and energy optimization; (ii) emission control and environmental impact; (iii) driver behavior profiling with personalized feedback; (iv) vehicle health monitoring and anomaly detection; (v) cybersecurity for in-vehicle networks; and (vi) intelligent road perception and driving support. These domains were identified through a systematic literature review across major academic databases, including ACM Digital Library, IEEE Xplore, PubMed, and Web of Science, among others. Specifically, we searched for keywords related to ML and OBD-II technology to identify the thematic areas where ML applications are most impactful. We selected articles that were written in English, drawn primarily from peer-reviewed journals, published between 2021 and May 2025, and that directly explored ML-based applications in these areas. Studies that did not center on ML methods or that addressed unrelated OBD-II aspects were excluded from our review.
- Evaluation of ML Models for OBD-II Data: The performance of supervised, unsupervised, reinforcement learning (RL), deep learning (DL), and hybrid models is investigated to analyze their suitability, performance tradeoffs, and deployment feasibility in OBD-II environments. In this direction, the benefits and challenges of real-time OBD-II data processing are analyzed, emphasizing the use of low-latency ML algorithms for tasks such as predictive maintenance, eco-driving feedback, and anomaly detection.
- Lessons Learned, Challenges, Gaps, and Research Directions: The main lessons derived from the practical deployment of ML models in OBD-II systems are summarized. In addition, critical limitations are identified, including challenges and gaps in model generalization, scalability, computational efficiency, data privacy, and regulatory compliance, and prospective directions for future research are outlined.
1.2. Previous Review Papers
Reference | OBD-II-Centric | Energy Consumption | Emission Control & Environmental Impact | Automotive Behavior & Driver Analysis | Anomaly Detection & Cybersecurity | Road Perception & Driving Support |
---|---|---|---|---|---|---|
Shirole et al., 2025 [20] | ✗ | √ | ✗ | √ | ✗ | ✗ |
Mahale et al., 2025 [17] | ✗ | ✗ | ✗ | ✗ | √ | ✗ |
Cao et al., 2025 [19] | √ | √ | √ | ✗ | ✗ | ✗ |
Du et al., 2025 [23] | ✗ | ✗ | √ | ✗ | ✗ | ✗ |
Jain et al., 2023 [18] | √ | √ | ✗ | √ | ✗ | ✗ |
Malik et al., 2023 [24] | ✗ | ✗ | ✗ | √ | ✗ | ✗ |
This paper | √ | √ | √ | √ | √ | √ |
1.3. Structure
2. Overview of OBD Technology and ML
2.1. OBD-II System and Regulatory Framework
- Engine/powertrain: RPM (PID 0C), throttle position (PID 11), coolant temperature, intake air temperature (IAT), manifold pressure.
- Fuel system: short/long fuel trims, fuel level, equivalence ratio.
- Emissions: O2 sensor outputs, catalyst temperatures.
- Vehicle dynamics: speed (PID 0D), wheel speeds, steering angle.
- Maintenance: DTCs, time since engine start.
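As a rough illustration of how the parameters listed above are obtained, the sketch below decodes the data bytes of three Mode 01 responses using the standard SAE J1979 scaling for PIDs 0C, 0D, and 11. The function name `decode_pid` and the sample payload bytes are illustrative assumptions and are not taken from any of the reviewed studies.

```python
def decode_pid(pid: int, data: bytes) -> float:
    """Decode the data bytes of a Mode 01 response for a few common PIDs.

    Hypothetical helper for illustration; only three PIDs are handled.
    """
    if pid == 0x0C:  # engine speed: two bytes, quarter-rpm resolution
        return (256 * data[0] + data[1]) / 4.0
    if pid == 0x0D:  # vehicle speed: one byte, already in km/h
        return float(data[0])
    if pid == 0x11:  # throttle position: one byte scaled to percent
        return data[0] * 100.0 / 255.0
    raise ValueError(f"unsupported PID 0x{pid:02X}")


# Sample payloads (made up for the example).
print("rpm:", decode_pid(0x0C, bytes([0x1A, 0xF8])))
print("speed km/h:", decode_pid(0x0D, bytes([0x3C])))
print("throttle %:", round(decode_pid(0x11, bytes([0x80])), 1))
```

Other PIDs use different byte counts and scalings, so a production decoder would normally be table-driven rather than a chain of conditionals.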
2.2. ML for OBD-II-Based Vehicle Intelligence
3. Monitoring, Estimation, and Optimization of Energy Consumption
Reference | Objective | ML Models | Driving Environment | Type of Vehicle and Fuel | OBD or OBD-Derived Parameters | Performance Metrics |
---|---|---|---|---|---|---|
Yen et al., 2021 [15] | Prediction of fuel consumption and identification of abnormal driving behaviors | FFB, RNN, and Elman Neural Networks | Regular and steep mountain roads, varying road conditions | Not explicitly mentioned | Engine RPM, vehicle speed, engine load, throttle position, coolant temp, air flow rate | RMSE and correlation coefficient |
Abediasl et al., 2024 [37] | Real-time estimation of fuel consumption | RF and ANNs | Urban and highway driving conditions; includes intersections, stop signs, pedestrian crossings | HEV, PHEV, turbocharged gasoline, gasoline V8 | Engine load, engine speed, intake MAP, throttle position, air-fuel equivalence ratio, engine coolant temperature | RMSE and NMAE |
Fan et al., 2024 [38] | Reducing fuel consumption, promoting eco-driving, and providing driver feedback and alerts | LSTM-Conv and ANNs | Regular and mountain roads, traffic, varied terrains | Diesel-powered HDTs | Engine speed, vehicle speed, throttle position, fuel factor, and engine load | MAE, MAPE, RMSE, and R2 score |
Hu et al., 2024 [39] | Monitoring vehicle conditions, ensuring safety, and optimizing fuel efficiency | LSTM neural network | Regular operational conditions | Diesel-powered HDTs | Vehicle speed, engine speed, air intake volume, fuel flow, and accelerator pedal depth | MAPE, MAE, MSE, and R2 score |
Kabir et al., 2023 [40] | Monitoring and estimation of fuel consumption in urban traffic | LSTM neural networks | Signalized intersections with traffic states (queued, free-flowing, acceleration, deceleration) | Not explicitly mentioned | Speed and FCR | RMSE, MAE, and SMAPE |
Rykala et al., 2023 [41] | Cost-effective, real-time monitoring of fuel consumption using low-cost technology | Multivariate regression model, DT (CART algorithm), and ANNs | Varied terrain and speed limits | Gasoline-powered vehicle | Engine speed, vehicle speed, and engine load | MSE, MAE, and MRAE |
Eissa et al., 2023 [42] | Predicting the SOC and the RDR | Nonlinear SVR model with a RBF kernel | Rural driving conditions | EV | Battery current, battery voltage, battery temperature, ambient temperature, EV speed, SOC | MAE and R2 score |
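The error metrics reported in the table above (MAE, MSE, RMSE, MAPE, and R2) follow their usual textbook definitions. The sketch below computes them with the standard library only; the function name `regression_metrics` and the sample values are assumptions made for illustration, and individual studies may use slightly different variants (for example, SMAPE or NMAE).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return common regression error metrics for two equal-length sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined when a true value is zero; this sketch assumes none are.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Made-up fuel-rate values (L/h) for demonstration only.
print(regression_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 33.0]))
```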
4. Emission Control and Environmental Impact
4.1. Identification, Monitoring and Prediction of NOx Emissions
Reference | Objective | ML Models | Driving Environment | Type of Vehicle | OBD or OBD-Derived Parameters | Performance Metrics |
---|---|---|---|---|---|---|
Xu et al., 2024 [46] | Accurate calculation of NOx emission factor on the OBD online monitoring dataset using the COPERT model | ResNet, CBAM, SVR, ANN, CNN | Complex conditions influenced by driving behavior and external environment | HDDV (WEICHAI WP12.375E51 engine) | Engine speed, downstream NOx, atmospheric pressure, exhaust gas flow rate, urea tank temp, gas pedal opening, FCR | MAE, MAPE, RMSE |
Ge et al., 2023 [48] | Identification of malfunctioning or tampered SCR systems | RF, LR | Urban, rural, and motorway segments with speed thresholds | 18-ton trucks, 4.5-ton pickups, 19-seat bus (diesel engines) | Engine speed, vehicle speed, exhaust flow, NOx-specific emissions, coolant temperature | Total accuracy, recall, precision, null accuracy |
He et al., 2022 [49] | Timely identification and prediction of excessive NOx emissions | PPCA, kNN, NMF, RF, SVM, GBDT | Variable driving conditions | Diesel-powered vans, trucks, semi-tractors | Speed, atmospheric pressure, torque, fuel flow, coolant temp, NOx sensor outputs | R2, MAPE, RMSE, precision, recall, F1-score |
Li et al., 2024 [50] | Analysis of conformity of NOx and PN emissions, and reliability of OBD-derived data | Statistical techniques | Long distance, heavy-duty logistics delivery, freeway, urban, suburban roads | N3 HDDVs (China-VI compliant) | NOx concentrations, speed, engine speed, fuel flow rate, engine power, torque, ambient temp, barometric pressure | Statistical evaluation tests |
Liu et al., 2024 [51] | Correction of NOx concentration during DPP in NOx sensors | RF, MLP, LSTM | Urban roads, rural roads, and motorways | N3 and M3 HDDVs | Speed, barometric pressure, engine torque, friction torque, SCR NOx sensor outputs, DPF pressure, reactant allowance, mileage | R2, RMSE, MAE |
Xu et al., 2021 [52] | Prediction of NOx emissions considering road conditions, driving behavior and dynamics | Temporal fusion transformer-GRU, BPNN | Actual road operating conditions | Diesel vehicle | Torque percentage, water temp, fuel temp, oil temp, downstream oxygen percentage, urea tank level, vehicle speed, gas pedal opening, downstream NOx | MAE, RMSE |
Yang et al., 2024 [53] | Prediction of NOx and PN emissions via soft sensor monitoring | GA-GRU, LSTM, SVM, BPNN, SGD, GBDT | Urban, suburban and highway areas (20%, 25%, 55%) | Various HDTs | Coolant temp, vehicle speed, MAF, oil temp, fuel rate, catalyst temp, DPF delta/output pressure | R2 |
Zhao et al., 2024 [54] | Surveillance and prediction of NOx emissions under various driving conditions | Seq2Seq neural network (Bi-GRU, attention mechanism, ITL) | Highways outside city centers; acceleration, deceleration, cruise, idle states | HDDV | Instantaneous speed, acceleration, cooling water temp, throttle pedal degree | MAPE, RMSE, MAE |
4.2. Monitoring and Prediction of CO2 Emissions
Reference | Objective | ML Models | Driving Environment | Type of Vehicle | OBD or OBD-Derived Parameters | Performance Metrics |
---|---|---|---|---|---|---|
Andrade et al., 2024 [56] | Estimation of CO2 emissions and comparison of emissions from gasoline and ethanol | Linear regression, supervised/unsupervised learning | The route was approximately 13 km in urban areas with paved and asphalt sections | Flexible hybrid vehicle (gasoline, ethanol), Nissan Kicks | MAP, MAF for calculation of air-fuel ratio, speed and acceleration | MAE, RMSE |
Madziel et al., 2023 [57] | Development of a computational model for CO2 emissions | Linear regression, RF, gradient boosting | Route segments characterized as urban, rural, and highway environments | Euro 6-compliant vehicle equipped with a diesel engine | Velocity, vehicle acceleration, fuel consumption, CO2 emission, humidity, air temperature, latitude, longitude, altitude | MSE, R2 |
Moon et al., 2024 [58] | Prognostic model for trip-based CO2 emissions using observed data | XGBoost | Routes reflected real-road conditions and included congestion segments | 3.5-ton HDDV and 25-ton UHDV | Engine speed, throttle position, derived engine torque, engine coolant temperature, fuel rate, vehicle speed | R2, RMSE, MAPE |
Singh et al., 2023 [59] | Accurate CO2 estimation | DNN, deep CNN, LSTM network | Naturalistic experiments and controlled experiments | Petrol-powered Renault and Hyundai vehicles | Speed, engine RPM, mileage, fuel flow, throttle, acceleration | MAE, MSE, RMSE |
4.3. Monitoring and Prediction of Multiple Pollutants and Other Emissions
Reference | Objective | ML Models | Driving Environment | Type of Vehicle | OBD or OBD-Derived Parameters | Performance Metrics |
---|---|---|---|---|---|---|
Li et al., 2024 [60] | Prediction of NOx and CO2 emission | LSTM network, back-propagation neural network, PSO | Complex real-time driving conditions in several cities | HDDVs with conventional internal combustion engines | Instantaneous concentration of NOx in the exhaust gas, fuel flow rate, friction torque, DPF pressure difference, cumulative mileage, reagent residual quantity, SCR inlet/outlet temperature | MSE |
Xie et al., 2024 [61] | Analysis of the characteristics of high NOx and CO2 emissions in relation to vehicle running and engine operating states | K-means clustering, RF, XGBoost, elastic net | Uphill, flat, and downhill road segments | HDDV with an engine equipped with EGR valves | Engine output NOx, tailpipe NOx, engine speed, engine output power, accelerator pedal opening, air fuel ratio, rotation speed | MSE |
Wang et al., 2024 [62] | Reduction of carbon emissions | Gaussian mixture along with big data mining | Urban areas, freeways, and construction sites | Cargo trucks, dump trucks, and tractor trucks, all with diesel engines | Velocity, fuel flow rate, latitude, longitude | Not applicable (operational analysis only) |
Wang et al., 2024 [63] | Assessment and prediction of instantaneous and total BC emissions | RF, GBDT, XGBoost, LGBM, MLR | Different urban road types including branch roads, sub-arterial ways, arterial, express roads and freeways | Light-duty vehicles with gasoline direct or port injection engines | Fuel consumption, engine RPM, engine load, throttle position, engine speed, vehicle speed | RMSE, MAE, R2 |
Rivera-Campoverde et al., 2024 [64] | Estimation of pollutant emissions (CO2, CO, NOx, and HC) | K-means clustering, classification trees, RF, ANNs | Urban, rural, and highway driving conditions | Light vehicles with gasoline engines: Sedan (1.4 L), SUV (2.0 L), Pickup (2.4 L) | Engine speed, vehicle speed, fuel flow, pollutant data (CO2, CO, NOx, HC) | R2, MSE |
Xi et al., 2021 [65] | Prediction of NO, NO2, CO, CO2, and THC in urban areas | ARIMA model, SVR, dual-stage attention-based RNN, parallel attention-based LSTM network | Actual road conditions and bench condition in controlled experiment | Diesel vehicle of type FAW-Medium Duty Truck | Vehicle speed, engine power, throttle voltage, RPM, load, wheel force, oil temperature and ambient temperature | RMSE, MAE, MAPE, R2 |
Madziel et al., 2024 [66] | Analysis of emissions (CO2, CO, THC, and NOx) under cold and warm engine conditions | Polynomial regression, RF, gradient boosting, SVM, and spectral clustering | A 40 km test route covering urban, expressway, and motorway segments | Euro 2 class car with a gasoline-powered internal combustion engine | Speed, acceleration, and emissions of CO2, CO, THC, and NOx | R2, MSE |
5. Automotive Behavior and Driver Analysis
5.1. Safety Monitoring
Reference | Objective | ML Models | Driving Environment | Type of Vehicle | OBD or OBD-Derived Parameters | Performance Metrics |
---|---|---|---|---|---|---|
Arzhmand et al., 2023 [67] | Enhancement of real-time safety aspects, understanding street-crossing pedestrian context | Residual Neural Networks, conditional statements | 6 h of video from traffic scenes in Toronto, recorded by dashboard camera (based on PIE dataset) | Not mentioned | Speed | Accuracy, precision, area under the curve (AUC), recall, F1 score |
Koley et al., 2022 [68] | Estimation of vehicular crash severity | DT, RF classifier, XGBoost, LR | Based on UK driving accident dataset involving vehicles (2005–2017), including mainly urban areas | Vehicles are described in terms of age and capacity | Speed for severity estimation (not used in this study) | Accuracy of accident severity class estimation |
Mahariba et al., 2022 [69] | Accident detection | Naïve Bayes, DT, ANN, RNN, J48 | Data retrieved from PTWs, based on predefined crash scenarios (fall in a curve or slippery straight maneuver) performed by professional riders | Two-wheelers; professional riders; 97.2 cc engine with electronic ignition and double cradle frame | OBD unit with two accelerometers (one on the vehicle, one on the rider’s helmet) | Precision, recall, F-measure, Matthews correlation coefficient, ROC curve, precision–recall curve |
Ahmad et al., 2024 [71] | Detect and evaluate DDD | RNN (DDD-DL) | Custom dataset using OBD-II data converted into time series with timestamps; drowsiness identified via video signal | Not explicitly mentioned; in-car camera used | RPM, throttle position, steering torque, combined with in-car video | Accuracy, precision, recall, and F1-score across driver states |
Kumar et al., 2022 [72] | Incident detection and evidence integrity protection | Rule-based comparison with thresholds | Not mentioned | Not mentioned | Speed, RPM, torque, throttle position, relative throttle position, engine load | Correctly vs. erroneously identified incidents |
5.2. Driving Behavior Profiling
Reference | Objective | ML Models | Driving Environment | Type of Vehicle | OBD or OBD-Derived Parameters | Performance Metrics |
---|---|---|---|---|---|---|
Gace et al., 2023 [73] | Analyze eco-efficiency and driving behavior to promote sustainable transportation practices. | K-Means Clustering and Eco-Efficiency Analysis. | Primarily urban and suburban driving in Zagreb, Croatia. | Vehicles included gasoline, diesel, and hybrid powertrains. | Vehicle speed, RPM, accelerator pedal position, and engine load. | Eco-efficiency Index (Ieco), acceleration/deceleration thresholds, and idling percentages |
Lee et al., 2023 [74] | Driving behavior analysis, provision of feedback to the driver | LSTM networks, GRUs and adaptations (BiLSTM and BiGRU) | Six drivers using a Toyota Prius V2ZR-FXE | Hybrid and HEV | Battery output voltage, vehicle speed, calculated load, MAF, coolant temperature, throttle voltage, yaw, steering angle, lateral acceleration, forward and rearward acceleration, vehicle front/rear left/right wheel speed | Accuracy, recall, precision, F1-score, Kappa score as a statistical measure to evaluate the consistency between evaluators |
Song et al., 2023 [75] | Recognition and classification of driving style (normal, aggressive, conservative) | Semi-supervised Gaussian mixture, kernel PCA (nonlinear mapping into six components) | Car-following scenario (following the preceding car), with time windows used for recognition and prediction | Not mentioned explicitly | Velocity, acceleration, jerk (22 parameters) | Accuracy rate, macro precision rate, macro recall rate, and macro F1, compared across 4 classification methods |
Rimpas et al., 2022 [76] | Interpret OBD-II parameters into meaningful events, correlate with fuel consumption and characterize driving style | Rule-based approach for event interpretation and fuel consumption | Korean dataset | | Short-term fuel trim, MAP, absolute throttle position, RPM, calculated engine load, ECT, speed, and catalytic converter temperature | Accuracy of event interpretation: gear change, acceleration, and idle identification accuracy |
Divyasri et al., 2024 [77] | Driving behavior classification into safe or unsafe mode | Boosting algorithms including CatBoost, AdaBoost, LightGBM, GradientBoost, and XGBoost | Korean dataset conditions: urban and suburban environment, ten drivers | Korean dataset vehicles | Engine speed, engine torque, throttle position, vehicle speed, and steering wheel angle | ROC analysis, Accuracy, Precision, Recall, F1-score, and confusion matrices |
Kumar et al., 2022 [79] | Classify and analyze driving behavior | Transformation of OBD-II primary data into secondary data (e.g., revving, stability). SVM, AdaBoost, and RF | Korean dataset conditions: urban and suburban environment, ten drivers | Korean dataset vehicles | Fifty parameters, including speed, motor RPM, pedal position, calculated engine load | Accuracy of classification into ten classes |
5.3. Driver Identification
Reference | Objective | ML Models | Driving Environment | Type of Vehicle | OBD or OBD-Derived Parameters | Performance Metrics |
---|---|---|---|---|---|---|
Govers et al., 2024 [80] | Driver identification | Bidirectional LSTM with Attention (BiLSTM-A) and Modified Time Series Transformer (MTST) | Korean dataset | Korean vehicles | Propulsion-system-independent features (14 features) | Accuracy in driver classification based on monitored duration. |
Singh et al., 2024 [81] | Capturing and profiling driving behaviour in order to identify the driver | LSTM for behaviour evolution | Korean dataset | Korean vehicles | 15 OBD-II features selected based on Maximum Relevance Minimum Redundancy | Accuracy in driver identification. |
Khan et al., 2023 [82] | Driver identification | Naive Bayes, LR, kNN, REP Tree, and SVM | Korean dataset, 10 drivers in similar trips (covering 23 km) | Korean vehicles | 15 features (out of 51) including fuel trim, air pressure | Accuracy of driver identification. |
Manderna et al., 2022 [83] | Driver identification | LSTM | Korean dataset, 10 drivers in similar trips (covering 23 km) | Korean vehicles | 53 OBD-II features have been employed after preprocessing | Accuracy, precision, recall, and F1-score, compared with kNN, SVM, DTs, MLP. |
6. Anomaly Detection and Cybersecurity Issues
Reference | Objective | ML Models | Driving Environment | Type of Vehicular Data Issues and Threats | OBD or OBD-Derived Parameters | Performance Metrics |
---|---|---|---|---|---|---|
Andrade et al., 2024 [84] | Detection and correction of outliers in real-time OBD-II data streams on resource-constrained edge devices | TEDA and RLS | Speed variations, particularly in areas with speed bumps where speed reductions are required | Presence of outliers in vehicle sensor data that can impact monitoring, predictions, and decision-making | Speed data | Accuracy, F1 score, and recall for outlier detection, RMSE and MAE for outlier correction performance. |
Dini et al., 2023 [85] | Real-time detection of cyber-attacks on the CAN bus using ML-based ECU fingerprinting | ANNs | Simulated CAN network environment | Unauthorized access, replay attacks, DoS, spoofing attacks, physical layer attacks | CAN bus voltage signal fingerprinting | Accuracy of anomaly detection, classification performance of known vs. unknown ECUs, robustness to temperature variations. |
Malik et al., 2023 [86] | Anomaly detection in ransomware propagation | kNN | Simulated environments, particularly ride-hailing services with EVs moving through urban areas | Ransomware threats in connected vehicles, including hotspot-based malware infections, OBD dongle-based infections, and malicious OTA updates | Vehicle speed, vehicle acceleration/deceleration | Infections per minute, system efficiency (ratio of completed trips to total trips), financial impact (loss in earnings per hour), update convergence (percentage of EVs receiving software updates). |
Aloqaily et al., 2025 [87] | In-vehicle communication security through an IDS specifically designed for the CAN-bus, focusing on detecting unusual patterns | DTs, RF, Naïve Bayes, LR, XGBoost, LightGBM, and MLP | Connected and autonomous vehicles operating in real-time environments | DoS, fuzzy attacks, RPM spoofing, gear spoofing, replay and impersonation attacks | CAN-bus data | Accuracy, precision, recall, F1-Score, FPR, and FNR. |
El-Gayar et al., 2024 [88] | Intrusion detection in IoV systems, addressing vulnerabilities to cyber-attacks with high accuracy and low false negatives | RF, ET, LightGBM, and XGBoost | Connected and autonomous vehicles within the IoV ecosystem, with a focus on both intra-vehicle and inter-vehicle communication | DoS, DDoS, fuzzy attacks, spoofing (gear and gauge), false data dissemination, and sybil attacks | CAN traffic, including features such as timestamp, CAN ID, DLC, data bytes, and CAN packet | Accuracy, precision, recall, F1-Score, and execution time. |
7. Intelligent Road Perception and Driving Support
Reference | Objective | ML Models | Driving Environment and Type of Road | OBD or OBD-Derived Parameters | Performance Metrics |
---|---|---|---|---|---|
Sabapathy et al., 2023 [89] | Low-cost solution for road surface classification using standardized OBD-II accelerometer data | CNN, ordinal LR, SVM, and ANN | Naturalistic driving conditions; asphalt roads classified into three categories: ‘Good’, ‘Fair’, and ‘Poor’. | 3-axis accelerometer data (X, Y, Z axes), vehicle speed, GPS data | Overall classification accuracy, accuracy per road class (Poor, Fair, Good) |
Xiao et al., 2022 [91] | Low-cost and user-friendly method for large-scale car trajectory data acquisition in urban environments, especially during GPS outages | GPR, R2C with Learn++ | Real-world urban environments with complex conditions like straight roads, highways, turns, and viaducts | Velocity, steering direction, RPM, GPS position (latitude, longitude) | RMSE, AE, accuracy of predicted trajectories |
Flores Fernández et al., 2023 [92] | Highly accurate velocity correction data, ensuring accurate vehicle state estimation for autonomous driving and ADAS in environments with limited SatNav coverage | TNN | Test track, primarily asphalt roads, simulating real-world driving conditions, with varied speeds (0–50 km/h) and diverse driving patterns | Vehicle velocity, individual wheel velocities, steering angle | MAE, RMSE, and inference time |
8. Lessons Learned and Research Insights
- Fuel and Energy Monitoring, Estimation, and Optimization
- Feature Importance: Multiple studies identify OBD-II parameters such as engine load, vehicle speed, coolant temperature, and short-/long-term fuel trims as strong predictors for accurate fuel and energy estimation.
- Model Selection Tradeoffs: RF algorithms consistently outperform traditional ANNs on real-world HDDV datasets, offering better accuracy and interpretability. However, LSTM networks provide superior performance in dynamic environments where time dependencies are significant. The tradeoff lies in LSTMs’ higher computational cost and data requirements, making RF more suitable for real-time and resource-constrained settings.
- Behavioral and Environmental Generalization: Robustness to diverse driving behaviors and environmental factors (e.g., road topology, traffic, weather) remains a key challenge. Urban driving introduces high variance, which limits model transferability. Methods such as domain adaptation, transfer learning, and data augmentation are underutilized but promising.
- Vehicle-Specific Characteristics: In heavy-duty vehicles, propulsion energy is heavily influenced by real-time weight/load variations. Incorporating vehicle mass into models improves precision, while neglecting it—especially during frequent load/unload transitions—leads to substantial estimation errors.
- Edge-Based Energy Feedback: ML models deployed on embedded or smartphone-based edge devices provide real-time energy feedback without cloud dependency. Benefits include reduced latency, enhanced data privacy, and lower communication costs, facilitating personalized eco-driving systems via platforms such as TensorFlow Lite.
- Infrastructure-Aware Estimation: Including road infrastructure features (e.g., traffic signals, intersections, elevation) significantly enhances model accuracy. LSTM-based models are especially effective at capturing these context-driven patterns, particularly in mixed urban–rural routes.
- Scalable Deployment: The low cost and plug-and-play nature of OBD-II ports support scalable eco-driving deployments across personal and fleet vehicles when paired with Bluetooth and mobile/cloud apps. While cloud synchronization is common, local computation offers faster and more privacy-preserving solutions.
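The sensitivity to vehicle mass noted above can be illustrated with a simple longitudinal road-load model. This is a physics-based sketch, not a method from the reviewed studies: it assumes a flat road, ignores drivetrain losses and regenerative braking, and the default rolling-resistance and drag values are illustrative assumptions for a heavy truck.

```python
G = 9.81  # gravitational acceleration, m/s^2


def tractive_energy_kwh(speeds_mps, dt_s, mass_kg,
                        c_rr=0.007, rho=1.2, cd_a=6.0):
    """Estimate positive tractive energy over a speed trace on a flat road.

    c_rr (rolling resistance), rho (air density, kg/m^3), and cd_a
    (drag coefficient times frontal area, m^2) are assumed values.
    """
    energy_j = 0.0
    for v_prev, v_next in zip(speeds_mps, speeds_mps[1:]):
        v = 0.5 * (v_prev + v_next)       # mean speed over the step
        a = (v_next - v_prev) / dt_s      # mean acceleration over the step
        force = mass_kg * a + mass_kg * G * c_rr + 0.5 * rho * cd_a * v * v
        power = force * v
        if power > 0:                     # count propulsion only, not braking
            energy_j += power * dt_s
    return energy_j / 3.6e6               # joules to kWh


# Same speed trace (e.g., derived from PID 0D), two hypothetical load states.
trace = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 10.0, 10.0]
print("empty :", tractive_energy_kwh(trace, 1.0, mass_kg=15000))
print("loaded:", tractive_energy_kwh(trace, 1.0, mass_kg=40000))
```

Because both the inertial and rolling-resistance terms scale with mass, a model that treats mass as constant will misestimate energy whenever the payload changes between trips.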
- Emission Control and Environmental Impact
- Model Performance and Generalizability: ML models trained on large OBD-II datasets enable accurate real-time estimation of emissions such as NOx, CO2, and particulate matter. However, generalizability across vehicle types, conditions, and regions remains a challenge due to dataset and sensor variability.
- NOx Emissions Insights: NOx emissions are primarily driven by high-temperature combustion, engine load, road gradient, and SCR efficiency. While EGR reduces NOx, it may increase particulate emissions and degrade downstream components such as DPFs. ML-based monitoring (e.g., RF, Seq2Seq) outperforms threshold-based OBD-II methods in detecting SCR faults and predicting NOx spikes.
- Soft Sensor Integration: Virtual sensors built using ML provide a cost-effective alternative to physical emission sensors, enabling closed-loop control of SCR and DPF systems without the need for extensive lab calibration.
- CO2 Emissions and Fuel Type: Flexible-fuel hybrid vehicles using ethanol emit less CO2 than gasoline counterparts. ML models can optimize fuel-switching strategies to maximize environmental benefits based on driving conditions.
- Black Carbon and Load Dependence: BC emissions from gasoline direct-injection engines rise nonlinearly with engine speed and load. Effective mitigation requires a combination of combustion system design and post-treatment (e.g., gasoline particulate filters).
- Regulatory Compliance: ML-based emission estimators outperform traditional tools such as COPERT, supporting compliance with modern emission standards through higher estimation accuracy.
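For context on what the ML estimators above are compared against, a common first-order baseline converts a fuel rate into tailpipe CO2 with a fixed per-litre factor. The sketch below assumes approximate, commonly cited combustion factors (about 2.31 kg CO2 per litre of gasoline and 2.68 kg per litre of diesel); the constant and function names are hypothetical, and the factors should be replaced with locally applicable values.

```python
# Approximate tailpipe CO2 per litre of fuel burned (assumed values).
CO2_KG_PER_LITRE = {"gasoline": 2.31, "diesel": 2.68}


def co2_grams_per_km(fuel_rate_lph, speed_kmh, fuel="gasoline"):
    """Convert an instantaneous fuel rate (L/h) and speed (km/h) to g CO2/km."""
    if speed_kmh <= 0:
        raise ValueError("speed must be positive to express emissions per km")
    litres_per_km = fuel_rate_lph / speed_kmh
    return litres_per_km * CO2_KG_PER_LITRE[fuel] * 1000.0


# Example: 6 L/h at 100 km/h corresponds to 6 L/100 km.
print(round(co2_grams_per_km(6.0, 100.0, "gasoline"), 1), "g/km")
```

A fixed factor cannot capture transient effects such as cold starts or aftertreatment behaviour, which is where the data-driven models discussed above are expected to help.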
- Driving Behavior and Driver Analysis
- Behavioral Feature Extraction: Raw OBD-II signals (e.g., RPM, throttle position, acceleration) can be transformed into high-level driving events (e.g., hard braking, sharp turning) via heuristics or supervised ML, enabling profiling across a behavioral spectrum from conservative to aggressive.
- Contextual Data Fusion: Integrating contextual factors (e.g., time of day, road type, and weather) improves behavior classification. For instance, braking on a highway carries different implications than in city traffic. Context-aware classifiers are better able to handle such nuance.
- Safety Feedback Loops: ML models can detect unsafe driving patterns in real time and provide feedback to drivers. Certain systems gamify performance metrics to encourage safer and more efficient driving habits.
- Driver Identification and Authentication: RNNs, including LSTM models, can identify drivers by using micro-patterns in throttle/brake usage and gear shifts. Although the approach is promising, intra-driver variability makes this task harder than simple behavior classification.
- Vehicle Dependency: Driver analysis models are sensitive to vehicle-specific attributes such as powertrain type, fuel system, and model year. Many studies overlook these factors, potentially affecting cross-vehicle generalization.
- Data Scarcity and Quality: Most commercial OBD-II devices have low sampling rates, and many studies rely on small datasets; thus, building large, high-resolution, and vehicle-specific datasets is critical for improving model robustness.
- Data Integrity and Security: As OBD-II-based behavior profiling gains adoption in insurance and legal domains, data authenticity becomes crucial. Blockchain-based methods offer promising solutions to prevent tampering and spoofing.
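The heuristic route to behavioral feature extraction mentioned above can be sketched as a threshold rule over the OBD-II speed signal. The thresholds below (2.5 m/s² for hard acceleration, -3.0 m/s² for hard braking) and the function name `detect_events` are illustrative assumptions; the reviewed studies use their own definitions.

```python
def detect_events(speeds_kmh, dt_s=1.0, accel_thr=2.5, brake_thr=-3.0):
    """Label hard acceleration/braking events from a sampled speed trace.

    Returns a list of (sample_index, event_name) tuples.
    Thresholds are in m/s^2 and are assumed values.
    """
    events = []
    for i in range(1, len(speeds_kmh)):
        # Finite-difference acceleration, converting km/h to m/s.
        accel = (speeds_kmh[i] - speeds_kmh[i - 1]) / 3.6 / dt_s
        if accel >= accel_thr:
            events.append((i, "hard_acceleration"))
        elif accel <= brake_thr:
            events.append((i, "hard_braking"))
    return events


# Made-up 1 Hz speed samples in km/h.
print(detect_events([50, 50, 38, 38, 48, 49]))
```

At the low sampling rates typical of commercial OBD-II adapters, finite differences smooth out short, sharp maneuvers, so event counts from such a rule should be treated as lower bounds.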
- Vehicle Health Monitoring, Anomaly Detection, and Cybersecurity
- Anomaly Detection in Real Time: Lightweight techniques (e.g., autoencoders, clustering, and statistical profiling) are commonly used to detect real-time anomalies in OBD-II signals, ensuring timely alerts without excessive computational overhead.
- Edge Deployment of Security Models: Implementing anomaly/intrusion detection directly on embedded platforms reduces latency, preserves bandwidth, and ensures localized response, all of which are vital for safety-critical automotive systems.
- Cyberattack Scenarios and Resilience: Simulated attack scenarios (e.g., false injection, message delay, spoofing) demonstrate that ML classifiers trained on time series data can differentiate between legitimate and malicious activity. Evaluation setups with embedded microcontrollers (e.g., NXP S32K144) validate feasibility, with detection accuracy exceeding 98% across temperature ranges (24–83 °C).
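As one example of the lightweight statistical profiling mentioned above, the sketch below flags outliers in a signal stream with a running z-score, updating the mean and variance online with Welford's method so that memory use stays constant. It is a generic technique chosen for illustration, not the TEDA or ANN approaches used in the reviewed studies; the class name, threshold, and warm-up length are assumptions.

```python
import math


class RunningZScore:
    """Constant-memory outlier detector based on a running mean and variance."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # z-score above which a sample is flagged
        self.warmup = warmup        # samples to observe before flagging
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, x):
        """Return True if x looks anomalous, then update the profile."""
        is_outlier = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_outlier = True
        if not is_outlier:          # keep flagged samples out of the profile
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)
        return is_outlier


detector = RunningZScore()
stream = [50, 51] * 10 + [120, 50]   # made-up speed samples with one spike
flags = [detector.update(v) for v in stream]
print([i for i, f in enumerate(flags) if f])
```

Excluding flagged samples from the profile keeps a single spike from inflating the variance, at the cost of adapting slowly if the signal level genuinely shifts.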
- Intelligent Road Perception and Driving Support
- OBD-II and ML for Road Monitoring and Vehicle Trajectory: ML models using OBD-II data (e.g., accelerometer, speed) can classify road conditions and support trajectory estimation, which is especially useful in GPS-denied environments when combined with inertial sensors.
- Challenges in Data Handling and Model Choice: Accurate road classification requires careful preprocessing such as geofencing and sensor normalization. Traditional ML (e.g., SVM) struggles with these tasks, while DL models (e.g., CNNs) perform better; however, these models require mitigation techniques to prevent overfitting and ensure generalizability.
- ADAS Enhancement in GPS-Limited Areas: Integrating OBD-II with other vehicle sensors improves ADAS features such as collision avoidance and lane keeping by ensuring reliable state estimation even when GPS signals are lost.
- Scalability and Real-World Validation: Field tests confirm OBD-II’s practicality for large-scale road monitoring. Large and diverse datasets are essential for ensuring ML model performance across varied conditions.
- Dynamic Adaptation to Evolving Conditions: Incremental learning allows ML models to adapt to changes in road conditions and driving styles over time, supporting robust real-time performance in dynamic environments.
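The trajectory estimation discussed above for GPS-denied environments is usually measured against plain dead reckoning. The following is a minimal sketch of that baseline, not a method taken from any of the reviewed studies. It assumes speed arrives from OBD-II in km/h and that yaw rate comes from a separate inertial sensor; the function name `dead_reckon` and the sample format are hypothetical.

```python
import math

def dead_reckon(x, y, heading_rad, samples):
    """Integrate (speed_kmh, yaw_rate_rad_s, dt_s) samples into a 2-D track.

    Assumption: speed comes from the OBD-II vehicle speed PID (km/h) and
    yaw rate from an external IMU, since it is not a standard OBD-II signal.
    """
    track = [(x, y)]
    for speed_kmh, yaw_rate, dt in samples:
        v = speed_kmh / 3.6               # convert km/h to m/s
        heading_rad += yaw_rate * dt      # integrate yaw rate into heading
        x += v * math.cos(heading_rad) * dt
        y += v * math.sin(heading_rad) * dt
        track.append((x, y))
    return track

# Five seconds straight at 36 km/h, then five seconds of a gentle left turn.
samples = [(36.0, 0.0, 1.0)] * 5 + [(36.0, 0.1, 1.0)] * 5
track = dead_reckon(0.0, 0.0, 0.0, samples)
print(track[5])    # position at the end of the straight segment
print(track[-1])   # position after the turn
```

Because every step adds sensor error to the previous position, the estimate drifts without bound, which is the gap that learned corrections aim to close.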
Effectiveness of ML Models
- Sequential DL Models (LSTM, GRU, Transformers): These models are especially effective in environments characterized by temporal variability and high-frequency sensor data. LSTM and GRU architectures outperform classical models in tasks such as fuel consumption estimation and driver behavior profiling in urban traffic and variable terrain, where stop-and-go dynamics and frequent elevation changes are prevalent. Hybrid LSTM-Conv and attention-augmented models (e.g., GRU + attention) further improve accuracy by focusing on key time segments and contextual dependencies. Transformers, with their long-range temporal modeling capability, are best suited for SatNav-denied conditions such as tunnels or urban canyons, where they enable precise vehicle velocity corrections using inertial and OBD-derived data.
- Hybrid and Feature Fusion Architectures: Models that combine sequence learning with feature selection or optimization, such as PA-LSTM or LSTM with PSO, have shown strong performance in multi-pollutant emission prediction. These architectures effectively handle heterogeneous input streams such as SCR temperature, fuel rate, and load conditions, resulting in enhanced prediction accuracy under different driving patterns.
- Ensemble Methods (RF, Gradient Boosting, SVR): Ensemble models excel in highway and structured driving environments where driving behavior is smoother and less stochastic. RF and SVR provide accurate estimations for fuel efficiency, emission compliance, and anomaly detection due to their robustness and interpretability. Moreover, in post-crash and safety profiling, classifiers such as LightGBM and RF are frequently employed to classify crash severity and abnormal driving behaviors from OBD logs and sensor snapshots, offering explainable and efficient inference.
- CNNs and GPR-Based Ensembles: CNNs are well suited for road surface classification tasks, particularly when raw sensor signals (e.g., accelerometers or vibration data) are available. They offer automatic hierarchical feature extraction, which improves detection of poorly maintained road segments. In trajectory prediction under GPS signal loss, ensemble models built on GPR with incremental learning capabilities provide strong adaptability to changing vehicle dynamics and can outperform traditional dead reckoning methods.
- Lightweight and Incremental Models (TEDA-RLS, kNN, SVM, ANNs): These models are ideal for resource-constrained deployments such as low-power ECUs or embedded edge devices. TEDA-RLS supports real-time anomaly correction with minimal latency and memory usage, making it suitable for in-vehicle diagnostics and early fault detection. Similarly, shallow ANNs, SVMs, and kNN classifiers are used in scenarios such as outlier detection and cybersecurity threat identification (e.g., ransomware detection via CPU/memory patterns), offering acceptable accuracy without sacrificing responsiveness.
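To illustrate why TEDA-style detectors suit low-power ECUs, the sketch below applies the recursive eccentricity test to a scalar signal using constant memory per sample. It follows the recursive mean, variance, and eccentricity updates as they are commonly stated in the TEDA literature and omits the RLS correction stage entirely. The class name `TedaDetector`, the default `m = 3`, and the coolant-temperature values are illustrative assumptions.

```python
class TedaDetector:
    """Recursive TEDA outlier test for a scalar signal (no RLS correction)."""

    def __init__(self, m=3.0):
        self.m = m        # sensitivity, analogous to an m-sigma rule
        self.k = 0        # number of samples seen so far
        self.mean = 0.0
        self.var = 0.0

    def update(self, x):
        """Fold in sample x and return True if it is flagged as an outlier."""
        self.k += 1
        k = self.k
        if k == 1:
            self.mean = x
            return False
        # Recursive mean and variance updates (no sample history is stored).
        self.mean = (k - 1) / k * self.mean + x / k
        dist2 = (x - self.mean) ** 2
        self.var = (k - 1) / k * self.var + dist2 / (k - 1)
        if self.var == 0.0:
            return False  # all samples identical so far
        eccentricity = 1.0 / k + dist2 / (k * self.var)
        # Compare normalized eccentricity against the m-dependent threshold.
        return eccentricity / 2.0 > (self.m ** 2 + 1.0) / (2.0 * k)

# Hypothetical coolant temperature trace (deg C) with one spike at the end.
signal = [90.0, 91.0, 89.0] * 7 + [140.0]
detector = TedaDetector()
flags = [detector.update(x) for x in signal]
print(flags.index(True))   # index of the first flagged sample
```

With `m = 3` the threshold cannot be exceeded until more than ten samples have been seen, so a short warm-up period is implicit in this formulation.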
Optimization Target | Suggested Method | Key Performance Metrics |
---|---|---|
Fuel Consumption (HDTs) | LSTM-Conv | MAPE: 9.81% (FCR), 1.49% (trip fuel economy) |
Fuel Consumption (Light Vehicles) | Elman Neural Network | RMSE: 3.672 L/km, : 98.27% |
SOC/RDR for EVs | Nonlinear SVR with RBF Kernel | R2: 0.95, MAE: 2.4% |
NOx Emissions | Seq2Seq neural network (Bi-GRU with Attention, ITL) | RMSE reduction: 62% overall, 73.6% high-emission |
CO2 Emissions | XGBoost | R2: 0.942 (HDDVs), 0.981 (UHDVs) |
Multi-Pollutant Emissions | PA-LSTM | Strong RMSE, MAE, MAPE, R2 |
Safety Monitoring | J48 DT | Accuracy: 100%, Precision: 100%, Recall: 100% |
Driving Behavior Profiling | RF | Accuracy: 100% |
Driver Identification | LSTM | Accuracy: >99%, high precision/recall/F1 |
Anomaly Detection | TEDA-RLS | MAE: 22.23, RMSE: 124.39, 243.63 μs/cycle |
Cybersecurity | RF, LightGBM | Accuracy: 99.9% |
Road Surface Classification | CNN | Accuracy: 65.6%, Poor: 53.5%, Good: 27.5% |
Trajectory Prediction | GPR with R2C and Learn++ | RMSE: 6–10 m |
Velocity Correction | TNN | MAE: 0.167 km/h, RMSE: 0.240 km/h |
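The table reports MAE, RMSE, MAPE, and R2 side by side, so it may help to see how these quantities are conventionally computed. This is a minimal sketch using the textbook definitions; the individual studies may use variants (for example, normalized or symmetric forms), and the function name and sample values are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (in percent), and R2 for paired sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined when a true value is zero; this sketch assumes none are.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}

# Made-up fuel-rate values, for illustration only.
metrics = regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
print(metrics)
```

MAE and RMSE carry the units of the target, while MAPE and R2 are unitless, which is why values from different rows of the table are not directly comparable.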
9. Challenges, Gaps, and Future Research Directions
9.1. Challenges and Gaps in ML-Enabled OBD-II Systems
- Data Quality, Preprocessing, and Generalization
- Sensor degradation, noise, and latency: OBD-II signals are often affected by measurement errors, especially under harsh conditions; for example, NOx sensors have shown over 40% error rates in some studies. These signal imperfections reduce model reliability and predictive performance.
- Biased and limited datasets: Many models are trained on narrowly scoped datasets that do not reflect the variability in vehicle types, road conditions, or climate zones. This limits generalization to diverse operational settings, highlighting the need for broader and more heterogeneous datasets.
- Need for robust preprocessing: Advanced preprocessing techniques such as denoising, interpolation, and cross-sensor validation are essential for cleaning OBD-II inputs prior to learning. These must also be coupled with domain-specific feature engineering and large-scale dataset curation covering heterogeneous platforms.
- Scalability, Efficiency, and Model Optimization
- Embedded hardware limitations: Deep models such as MLPs, CNNs, and LSTMs often exceed the processing and memory capabilities of in-vehicle ECUs. Lightweight alternatives such as RF, LightGBM, and DFSENet are more suitable for real-time edge deployment. Edge-aware pruning, quantization, and distillation techniques are also promising for deployment in constrained environments.
- Model complexity vs. accuracy tradeoffs: Achieving high accuracy while maintaining computational efficiency remains a core challenge, particularly for latency-sensitive applications such as anomaly detection and eco-driving feedback.
- Protocol fragmentation and interoperability: Interoperability challenges primarily stem from physical and electrical variations across protocols (e.g., ISO 15765-4 CAN, older variants), especially in legacy vehicles or EVs that may lack full OBD-II support. Parameter heterogeneity due to manufacturer-specific PIDs, undocumented scaling factors, and varying byte formats limits model generalizability and functionality. Additionally, DTCs are split into standardized and proprietary forms, restricting access to advanced diagnostic insights. Operational factors such as ECU polling frequencies, protocol bandwidths, and request intervals further impact data consistency and analytics.
- Computational Constraints, Sensor Fusion, and System Integration
- Real-time inference bottlenecks: Cloud-dependent ML models introduce latency and rely on stable connectivity, which is not guaranteed in all vehicular environments.
- Underutilized sensor fusion: OBD-II data remain largely decoupled from other vehicular sensor streams. Synchronization of multi-modal data for tasks such as road condition detection or trajectory estimation remains a challenge due to bandwidth constraints and protocol incompatibility.
- Integration with ADAS: Combining OBD-II analytics with ADAS poses significant challenges due to their fundamentally different architectures and requirements. OBD-II is ECU-centric, polled via a central gateway, and operates at low data rates (1–10 Hz), while ADAS is sensor-centric and its features (e.g., lane-keeping, adaptive cruise control) require high-frequency, low-latency streaming from cameras, radars, and other sensors, often exceeding tens or hundreds of Hz. OBD-II primarily supports emissions and diagnostics and lacks access to dynamic vehicle data such as yaw rate, steering angle, and object detection, which are crucial for ADAS. Although centralized gateways using FlexRay or Controller Area Network with Flexible Data-rate (CAN-FD) protocols [93] can facilitate low-latency fusion for enhanced situational awareness, coordination between disparate software stacks (OBD-II, ADAS, infotainment) remains a major integration barrier.
- Fleet-level deployment and management: In fleet scenarios, scalability is further challenged by the heterogeneity of vehicle models, sensor aging, and driver behavior. Centralized ML model training with distributed inference, edge–cloud orchestration, and over-the-air updates require robust communication and system security protocols.
- Cybersecurity, Privacy, and Regulatory Barriers
- Expanded attack surfaces: Diagnostic interfaces (OBD-II ports) and wireless connectivity expose vehicles to cyber threats such as spoofing, ransomware, and message injection. Current rule-based IDSs are often inadequate for detecting novel or zero-day attacks. ML-based intrusion detection and trust-aware communication layers are currently active areas of research.
- Data privacy and ethical compliance: The use of behavioral and location data for model training introduces privacy risks, particularly when transmitted or processed externally. OBD-II applications capable of identifying individual drivers must comply with the EU’s General Data Protection Regulation (GDPR), which requires a lawful basis for processing (e.g., consent or legitimate interest), purpose limitation, and data minimization. Real-time tracking via telemetry or cloud-based services may fall under the scope of the EU’s ePrivacy Directive [94], especially if in-vehicle communications are monitored. Emerging techniques such as Secure Multi-Party Computation (SMPC) [95], differential privacy [96], and federated learning (FL) [97] offer promising privacy-preserving mechanisms.
- Legal barriers to cloud integration: Transferring vehicle data to the cloud raises legal concerns over cross-border data flows, ownership, and liability for system failures, especially when third-party services are used to handle emissions or safety-critical analytics.
- Lack of certification pathways: Despite high prediction performance, many ML models lack transparency, reproducibility, or explainability, all of which are prerequisites for legal and regulatory approval. Moreover, exposing safety-critical ADAS data over diagnostic interfaces such as OBD-II introduces security and functional safety concerns, as ADAS must comply with standards such as ISO 26262 [98] that require rigorous validation beyond the scope of diagnostic protocols. Standardized data formats and auditability are also needed for critical tasks such as emission verification and accident analysis.
- Emission Monitoring, Eco-Driving, and Smart Mobility
- Under-contextualized emission models: Emission models trained on short-duration or context-poor datasets may fail to generalize across traffic types, road profiles, and ambient conditions. These models also struggle with cold-start effects and nonlinear behaviors in fuel consumption.
- Lack of real-time feedback systems: Most eco-driving systems do not provide personalized context-aware feedback during vehicle operation. There is a need for ADAS-embedded ML platforms that can interpret OBD-II signals in real time and offer actionable guidance to drivers while avoiding distractions and ensuring interpretability.
- Smart city integration challenges: Scaling ML models for use in fleet management or city-wide traffic optimization is hindered by high data throughput and computational overhead. Efficient representations such as trajectory compression and semantic road classification are necessary for cloud or mobile deployment. In addition, interoperability with municipal platforms and standards (e.g., DATEX II, ITS-G5 [99]) remains a challenge.
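The parameter heterogeneity noted above under protocol fragmentation is easier to see next to the standardized case. The sketch below decodes a few Mode 01 responses using the publicly documented SAE J1979 scaling formulas; manufacturer-specific PIDs have no such public formula and would need a separate, vendor-supplied table. The table name `KNOWN_PIDS`, the function name, and the example payloads are hypothetical, and the table is deliberately incomplete.

```python
# Small, non-exhaustive table of standardized Mode 01 PIDs and their scaling.
KNOWN_PIDS = {
    0x04: ("engine_load_pct", lambda d: d[0] * 100.0 / 255.0),
    0x05: ("coolant_temp_c", lambda d: d[0] - 40),
    0x0C: ("engine_rpm", lambda d: (256 * d[0] + d[1]) / 4.0),
    0x0D: ("vehicle_speed_kmh", lambda d: d[0]),
}

def decode_response(payload: bytes):
    """Decode a Mode 01 response: service byte 0x41, PID, then data bytes."""
    if len(payload) < 3 or payload[0] != 0x41:
        raise ValueError("not a Mode 01 response")
    pid, data = payload[1], payload[2:]
    if pid not in KNOWN_PIDS:
        # Not in this sketch's table: no scaling formula is available here.
        return (f"pid_0x{pid:02X}", None)
    name, formula = KNOWN_PIDS[pid]
    return (name, formula(data))

print(decode_response(bytes.fromhex("410C1AF8")))  # ('engine_rpm', 1726.0)
print(decode_response(bytes.fromhex("410D3C")))    # ('vehicle_speed_kmh', 60)
print(decode_response(bytes.fromhex("41055A")))    # ('coolant_temp_c', 50)
```

A model trained on decoded values inherits whatever table produced them, so a missing or wrong scaling entry silently changes the input distribution.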
9.2. Future Research Directions
- Public OBD-II Datasets and Benchmarking: A major barrier to reproducible and comparable research is the scarcity of publicly available, large-scale, and diverse OBD-II datasets. Future efforts must prioritize the creation and open sharing of well-annotated datasets spanning multiple vehicles, fuel types, and real-world driving conditions. Standardized benchmarks and challenge platforms inspired by industry-led initiatives such as those for connected and autonomous vehicles [100] are needed in order to evaluate anomaly detection, fault diagnosis, and predictive maintenance methods. Such platforms can facilitate collaboration between academia and industry, helping to ensure that datasets reflect real-world fleet management needs.
- Data Quality, Preprocessing, and Generalization: Building robust ML systems begins with high-quality and reliable data. Future work should develop comprehensive preprocessing pipelines (including sensor signal denoising, time-series interpolation, and outlier rejection) that are specifically tailored to the noisy and heterogeneous nature of OBD-II signals. To enhance generalization, datasets must cover broad operational domains (e.g., urban, highway, cold starts) and integrate contextual data such as weather, traffic, and digital elevation maps, which are critical for industry applications such as emissions modeling. Industry adoption of edge computing for real-time data preprocessing can further improve data quality by reducing latency and enabling onboard noise filtering, aligning with the computational constraints of automotive platforms.
- Lightweight and Efficient Model Design for Onboard Deployment: Given the stringent computational and memory constraints of embedded automotive platforms, there is a pressing need for lightweight ML and DL models optimized for real-time onboard execution. Research should focus on model compression techniques such as pruning, quantization, and knowledge distillation as well as on the exploration of hybrid approaches combining classical ML and shallow neural networks. The industry’s shift toward edge computing necessitates models that operate efficiently on in-vehicle hardware and can support applications such as predictive maintenance and eco-driving feedback. Dynamic model adaptation techniques that scale complexity based on task demands or available resources can enable efficient and adaptive vehicle intelligence.
- Protocol Standardization, Interoperability, and Emerging Access Methods: Fragmentation of OBD-II implementations and proprietary protocols limits large-scale analytics and model portability. Future research should support efforts to standardize data exchange protocols (e.g., ISO 15031, SAE J1939) and develop platform-agnostic APIs and middleware that abstract protocol-specific details. This is critical for industry applications such as scalable fleet management and OTA diagnostics. Moreover, the OBD-II interface itself is evolving, particularly for EVs, where data access increasingly occurs via internal APIs or telematics platforms accessed through the Android Automotive operating system (OS), Apple CarPlay, or manufacturer-specific apps (e.g., FordPass, TeslaFi). Standardized protocols and open APIs are essential for enabling scalable cross-brand analytics and diagnostics in electrified vehicle fleets.
- Sensor Fusion and ADAS Integration: Enhancing predictive accuracy and situational awareness requires fusing OBD-II data with other vehicular and environmental sensor inputs. Future systems should employ structured multimodal learning approaches that can align temporally and spatially disparate data streams via synchronization protocols and centralized gateways using protocols such as FlexRay. Data fusion can improve applications such as road surface classification, autonomous navigation in GPS-denied zones, and adaptive safety interventions. In addition, industry adoption of digital twins [101] can further enhance sensor fusion and support ADAS by simulating real-time vehicle dynamics.
- Explainable and Trustworthy AI: As ML systems increasingly influence safety-critical decisions, their outputs must be made interpretable and auditable. Future research should incorporate explainability tools such as SHAP, LIME [102], and counterfactual reasoning in order to visualize and justify model behavior. Hybrid architectures combining rule-based logic with learned representations can enhance user trust and align with regulatory requirements. Ensuring transparency, consistency, and reproducibility of ML outputs is essential for long-term system certification, particularly for industry stakeholders such as automakers and regulators seeking auditable diagnostic systems.
- Privacy-Preserving ML: The deployment of ML raises legitimate concerns over the privacy of location, behavior, and biometric data. Federated learning (FL) offers a compelling paradigm in which models are trained locally on-device and only aggregated updates are shared with a central server. Future work should focus on optimizing FL architectures for vehicular environments while incorporating differential privacy mechanisms and blockchain-based audit trails to prevent tampering and unauthorized data inference. These privacy-preserving techniques are critical for industry applications, where they enable secure driver profiling and fleet analytics while ensuring compliance with data protection regulations.
- Resilient Cybersecurity Systems: As vehicle systems become increasingly connected, they face growing exposure to cyber threats. Future research should emphasize the development of adaptive and resilient IDSs capable of learning incrementally and detecting zero-day attacks. ML techniques such as ensemble-based anomaly detection, adversarial training, and unsupervised clustering can be used to identify and mitigate novel attack patterns in real time. Combining behavioral analytics with network traffic monitoring can yield more comprehensive protection.
- Context-Aware Emission Modeling: Accurately estimating vehicular emissions in real-world driving conditions remains a key challenge due to non-stationary factors (e.g., cold starts, stop-and-go traffic, and elevation changes). Future models should incorporate contextual data such as digital elevation maps, ambient temperature, and traffic congestion into prediction pipelines. Advanced temporal architectures such as attention-based LSTMs and transformer models can be leveraged for more effective modeling of nonlinear and time-dependent relationships. This is particularly essential for ensuring industry compliance with evolving regulations such as Euro 7 and China-VI.
- Scalable Eco-Driving and Smart Mobility Tools: Future systems should integrate ML models into cloud-based or edge-deployed ADAS platforms to provide real-time eco-driving recommendations, predictive maintenance alerts, and adaptive route planning. Techniques such as trajectory compression, semantic road classification, and low-power analytics can make these solutions scalable in smart city ecosystems. Personalization based on driving behavior, vehicle type, and traffic context will be key to maximizing energy savings and user engagement.
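The aggregation step behind the federated learning direction above can be sketched in a few lines. This is a minimal illustration of sample-weighted federated averaging over flat weight vectors; it includes no differential privacy noise, secure aggregation, or blockchain audit trail, all of which a real vehicular deployment would need to add. The function name and the client values are hypothetical.

```python
def federated_average(client_updates):
    """Average client weight vectors, weighted by local sample counts.

    client_updates: list of (num_samples, weights) pairs, where every
    weights list has the same length. Raw driving data never leaves the
    client; only these weight vectors are shared.
    """
    total = sum(n for n, _ in client_updates)
    dim = len(client_updates[0][1])
    return [
        sum(n * w[i] for n, w in client_updates) / total
        for i in range(dim)
    ]

# Two hypothetical vehicles: one with 100 local samples, one with 300.
global_weights = federated_average([
    (100, [1.0, 2.0]),
    (300, [3.0, 6.0]),
])
print(global_weights)   # [2.5, 5.0]
```

Weighting by sample count means vehicles with more logged driving pull the global model toward their data, which is worth keeping in mind when fleets are unevenly instrumented.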
10. Conclusions
Author Contributions
Funding
Conflicts of Interest
Abbreviations
ABS | Anti-lock Braking System |
ADAS | Advanced Driver Assistance Systems |
ADC | Analog-to-Digital Converter |
AGL | Automotive-Grade Linux |
AI | Artificial Intelligence |
ANN | Artificial Neural Network |
API | Application Programming Interface |
ARIMA | Autoregressive Integrated Moving Average |
ASW | Adaptive Sequence Window |
BC | Black Carbon |
BiGRU | Bidirectional GRU |
BiLSTM | Bidirectional Long Short-Term Memory |
BPSS | Brake Pedal Position Sensor |
CAN | Controller Area Network |
CAN-FD | Controller Area Network with Flexible Data-rate |
CART | Classification and Regression Tree |
CAV | Connected and Autonomous Vehicle |
CBAM | Convolutional Block Attention Module |
CCPS | Crankshaft and Camshaft Position Sensors |
CEL | Check Engine Light |
CIDS | Collaborative Intrusion Detection System |
CNN | Convolutional Neural Network |
COG | Course Over Ground |
CSV | Comma-Separated Values |
DA-RNN | Dual-Stage Attention-Based RNN |
DBSCAN | Density-Based Spatial Clustering of Applications with Noise |
DDD | Driver Drowsiness Detection |
DT | Decision Tree |
DEM | Digital Elevation Model |
DFSENet | Dynamic Forest-Structured Ensemble Network |
DL | Deep Learning |
DLC | Data Length Code |
DNN | Deep Neural Network |
DOC | Diesel Oxidation Catalyst |
DoS | Denial-of-Service |
DPF | Diesel Particulate Filter |
DPP | Dew Point Protection |
DTC | Diagnostic Trouble Code |
ECU | Electronic Control Unit |
EGR | Exhaust Gas Recirculation |
EGT | Exhaust Gas Temperature |
EKF | Extended Kalman Filtering |
EOP | Engine Output Power |
EPB | Electronic Parking Brake |
ESC | Electronic Stability Control |
ET | Extra Trees |
EV | Electric Vehicle |
FCR | Fuel Consumption Rate |
FFB | Feed-Forward Backpropagation |
FFT | Fast Fourier Transform |
FNR | False Negative Rate |
FPR | False Positive Rate |
GBDT | Gradient-Boosting Decision Tree |
GBM | Gradient-Boosting Machine |
GDPR | General Data Protection Regulation |
GMM | Gaussian Mixture Model |
GPR | Gaussian Process Regression |
GPS | Global Positioning System |
GRU | Gated Recurrent Unit |
GUI | Graphical User Interface |
HC | Hydrocarbon |
HDDV | Heavy-Duty Diesel Vehicle |
HDT | Heavy-Duty Truck |
HEV | Hybrid Electric Vehicle |
IAT | Intake Air Temperature |
IDS | Intrusion Detection System |
IMU | Inertial Measurement Unit |
INS | Inertial Navigation System |
IoV | Internet of Vehicles |
IPFS | Inter-Planetary File System |
ITS | Intelligent Transportation Systems |
ITL | Incremental Tracking Loss |
JSON | JavaScript Object Notation |
kNN | k-Nearest Neighbors |
KPCA | Kernel Principal Component Analysis |
LDGV | Light-Duty Gasoline Vehicle |
LightGBM | Light Gradient-Boosting Machine |
LR | Logistic Regression |
LSTM | Long Short-Term Memory |
MaaS | Mobility-as-a-Service |
MAE | Mean Absolute Error |
MAP | Manifold Absolute Pressure |
MAPE | Mean Absolute Percentage Error |
MAF | Mass Air Flow |
MIL | Malfunction Indicator Lamp |
ML | Machine Learning |
MLP | Multi-Layer Perceptron |
MLR | Multiple Linear Regression |
mRMR | Maximum Relevance Minimum Redundancy |
MRAE | Mean Relative Absolute Error |
MSE | Mean Squared Error |
MQTT | Message Queuing Telemetry Transport |
MTST | Modified Time Series Transformer |
NMAE | Normalized Mean Absolute Error |
NNMF | Non-Negative Matrix Factorization |
OBD | On-Board Diagnostics |
OD | Origin–Destination |
OS | Operating System |
OTA | Over-the-Air |
PA-LSTM | Parallel Attention-Based Long Short-Term Memory |
PASER | Pavement Surface Evaluation and Rating |
PCA | Principal Component Analysis |
PCM | Powertrain Control Module |
PEMS | Portable Emission Measurement System |
PHEV | Plug-in Hybrid Electric Vehicle |
PID | Parameter ID |
PIE | Pedestrian Intention Estimation |
PN | Particulate Number |
PPCA | Probabilistic Principal Component Analysis |
PSO | Particle Swarm Optimization |
PTW | Powered Two-Wheeler |
PVT | Probe Vehicle Trajectory |
R2C | Regression-to-Classification |
RBF | Radial Basis Function |
RDC | Real-World Driving Cycle |
RDE | Real Driving Emissions |
RDR | Remaining Driving Range |
REMVT | Remote Emission Management Vehicle Terminal |
REP | Reduced Error Pruning |
RF | Random Forest |
RL | Reinforcement Learning |
RLS | Recursive Least Squares |
RMSE | Root Mean Square Error |
RNN | Recurrent Neural Network |
ROC | Receiver Operating Characteristic |
ROS | Robot Operating System |
RPM | Revolutions Per Minute |
SatNav | Satellite Navigation |
SCR | Selective Catalytic Reduction |
Seq2Seq | Sequence-to-Sequence |
SGD | Stochastic Gradient Descent |
SMAPE | Symmetric Mean Absolute Percentage Error |
SMOTE | Synthetic Minority Oversampling Technique |
SMPC | Secure Multi-Party Computation |
SNR | Signal-to-Noise Ratio |
SoC | State of Charge |
SUMO | Simulation of Urban Mobility |
SUV | Sport Utility Vehicle |
SVR | Support Vector Regression |
SVM | Support Vector Machine |
TAQ | Traffic–Air Quality |
TEDA | Typicality and Eccentricity Data Analytics |
TFT | Transmission Fluid Temperature |
THC | Total Hydrocarbon |
TinyML | Tiny Machine Learning |
TNN | Transformer Neural Network |
TPMS | Tire Pressure Monitoring System |
TPS | Throttle Position Sensor |
TSS | Transmission Input/Output Shaft Speed |
UDS | Unified Diagnostic Services |
UHDV | Ultra-Heavy-Duty Vehicle |
USB | Universal Serial Bus |
V2V | Vehicle-to-Vehicle |
VSP | Vehicle-Specific Power |
VSS | Vehicle Speed Sensor |
VT-Micro | Virginia Tech Microscopic |
Wi-Fi | Wireless Fidelity |
WSS | Wheel Speed Sensor |
XAI | Explainable Artificial Intelligence |
XGBoost | Extreme Gradient Boosting |
ZEV | Zero-Emission Vehicle |
References
- Oladimeji, D.; Gupta, K.; Kose, N.A.; Gundogan, K.; Ge, L.; Liang, F. Smart Transportation: An Overview of Technologies and Applications. Sensors 2023, 23, 3880.
- Rimpas, D.; Papadakis, A.; Samarakou, M. OBD-II Sensor Diagnostics for Monitoring Vehicle Operation and Consumption. Energy Rep. 2020, 6 (Suppl. S3), 55–63.
- Campos-Ferreira, A.E.; Lozoya-Santos, J.d.J.; Tudon-Martinez, J.C.; Mendoza, R.A.R.; Vargas-Martínez, A.; Morales-Menendez, R.; Lozano, D. Vehicle and Driver Monitoring System Using On-Board and Remote Sensors. Sensors 2023, 23, 814.
- Kim, B.; Baek, Y. Sensor-Based Extraction Approaches of In-Vehicle Information for Driver Behavior Analysis. Sensors 2020, 20, 5197.
- Meckel, S.; Schuessler, T.; Jaisawal, P.K.; Yang, J.-U.; Obermaisser, R. Generation of a diagnosis model for hybrid-electric vehicles using machine learning. Microprocess. Microsyst. 2020, 75, 103071.
- Yan, X.; Li, M.; Chen, H.; Wang, J.; Zhang, Z. An Online Learning Framework for Sensor Fault Diagnosis Analysis in Autonomous Cars. IEEE Trans. Intell. Transp. Syst. 2023, 24, 14467–14479.
- Gong, C.-S.A.; Su, C.-H.S.; Chen, Y.-H.; Guu, D.-Y. How to Implement Automotive Fault Diagnosis Using Artificial Intelligence Scheme. Micromachines 2022, 13, 1380.
- Zhao, J.; Qu, X.; Wu, Y.; Fowler, M.; Burke, A.F. Artificial intelligence-driven real-world battery diagnostics. Energy AI 2024, 18, 100419.
- Alqarqaz, M.; Bani Younes, M.; Qaddoura, R. An Object Classification Approach for Autonomous Vehicles Using Machine Learning Techniques. World Electr. Veh. J. 2023, 14, 41.
- Tahir, H.A.; Alayed, W.; Hassan, W.U.; Haider, A. A Novel Hybrid XAI Solution for Autonomous Vehicles: Real-Time Interpretability Through LIME–SHAP Integration. Sensors 2024, 24, 6776.
- Yeong, D.J.; Panduru, K.; Walsh, J. Exploring the Unseen: A Survey of Multi-Sensor Fusion and the Role of Explainable AI (XAI) in Autonomous Vehicles. Sensors 2025, 25, 856.
- Wang, Q.; Chen, S.; Zeng, J.; Du, W.; Wei, L. A deep learning fault diagnosis method for metro on-board detection on rail corrugation. Eng. Fail. Anal. 2024, 164, 108662.
- Chegini, S.N.; Bagheri, A.; Najafi, F. Application of a new EWT-based denoising technique in bearing fault diagnosis. Measurement 2019, 144, 275–297.
- Visconti, P.; Rausa, G.; Del-Valle-Soto, C.; Velázquez, R.; Cafagna, D.; De Fazio, R. Innovative Driver Monitoring Systems and On-Board-Vehicle Devices in a Smart-Road Scenario Based on the Internet of Vehicle Paradigm: A Literature and Commercial Solutions Overview. Sensors 2025, 25, 562.
- Yen, M.-H.; Tian, S.-L.; Lin, Y.-T.; Yang, C.-W.; Chen, C.-C. Combining a Universal OBD-II Module with Deep Learning to Develop an Eco-Driving Analysis System. Appl. Sci. 2021, 11, 4481.
- Jain, M.; Vasdev, D.; Pal, K.; Sharma, V. Systematic literature review on predictive maintenance of vehicles and diagnosis of vehicle’s health using machine learning techniques. Comput. Intell. 2022, 38, 1990–2008.
- Mahale, Y.; Kolhar, S.; More, A.S. A comprehensive review on artificial intelligence driven predictive maintenance in vehicles: Technologies, challenges and future research directions. Discov. Appl. Sci. 2025, 7, 243.
- Jain, N.; Mittal, S. Review of computational techniques for modelling eco-safe driving behavior. Int. J. Automot. Mech. Eng. 2023, 20, 10422–10440.
- Cao, Z.; Shi, K.; Qin, H.; Xu, Z.; Zhao, X.; Yin, J.; Jia, Z.; Zhang, Y.; Liu, H.; Zhan, Q.; et al. A comprehensive OBD data analysis framework: Identification and factor analysis of high-emission heavy-duty vehicles. Environ. Pollut. 2025, 368, 125751.
- Shirole, V.; Shahade, A.K.; Deshmukh, P.V. A comprehensive review on data-driven driver behaviour scoring in vehicles: Technologies, challenges and future directions. Discov. Artif. Intell. 2025, 5, 26.
- Rocha, D.; Teixeira, G.; Vieira, E.; Almeida, J.; Ferreira, J. A Modular In-Vehicle C-ITS Architecture for Sensor Data Collection, Vehicular Communications and Cloud Connectivity. Sensors 2023, 23, 1724.
- Gesteira-Miñarro, R.; López, G.; Palacios, R. Revisiting Wireless Cyberattacks on Vehicles. Sensors 2025, 25, 2605.
- Du, Z.; Li, H.; Chen, S.; Zhang, X.; Zhang, L.; Liu, Y. Advancements in machine learning for spatiotemporal urban on-road traffic-air quality study: A review. Atmos. Environ. 2025, 346, 121054.
- Malik, M.; Nandal, R. A framework on driving behavior and pattern using On-Board diagnostics (OBD-II) tool. Mater. Today Proc. 2023, 80, 3762–3768.
- McCord, K. Automotive Diagnostic Systems: Understanding OBD I and OBD II; CarTech Inc.: North Branch, MN, USA, 2011.
- European Parliament and Council. Regulation (EC) No 715/2007 of 20 June 2007 on type approval of motor vehicles with respect to emissions from light passenger and commercial vehicles (Euro 5 and Euro 6) and on access to vehicle repair and maintenance information. Off. J. Eur. Union 2007, L 171, 1–16. Available online: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32007R0715 (accessed on 30 April 2025).
- European Parliament and Council. Regulation (EU) 2018/858 of 30 May 2018 on the approval and market surveillance of motor vehicles and their trailers, and of systems, components and separate technical units intended for such vehicles, amending Regulations (EC) No 715/2007 and (EC) No 595/2009 and repealing Directive 2007/46/EC. Off. J. Eur. Union 2018, L 151, 1–218. Available online: https://eur-lex.europa.eu/eli/reg/2018/858/oj (accessed on 30 April 2025).
- European Commission. Proposal for a Regulation of the European Parliament and of the Council on Type-Approval of Motor Vehicles and Engines with Respect to Their Emissions and Battery Durability (Euro 7). COM(2022) 586 Final. 2023. Available online: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A52022PC0586 (accessed on 30 April 2025).
- European Parliament and Council. Regulation (EU) 2019/631 of 17 April 2019 on setting CO2 emission performance standards for new passenger cars and for new light commercial vehicles. Off. J. Eur. Union 2019, L 111, 13–53. Available online: https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32019R0631 (accessed on 30 April 2025).
- International Organization for Standardization. ISO 15031-5:2021—Road Vehicles—Communication Between Vehicle and External Equipment for Emissions-Related Diagnostics—Part 5: Emissions-Related Diagnostic Services. 2021. Available online: https://www.iso.org/standard/80771.html (accessed on 30 April 2025).
- Society of Automotive Engineers. SAE J1979: E/E Diagnostic Test Modes. SAE International. 2002. Available online: https://www.sae.org/standards/content/j1979_201702/ (accessed on 30 April 2025).
- International Organization for Standardization. ISO 15765-4:2021—Road Vehicles—Diagnostic Communication over Controller Area Network (DoCAN)—Part 4: Requirements for Emissions-Related Systems. 2021. Available online: https://www.iso.org/standard/78384.html (accessed on 30 April 2025).
- International Organization for Standardization. ISO 11898-1:2015—Road Vehicles—Controller Area Network (CAN)—Part 1: Data Link Layer and Physical Signalling. 2016. Available online: https://www.iso.org/standard/63648.html (accessed on 30 April 2025).
- International Organization for Standardization. ISO 14229-1:2020—Road Vehicles—Unified Diagnostic Services (UDS)—Part 1: Application Layer. 2020. Available online: https://www.iso.org/standard/72439.html (accessed on 30 April 2025).
- Subke, P.; Heineman, L.; Mayer, J. Diagnostic Communication with Zero Emission Vehicles (ZEV) Using ISO 14229-5 (UDS on IP) and SAE J1979-3 (ZEV on UDS); SAE Technical Paper; SAE International: Warrendale, PA, USA, 2024.
- Refat, R.U.D.; Elkhail, A.A.; Malik, H. Machine learning for automotive cybersecurity: Challenges, opportunities and future directions. In AI-Enabled Technologies for Autonomous and Connected Vehicles; Lecture Notes in Intelligent Transportation and Infrastructure; Murphey, Y.L., Kolmanovsky, I., Watta, P., Eds.; Springer: Cham, Switzerland, 2023.
- Abediasl, H.; Ansari, A.; Hosseini, V.; Koch, C.R.; Shahbakhti, M. Real-time vehicular fuel consumption estimation using machine learning and on-board diagnostics data. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2024, 238, 3779–3793.
- Fan, P.; Song, G.; Zhai, Z.; Wu, Y.; Yu, L. Fuel consumption estimation in heavy-duty trucks: Integrating vehicle weight into deep-learning frameworks. Transp. Res. Part D Transp. Environ. 2024, 130, 104157.
- Hu, J.; Shen, X.; Wang, S.; Ma, P.; Liu, C.; Sui, X. Research on truck mass estimation based on long short-term memory network. Energy 2024, 307, 132729.
- Kabir, R.; Remias, S.M.; Waddell, J.; Zhu, D. Time-Series fuel consumption prediction assessing delay impacts on energy using vehicular trajectory. Transp. Res. Part D Transp. Environ. 2023, 117, 103678.
- Rykała, M.; Grzelak, M.; Rykała, Ł.; Voicu, D.; Stoica, R.-M. Modeling Vehicle Fuel Consumption Using a Low-Cost OBD-II Interface. Energies 2023, 16, 7266.
- Eissa, M.A.; Chen, P. Machine Learning-based Electric Vehicle Battery State of Charge Prediction and Driving Range Estimation for Rural Applications. IFAC-PapersOnLine 2023, 56, 355–360.
- ISO 15031; Road Vehicles—Communication between Vehicle and External Equipment for Emissions-Related Diagnostics. International Organization for Standardization: Geneva, Switzerland, 2014.
- ISO 27145; Road Vehicles—Implementation of World-Wide Harmonized On-Board Diagnostics (WWH-OBD) Communication Requirements. International Organization for Standardization: Geneva, Switzerland, 2012.
- SAE International. SAE J1939 Standards Collection; SAE International: Amsterdam, The Netherlands, 2018.
- Xu, Z.; Wang, R.; Pan, K.; Li, J.; Wu, Q. Two-Stream Networks for COPERT Correction Model with Time-Frequency Features Fusion. Atmosphere 2023, 14, 1766.
- Ntziachristos, L.; Gkatzoflias, D.; Kouridis, C.; Samaras, Z. COPERT: A European road transport emission inventory model. In Information Technologies in Environmental Engineering, Proceedings of the 4th International ICSC Symposium, Thessaloniki, Greece, 28–29 May 2009; Perdicakis, A., Athanasiadis, I.N., Mitkas, P.A., Tzafestas, A.A., Eds.; Springer: Heidelberg/Berlin, Germany, 2009; pp. 491–504.
- Ge, Y.; Hou, P.; Lyu, T.; Lai, Y.; Su, S.; Luo, W.; He, M.; Xiao, L. Machine Learning-Aided Remote Monitoring of NOx Emissions from Heavy-Duty Diesel Vehicles Based on OBD Data Streams. Atmosphere 2023, 14, 651.
- He, W.; Zheng, X.; Zhang, Y.; Han, Y. Study on Determination of Excessive Emissions of Heavy Diesel Trucks Based on OBD Data Repaired. Atmosphere 2022, 13, 924.
- Li, W.; Dong, Z.; Miao, L.; Wu, G.; Deng, Z.; Zhao, J.; Huang, W. On-road evaluation and regulatory recommendations for NOx and particle number emissions of China VI heavy-duty diesel trucks: A case study in Shenzhen. Sci. Total Environ. 2024, 928, 172427.
- Liu, C.; Pei, Y. A novel method for correcting dew point protection (DPP) data of heavy-duty diesel vehicles based on random forest. Eng. Appl. Artif. Intell. 2024, 136, 109026.
- Xu, Z.; Wang, R.; Wang, R.; Xia, X. Mobile source emission model based on temporal features transfer. In Proceedings of the 2021 IEEE 5th CAA International Conference on Vehicular Control and Intelligence (CVCI), Tianjin, China, 29–31 October 2021.
- Yang, L.; Ge, Y.; Lyu, L.; Tan, J.; Hai, L.; Wang, X.; Yin, H.; Wang, J. Enhancing vehicular emissions monitoring: A GA-GRU-based soft sensors approach for HDDVs. Environ. Res. 2024, 247, 118190.
- Zhao, Z.; Cao, Y.; Xu, Z.; Kang, Y. A seq2seq learning method for microscopic emission estimation of on-road vehicles. Neural Comput. Appl. 2024, 36, 8565–8576.
- Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 2019, 32, 8026–8037.
- Andrade, M.; Medeiros, M.; Medeiros, T.; Azevedo, M.; Silva, M.; Costa, D.G.; Silva, I. On the Use of Biofuels for Cleaner Cities: Assessing Vehicular Pollution through Digital Twins and Machine Learning Algorithms. Sustainability 2024, 16, 708.
- Madziel, M. Instantaneous CO2 emission modelling for a Euro 6 start-stop vehicle based on PEMS data and artificial intelligence methods. Environ. Sci. Pollut. Res. 2023, 31, 6944–6959.
- Moon, S.; Lee, J.; Kim, H.J.; Park, S. Study on CO2 Emission Assessment of Heavy-Duty and Ultra-Heavy-Duty Vehicles using Machine Learning. Int. J. Automot. Technol. 2024, 25, 651–661.
- Singh, M.; Dubey, R.K. Deep Learning Model Based CO2 Emissions Prediction Using Vehicle Telematics Sensors Data. IEEE Trans. Intell. Veh. 2023, 8, 768–777.
- Li, T.; Lou, X.; Yang, Z.; Fan, C.; Gong, B.; Xie, G.; Zhang, J.; Wang, K.; Zhang, H.; Peng, Y. Clarifying the impact of engine operating parameters of heavy-duty diesel vehicles on NOx and CO2 emissions using multimodal fusion methods. Sci. Total Environ. 2024, 954, 176598.
- Xie, B.; Li, T.; Liu, T.; Chen, H.; Li, H.; Li, Y. Exploring high-emission driving behaviors of heavy-duty diesel vehicles based on engine principles under different road grade levels. Sci. Total Environ. 2024, 951, 175443.
- Wang, H.; Liu, Q.; Bai, B.; Wang, J.; Xiao, H.; Liu, H.; Lian, J.; Lin, Z.; He, D.; Yin, H. Exploring heavy-duty truck operational characteristics through On-Board Diagnostics (OBD) data. Res. Transp. Bus. Manag. 2024, 57, 101204.
- Wang, X.; Qiu, Z.; Liu, Z. Urban road BC emissions of LDGVs: Machine learning models using OBD/PEMS data. Chemosphere 2024, 365, 143348.
- Rivera-Campoverde, N.D.; Arenas-Ramirez, B.; Munoz Sanz, J.L.; Jimenez, E. GPS Data and Machine Learning Tools, a Practical and Cost-Effective Combination for Estimating Light Vehicle Emissions. Sensors 2024, 24, 2304.
- Xie, H.; Zhang, Y.; He, Y.; You, K.; Fan, B.; Yu, D.; Lei, B.; Zhang, W. Parallel attention-based LSTM for building a prediction model of vehicle emissions using PEMS and OBD. Measurement 2021, 185, 110074.
- Madziel, M. Modeling Exhaust Emissions in Older Vehicles in the Era of New Technologies. Energies 2024, 17, 4924.
- Arzhmand, E.; Rashid, H.; Hosseini, F. Diminution of Pedestrian Accident on Crowded Urban Streets Using Content-Based Video Retrieval. In Proceedings of the ICoABCD, Denpasar, Indonesia, 13–15 November 2023; pp. 192–196.
- Koley, S.; Mondal, S.; Ghosal, P. Smart Prediction of Severity in Vehicular Crashes: A Machine Learning Approach. In Proceedings of the IEEE CINE, Bhubaneswar, India, 1–3 December 2022; pp. 1–5.
- Mahariba, A.J.; Uthra, A.R.; Rajan, G.B. An efficient automatic accident detection system using inertial measurement through machine learning techniques for PTWs. Expert Syst. Appl. 2022, 192, 116389.
- Boubezoul, A.; Dufour, F.; Bouaziz, S.; Larnaudie, B.; Espié, S. Dataset on powered two wheelers fall and critical events detection. Data Brief 2019, 23, 103828.
- Ahmad, K.; Ping, E.P.; Ab Aziz, N.A. Leveraging OBD II Time Series Data for Driver Drowsiness Detection: A Recurrent Neural Networks Approach. In Proceedings of the IEEE IICAIET, Kota Kinabalu, Malaysia, 26–28 August 2024; pp. 518–523.
- Raghesh Kumar, I.P.; Varghese, R.; Nagesh, N.; Sasidharan, V.; Philip, A.O. Study on Incident Detection and Management of Vehicular OBD Data on Blockchain. In Proceedings of the 6th International Conference on Computing Methodologies and Communication (ICCMC), Erode, India, 29–31 March 2022; pp. 145–152.
- Gace, I.; Vdovic, H.; Babic, J.; Podobnik, V. An eco-aware framework for AI-based analysis of contextually enriched automotive trip data. Energy Sources Part A 2023, 45, 1274–1292.
- Lee, C.-H.; Yang, H.-C. A Privacy-Preserving Learning Method for Analyzing HEV Driver’s Driving Behaviors. IEEE Access 2023, 11, 76816–76826.
- Song, Q.; Li, W.; Pan, Y.; Leng, A.; Zhu, R. Driving Style Recognition of Leading Vehicles based on Semi-supervised Gaussian Mixture Model. J. Phys. Conf. Ser. 2023, 2456, 012019.
- Rimpas, D.; Papadakis, A. Driving Events Identification and Operational Parameters Correlation based on the Analysis of OBD-II Timeseries. In Proceedings of the 8th International Conference on Vehicle Technology and Intelligent Transport Systems—Volume 1: VEHITS; SciTePress: Setubal, Portugal, 2022; pp. 257–264, ISBN 978-989-758-573-9.
- Divyasri, C.; Neelima, N.; Smitha, T. Machine Learning for Road Safety Enhancement Through In-Vehicle Sensor Analysis. In Proceedings of the 2024 International Conference on Electronics, Electrical Engineering and Information Communication Technology (ICEEICT), Trichirappalli, India, 24–26 July 2024; pp. 1–6.
- Kwak, B.-I.; Woo, J.; Kim, H.K. Know Your Master: Driver Profiling-Based Anti-Theft Method. In Proceedings of the 2016 International Conference on Privacy, Security and Trust (PST), Auckland, New Zealand, 12–14 December 2016; pp. 211–218.
- Kumar, R.; Jain, A. Driving Behaviour Analysis and Classification by Vehicle OBD Data Using Machine Learning. J. Supercomput. 2023, 79, 18800–18819.
- Govers, W.; Yurtman, A.; Aslandere, T.; Eikelenberg, N.; Meert, W.; Davis, J. Time-Shifted Transformers for Driver Identification Using Vehicle Data. IEEE Trans. Intell. Transp. Syst. 2024, 25, 3767–3776.
- Singh, A.; Tiwari, V.; Srinivasa, K.G. Driver Profiling and Identification Based on Time Series Analysis. Int. J. Intell. Transp. Syst. Res. 2024, 22, 363–373.
- Khan, M.; Ali, M.; Haque, F.; Habib, M. A Machine Learning Approach for Driver Identification. Indones. J. Electr. Eng. Comput. Sci. 2023, 30, 276–288.
- Manderna, A.; Kumar, S. Effective Long Short-Term Memory Based Driver Identification in ITS. In Proceedings of the 2022 International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICT), Ghaziabad, India, 20–22 July 2022; pp. 464–469.
- Andrade, P.; Silva, M.; Medeiros, M.; Costa, D.G.; Silva, I. TEDA-RLS: A TinyML Incremental Learning Approach for Outlier Detection and Correction. IEEE Sens. J. 2024, 24, 38165–38173.
- Dini, P.; Saponara, S. Design and Experimental Assessment of Real-Time Anomaly Detection Techniques for Automotive Cybersecurity. Sensors 2023, 23, 9231.
- Malik, A.W.; Anwar, Z.; Rahman, A.U. A Novel Framework for Studying the Business Impact of Ransomware on Connected Vehicles. IEEE Internet Things J. 2023, 10, 8348–8356.
- Aloqaily, A.; Abdallah, E.E.; AbuZaid, H.; Abdallah, A.E.; Al-hassan, M. Supervised Machine Learning for Real-Time Intrusion Attack Detection in Connected and Autonomous Vehicles: A Security Paradigm Shift. Informatics 2025, 12, 4.
- El-Gayar, M.M.; Alrslani, F.A.F.; El-Sappagh, S. Smart Collaborative Intrusion Detection System for Securing Vehicular Networks Using Ensemble Machine Learning Model. Information 2024, 15, 583.
- Sabapathy, A.; Biswas, A. Road surface classification using accelerometer and speed data: Evaluation of a convolutional neural network model. Neural Comput. Appl. 2023, 35, 14183–14194.
- Walker, D.; Entine, L.; Kummer, S. Pavement Surface Evaluation and Rating. In Asphalt PASER Manual; Wisconsin Transportation Information Center: Madison, WI, USA, 2002.
- Xiao, Z.; Chen, Y.; Alazab, M.; Chen, H. Trajectory Data Acquisition via Private Car Positioning Based on Tightly-coupled GPS/OBD Integration in Urban Environments. IEEE Trans. Intell. Transp. Syst. 2022, 23, 9680–9691.
- Flores Fernández, A.; Sánchez Morales, E.; Botsch, M.; Facchi, C.; García Higuera, A. Generation of Correction Data for Autonomous Driving by Means of Machine Learning and On-Board Diagnostics. Sensors 2023, 23, 159.
- Wang, X.; Xu, Y.; Xu, Y.; Wang, Z.; Wu, Y. Intrusion Detection System for In-Vehicle CAN-FD Bus ID Based on GAN Model. IEEE Access 2024, 12, 82402–82412.
- European Union. Directive 2002/58/EC of the European Parliament and of the Council of 12 July 2002 Concerning the Processing of Personal Data and the Protection of Privacy in the Electronic Communications Sector (Directive on Privacy and Electronic Communications). Off. J. Eur. Communities 2002, L201, 37–47.
- Zhao, C.; Zhao, S.; Zhao, M.; Chen, Z.; Gao, C.-Z.; Li, H.; Tan, Y.-A. Secure Multi-Party Computation: Theory, practice and applications. Inf. Sci. 2019, 476, 357–372.
- Zhao, P.; Zhang, G.; Wan, S.; Liu, G.; Umer, T. A survey of local differential privacy for securing internet of vehicles. J. Supercomput. 2020, 76, 8391–8412.
- Alqubaysi, T.; Asmari, A.F.A.; Alanazi, F.; Almutairi, A.; Armghan, A. Federated Learning-Based Predictive Traffic Management Using a Contained Privacy-Preserving Scheme for Autonomous Vehicles. Sensors 2025, 25, 1116.
- Li, Y.; Liu, W.; Liu, Q.; Zheng, X.; Sun, K.; Huang, C. Complying with ISO 26262 and ISO/SAE 21434: A Safety and Security Co-Analysis Method for Intelligent Connected Vehicle. Sensors 2024, 24, 1848.
- Maaloul, S.; Aniss, H.; Mendiboure, L.; Berbineau, M. Performance Analysis of Existing ITS Technologies: Evaluation and Coexistence. Sensors 2022, 22, 9570.
- Kishawy, E.; Abd El-Hafez, M.T.; Yousri, R.; Mohamed, A.; Tolba, M.F. Federated learning system on autonomous vehicles for lane segmentation. Sci. Rep. 2024, 14, 25029.
- Piromalis, D.; Kantaros, A. Digital Twins in the Automotive Industry: The Road toward Physical-Digital Convergence. Appl. Syst. Innov. 2022, 5, 65.
- Gaspar, D.; Silva, P.; Silva, C. Explainable AI for Intrusion Detection Systems: LIME and SHAP Applicability on Multi-Layer Perceptron. IEEE Access 2024, 12, 30164–30175.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Michailidis, E.T.; Panagiotopoulou, A.; Papadakis, A. A Review of OBD-II-Based Machine Learning Applications for Sustainable, Efficient, Secure, and Safe Vehicle Driving. Sensors 2025, 25, 4057. https://doi.org/10.3390/s25134057