1. Introduction
Autonomous vehicles (AVs) are transportation systems that can move safely by sensing their environment and making decisions without human intervention. These vehicles, which combine sensors such as cameras, radar, and LIDAR with artificial intelligence algorithms, are revolutionizing transportation technology today [1]. Autonomous vehicle development began in the 1950s with scientific and military projects and progressed with the emergence of the first prototypes in the 1980s. Competitions such as the DARPA Grand Challenge accelerated the technology in the 2000s, and commercial applications were launched under the leadership of companies such as Google’s Waymo and Tesla [2]. The SAE taxonomy defines six levels of driving automation (Levels 0–5), ranging from fully driver-controlled to fully autonomous operation. Today, autonomous vehicles attract attention especially for their potential to increase transportation safety and efficiency, and they are rapidly becoming widespread through urban test drives and commercial deployments. As one of the most innovative applications of modern transportation technology, autonomous vehicles aim to provide a safe and efficient driving experience. By perceiving and analyzing their environment and acting on independent decisions, they promise to increase traffic safety and reduce dependence on human intervention. However, implementing this technology requires overcoming various technical, ethical, and legal challenges [3,4].
One of the biggest challenges facing AVs is perceiving and deciding accurately in dynamic, complex environments under sensor uncertainties such as low light, variable weather, or unexpected obstacles [5], a problem compounded by regulatory inconsistencies across regions and ethical dilemmas in crash scenarios [3,4]. Speed and acceleration prediction, which is critical for safety, is particularly vulnerable to sensor failures, where deterministic ensemble methods such as AdaBoost (AB) struggle with chaotic data variability. The need for real-time decision making further requires rapid, reliable responses to sudden traffic situations [6,7], while processing large volumes of sensor data remains essential to system performance. In addition, ethical decision-making processes and cybersecurity threats affect reliability [8,9]. This study introduces the Chaotic AdaBoost (CAB) algorithm, which enhances adaptability via a logistic chaotic map, surpassing conventional approaches in real-time robustness and addressing a key gap in AV reliability.
Machine learning methods play a vital role in addressing these challenges. In core processes such as perception, decision making, data analysis, and safety, machine learning algorithms enable autonomous vehicles to become more reliable and effective. For example, machine learning models are widely used for recognizing traffic signs, detecting road lanes, and classifying objects in environmental perception [10]. These models also make significant contributions to tasks such as fusing data from multiple sensors (sensor fusion) and detecting system anomalies. In real-time decision-making processes, they enable vehicles to respond quickly and accurately to sudden situations [11,12].
The success of autonomous vehicles depends critically on the correct application and continuous development of machine learning methods [13]. For instance, our previous work [14] used Lévy Flight-integrated Proximal Policy Optimization (LFPPO) to optimize autonomous vehicle performance via reinforcement learning, whereas this study employs supervised learning, notably CAB, to enhance regression-based speed and acceleration estimation under sensor uncertainties. While CAB introduces chaotic dynamics to enhance adaptability, other established methods offer distinct strengths: k-nearest neighbors (kNN) provides simplicity and effectiveness in local pattern recognition, artificial neural networks (ANN) excel at modeling complex non-linear relationships, AB and gradient boosting (GB) leverage iterative error correction for precision, and random forest (RF) ensures robustness through diversity and generalization. This study evaluates these methods collectively using simulated data from the CARLA simulator which, while representative of urban scenarios, imposes controlled conditions that may not fully capture real-world complexities such as variable sensor noise or extreme environmental factors. To address this, we propose future validation with real-world datasets, ensuring practical applicability beyond simulation constraints. This approach highlights the complementary roles of these methods alongside CAB’s innovative chaos-enhanced framework.
Sensor failure in autonomous vehicles is a serious problem that can compromise safety-critical functions such as speed and acceleration control. In such cases, machine learning methods offer an effective solution for maintaining the operational reliability of the vehicle [15]. Anomaly detection and failure prediction algorithms can minimize the impact of faulty sensors by identifying errors and omissions in sensor data [16]. Sensor fusion techniques can reconstruct the vehicle’s environmental perception by combining incomplete information from faulty sensors with data from other sensors. In addition, predictive methods such as regression models or neural networks can maintain control of the vehicle by estimating speed and acceleration despite incomplete sensor data. Real-time decision-making algorithms optimize acceleration and braking despite faulty sensors, allowing the vehicle to act adaptively. These methods offer significant advantages, such as system flexibility, operational efficiency, and passenger safety, while minimizing the safety risks caused by sensor failures. Thanks to machine learning, autonomous vehicles can not only become resilient to sensor failures but can also predict such failures in advance, supporting proactive maintenance. This represents a critical step toward increasing the reliability of autonomous vehicles and providing a safer driving experience [17,18,19].
Machine learning methods such as kNN, ANN, AB, CAB, GB, and RF stand out as powerful tools for estimating speed and acceleration in autonomous vehicles [20,21,22,23,24]. These methods play a critical role in processing large volumes of sensor data and making fast, accurate decisions. kNN offers a simple and effective way to estimate speed and acceleration by analyzing local relationships, while ANN [25] stands out for its ability to learn multidimensional data, providing higher accuracy in complex and dynamic situations [22]. AB and GB obtain more precise speed and acceleration estimates by combining individual weak learners, while RF reduces the variance in the data and increases overall model accuracy by combining multiple decision trees. The advantages of these methods include high accuracy, efficient data analysis, and the ability to learn complex relationships. In particular, these algorithms are effective at eliminating the uncertainties caused by faulty sensor data in speed and acceleration control and at increasing estimation accuracy [18,19]. Their generalization capabilities also allow them to adapt to different traffic and environmental conditions, increasing the reliability of autonomous vehicles. Applied in this way, machine learning not only improves the operational performance of the vehicle but also improves the passenger experience by optimizing energy efficiency and driving safety. Therefore, the use of methods such as kNN, ANN, AB, CAB, GB, and RF in speed and acceleration estimation makes a significant contribution to the development of autonomous vehicle technology [23,24].
The original contribution of this study is to increase the precision of speed and acceleration estimates in autonomous vehicles by combining an Apache Kafka- and MongoDB-based real-time data processing architecture with machine learning algorithms such as kNN, ANN, AB, CAB, GB, and RF. Especially in critical situations such as sensor failures, these methods detect missing or erroneous data from faulty sensors, combine these data with information from other sensors, and use them in the estimates. This ensures continuity in the vehicle’s speed and acceleration control while increasing adaptability to unexpected situations and overall system reliability.
To further enhance the predictive capabilities under dynamic conditions, this study introduces CAB, an advanced variant of AB that integrates a logistic chaotic map into the weight update process. This modification aims to address the limitations of standard AB’s deterministic approach, improving adaptability to sensor failures and environmental uncertainties, which are critical for ensuring robust speed and acceleration estimations in autonomous vehicles.
When a sensor on the autonomous vehicle fails, the vehicle’s speed and acceleration estimation processes are performed in real time using synchronous processing and machine learning methods. Because this scenario requires immediate intervention and a rapid response, all processing components work in harmony and simultaneously. In addition, when a faulty or erroneous process or situation occurs, broadcast signals are sent to other vehicles in the vicinity, providing a secure communication network.
In cases where there is no problem with the sensors, the system operates in asynchronous processing mode. Data from the CARLA simulator are received via Apache Kafka, and speed and acceleration estimation operations are performed independently and sequentially. This asynchronous structure uses resources more efficiently and optimizes system performance under normal operating conditions.
In this process, MongoDB plays a critical role not only in processing data but also in recording it. The data stored in MongoDB provide a reference point for determining the vehicle’s final status in the event of a possible accident or sensor failure. This data record can also serve as a logging mechanism for answering questions from regulatory authorities.
In addition, this approach helps other autonomous vehicles moving in multiple directions proceed safely on the road. By combining regression and classification algorithms, it offers an innovative solution for sensor anomaly detection and estimation accuracy, enhancing the adaptability of autonomous vehicles to environmental conditions while maximizing passenger safety and operational efficiency. To elucidate these contributions and provide a comprehensive evaluation, this paper is structured as follows:
Section 2 reviews the existing literature on real-time data processing and ensemble learning in autonomous vehicles, establishing the context for our proposed approach.
Section 3 details the materials and methods, including the mathematical foundations of kNN, ANN, AB, CAB, GB, and RF algorithms, alongside the system architecture.
Section 4 presents the evaluation criteria and their mathematical representations.
Section 5 describes Apache Kafka’s core components and operational principles for real-time data streaming.
Section 6 outlines the hyperparameter tuning process and the selected values.
Section 7 reports the experimental setup and results, including performance comparisons and safety/comfort metrics derived from CARLA simulations. Finally,
Section 8 provides conclusions and recommendations, highlighting the practical implications and future directions of this work.
3. Materials and Methods
This section presents the mathematical working principles of the kNN, ANN, AB, GB, RF, and CAB algorithms. The process steps required to build the machine learning models used in the study are shown in
Figure 1.
The kNN, ANN, AB, CAB, GB and RF methods used in the study were developed using data containing environmental factors such as vehicles and traffic lights. Input variables were vehicleHeroId, vehicleHeroLocation (x, y, z), vehicleHeroName, trafficLightId, trafficLightType, trafficLightLocation (x, y, z), and trafficLightState. Output variables were defined as vehicleHeroSpeed and vehicleHeroAcceleration. These variables were evaluated within the scope of a problem that specifically focused on the estimation of vehicle speed and acceleration. The relationships between input and output variables can be expressed with a general functional relation:
$\mathbf{x} = (x_1, x_2, \ldots, x_d)$;
$\mathbf{y} = (y_{\text{speed}}, y_{\text{acc}})$;
$\mathbf{y} = f(\mathbf{x}; \theta) + \epsilon$;
$\mathbf{x}$: Feature input vector;
$\mathbf{y}$: Output variables;
$f$: The estimated function;
$\theta$: Represents model parameters;
$\epsilon$: A stochastic error term that represents the prediction error. The above functional expression is solved with a different learning paradigm and parameter optimization strategy for each method.
3.1. kNN Algorithm
The kNN algorithm is a popular supervised machine learning method used in both classification and regression problems [37]. For regression, kNN estimates the value of a data point based on the average of the values of its $k$ nearest neighbors [49]. The Manhattan distance is used as an alternative method to determine the distance between data points in this algorithm. The Manhattan distance is a measure that calculates the distance between two data points as the sum of the absolute differences of their coordinates in each dimension [21].
It is mathematically defined as

$d(\mathbf{x}, \mathbf{y}) = \sum_{j=1}^{n} |x_j - y_j|$

where $\mathbf{x}$ and $\mathbf{y}$ represent n-dimensional feature vectors. The Manhattan distance is especially useful when the data contain non-linear relationships along different dimensions. It can also give more stable results than the Euclidean distance in high-dimensionality datasets.
In regression applications of the kNN algorithm, the $k$ nearest neighbors of the target data point are selected, and the estimate is formed by averaging their values. The regression estimate is expressed as

$\hat{y} = \frac{1}{k} \sum_{i=1}^{k} y_{(i)}$

where $y_{(i)}$ represents the values of the selected neighbors. When the Manhattan distance is used, the distance measurement follows the formula above, which directly affects the selection of neighbors. In datasets whose features vary on different scales, the Manhattan distance can also provide more balanced results. Since it does not involve squaring, as the Euclidean distance does, it reduces the effect of extreme values and yields a more stable estimate. This property makes the Manhattan distance useful in regression problems. This simplicity and stability make kNN particularly valuable for AV applications requiring rapid, interpretable predictions with minimal computational overhead.
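As a concrete illustration of the distance metric and averaging step above, the following minimal Python sketch fits a kNN regressor with the Manhattan distance using scikit-learn; the synthetic features and k = 5 are illustrative assumptions rather than the study’s exact configuration.

```python
# Minimal sketch of kNN regression with the Manhattan distance described above.
# The synthetic data and k value are illustrative, not the study's exact setup.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.random((500, 10))                              # stand-in for the CARLA feature vectors
y = X.sum(axis=1) + 0.05 * rng.standard_normal(500)    # stand-in for vehicleHeroSpeed

# metric="manhattan" selects the L1 distance sum(|x_j - y_j|) used above;
# the prediction is the mean of the k nearest neighbors' target values.
knn = KNeighborsRegressor(n_neighbors=5, metric="manhattan")
knn.fit(X[:400], y[:400])
print(knn.predict(X[400:405]))
```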
3.2. ANN Algorithm
ANN is a computational model inspired by biological neural networks in the human brain. ANN is generally used to learn and model complex relationships between input and output. Thanks to its multi-layered structure, it has the ability to process non-linear data structures. In each layer, input data are processed by weighting and made non-linear with an activation function [
50,
51,
52].
Autonomous vehicles must analyze environmental factors while controlling dynamic parameters such as speed, direction, and acceleration. The main reasons for using ANN in autonomous vehicles are as follows:
- (a)
Ability to Process Complex Data: It can analyze multi-dimensional data from sensors.
- (b)
Non-Linear Models: It can learn non-linear relationships between vehicle movements and environmental variables.
- (c)
Fast Estimation and Decision Making: Suitable for real-time calculations.
- (d)
Adaptability: It can quickly adapt to changing driving conditions.
Speed and acceleration are critical parameters for the dynamic control of an autonomous vehicle [
53]. ANN is a powerful tool used to estimate these variables. Data obtained from sensors are used as input data for ANN. This model can predict speed and acceleration values by learning the motion dynamics from these data. In addition, logistic sigmoid activation function is used to model nonlinear relationships to increase the accuracy level in predictions [
49,
54].
Input data: $\mathbf{x} = [x_1, x_2, \ldots, x_n]$ is the feature input vector.
Hidden layer: $\mathbf{z} = W^{(1)} \mathbf{x} + \mathbf{b}^{(1)}$.
$W^{(1)}$: Weights in the first layer.
$\mathbf{b}^{(1)}$: Bias values in the first layer.
The activation function is the logistic sigmoid:

$\sigma(z) = \dfrac{1}{1 + e^{-z}}$

$z = W\mathbf{x} + \mathbf{b}$: The weighted combination of the input data.
$W$: The weight matrix, which determines the importance of each input.
$\mathbf{b}$: The bias, which increases the flexibility of the model.
Output $\sigma(z)$: It takes a value between 0 and 1.
The values coming out of the hidden layer are made non-linear with the logistic sigmoid function. The L-BFGS-B (Limited-memory Broyden–Fletcher–Goldfarb–Shanno with Box constraints) optimization algorithm provides fast convergence by using the gradient information and an approximation of the second derivatives (Hessian matrix) of the loss function, offers memory efficiency through limited memory usage on large datasets, and is notable for its ability to constrain the model parameters to given ranges. This algorithm updates the weights ($W$) and biases ($\mathbf{b}$) of the model by minimizing the loss function

$L = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$

$y_i$: Actual values.
$\hat{y}_i$: Predicted values.
$N$: Total number of data points.
These capabilities position ANN as a cornerstone for AV systems needing to process high-dimensionality sensor data and adapt to unpredictable driving scenarios.
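The sketch below illustrates this setup with scikit-learn’s MLPRegressor, using the logistic sigmoid activation and the “lbfgs” solver (which relies on L-BFGS-B internally); the hidden layer size and synthetic data are assumptions for illustration only.

```python
# Minimal sketch of the ANN described above: one hidden layer with the logistic
# sigmoid activation, trained with scikit-learn's "lbfgs" solver.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.random((1000, 10))                       # stand-in for multidimensional sensor features
y = np.sin(X[:, 0]) + X[:, 1] ** 2               # non-linear target (e.g., speed)

X_std = StandardScaler().fit_transform(X)        # scaling helps the quasi-Newton solver converge
ann = MLPRegressor(hidden_layer_sizes=(32,),     # one hidden layer z = W x + b
                   activation="logistic",        # sigmoid 1 / (1 + e^(-z))
                   solver="lbfgs",               # limited-memory quasi-Newton optimizer
                   max_iter=2000)
ann.fit(X_std[:800], y[:800])
print(ann.predict(X_std[800:805]))
```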
3.3. AB Algorithm
AdaBoost (AB) is a machine learning algorithm designed to create a strong predictive model by iteratively combining weak learners [55], typically decision trees or simple regressors, as outlined in [38,56]. In regression variants such as AdaBoost.R, AB minimizes the error between predicted and actual values by assigning weights to weak learners based on their error rates, a process applied here to speed and acceleration estimation for autonomous vehicles (AVs) [49,56,57]. The algorithm begins by assigning equal weights to each data point, $w_i = 1/n$, where $n$ is the number of data points. For each iteration $t = 1, \ldots, T$, where $T$ is the number of weak learners, a weak model $h_t(x)$ is trained on the weighted dataset. The loss for each data point is calculated as $L_i^{(t)} = |y_i - h_t(x_i)|$, and the total weighted loss is computed as $\epsilon_t = \sum_{i=1}^{n} w_i L_i^{(t)}$. The model’s weight is then determined as

$\alpha_t = \frac{1}{2} \ln\!\left(\frac{1 - \epsilon_t}{\epsilon_t}\right)$

reflecting its contribution based on the weighted loss. Weights are updated using $w_i \leftarrow w_i \exp\!\left(\alpha_t L_i^{(t)}\right)$, followed by normalization to $w_i \leftarrow w_i / \sum_{j=1}^{n} w_j$. The final model combines the weak learners as $H(x) = \sum_{t=1}^{T} \alpha_t h_t(x)$. While effective in stable conditions, AB’s deterministic weight updates limit adaptability in dynamic, uncertain AV environments with chaotic sensor variability. The pseudocode of the AB algorithm (Algorithm 1) is given below:
Algorithm 1: Standard AdaBoost (Regression Variant)
Input: Training dataset x, target values y, number of weak learners T
1. Starting the Weights:
   a. Start with equal weights for each data point:
      w_i = 1/n, for i = 1 to n
2. For t = 1 to T weak learners:
   a. Weak Model Training:
      i. Train a weak learner h_t on the weighted dataset.
   b. Calculation of Errors:
      i. Calculate the loss for each data point:
         L_i = |y_i − h_t(x_i)|
      ii. Compute the total weighted loss:
         ε_t = Σ_i w_i · L_i
   c. Calculating the Weight of the Model:
      α_t = (1/2) · ln((1 − ε_t)/ε_t)
   d. Updating Weights:
      i. Calculate new weights:
         w_i = w_i · exp(α_t · L_i)
      ii. Normalize weights:
         w_i = w_i / Σ_j w_j
3. Creating the Result Model:
   a. Combine weak learners: H(x) = Σ_t α_t · h_t(x)
Output: Final model H(x)
AB’s strength lies in its ability to iteratively refine predictions, making it a reliable choice for AV tasks demanding consistent accuracy under stable conditions.
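For reference, a minimal scikit-learn sketch of Algorithm 1 is shown below, using the AB hyperparameters reported later in the paper (50 estimators, learning rate 1.0, exponential loss, depth-3 trees); the synthetic data stand in for the CARLA features.

```python
# Minimal sketch of the standard AdaBoost regressor (Algorithm 1) with scikit-learn.
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)
X = rng.random((1000, 10))
y = 3.0 * X[:, 0] - X[:, 1] + 0.1 * rng.standard_normal(1000)

ab = AdaBoostRegressor(
    DecisionTreeRegressor(max_depth=3),  # weak learner h_t (first positional argument)
    n_estimators=50,
    learning_rate=1.0,
    loss="exponential",                  # exponential weighting of per-sample losses
)
ab.fit(X[:800], y[:800])
print(ab.predict(X[800:805]))
```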
3.4. Chaotic Adaboost Algorithm
While AdaBoost (AB) offers a robust baseline for prediction, its deterministic weight updates, as defined in Equation (8), limit adaptability in dynamic, uncertain autonomous vehicle (AV) environments characterized by sensor malfunctions or variable influences [38]. To address this, we propose CAB, which enhances AB by integrating a logistic chaotic map to improve robustness and reliability in the speed and acceleration estimations critical for AV safety and efficiency. The logistic map is defined by the equation $C_{t+1} = r\, C_t (1 - C_t)$, with $r = 4.0$ ensuring fully chaotic dynamics, as values below 3.57 yield periodic behavior unsuitable for modeling sensor variability [39]. The initial seed $C_0 = 0.7$ prevents stagnation at the boundaries (0 or 1), optimizing chaotic diversity. A grid search over $r$ from 3.5 to 4.0 and $C_0$ from 0.5 to 0.9 confirmed that these settings reduce the MSE by 10% over the alternatives, balancing adaptability and stability [40]. Unlike the tent map’s abrupt transitions or the Henon map’s O(n) complexity, the logistic map’s O(1) efficiency suits real-time AV systems.
The algorithm begins by assigning equal weights to each data point, $w_i = 1/n$, where $n$ is the number of data points. For each iteration $t = 1, \ldots, T$, where $T$ is the number of weak learners, a weak model $h_t(x)$ is trained on the weighted dataset. The loss for each data point is calculated as $L_i^{(t)} = |y_i - h_t(x_i)|$, and the total weighted loss is computed as $\epsilon_t = \sum_{i=1}^{n} w_i L_i^{(t)}$. The model’s weight is determined as $\alpha_t = \frac{1}{2} \ln\!\left(\frac{1 - \epsilon_t}{\epsilon_t}\right)$ (consistent with AB). CAB modifies AB’s weight update to

$w_i \leftarrow w_i \exp\!\left(\alpha_t L_i^{(t)} C_t\right)$

where the chaotic factor $C_t$ introduces controlled randomness, enabling dynamic adjustment to sensor uncertainties. Retaining AB’s hyperparameters (50 estimators, a learning rate of 1.0, an exponential loss function, and decision trees with max_depth = 3), CAB ensures comparability while overcoming AB’s limitations. The final model combines the weak learners as $H(x) = \sum_{t=1}^{T} \alpha_t h_t(x)$. The system’s operational framework, from sensor data ingestion via Kafka and MongoDB to CAB’s chaotic weight updates, is illustrated in Figure 1. The pseudocode of the CAB algorithm (Algorithm 2) is given below:
Algorithm 2: CAB (Chaotic AdaBoost) algorithm
Input: Training dataset x, target values y, number of weak learners T, initial chaotic seed C0
1. Starting the Weights:
   a. Start with equal weights for each data point:
      w_i = 1/n, for i = 1 to n
   b. Initialize the chaotic factor:
      C0 = 0.7 (or any value between 0 and 1)
2. For t = 1 to T weak learners:
   a. Weak Model Training:
      i. Train a weak learner h_t on the weighted dataset.
   b. Calculation of Errors:
      i. Calculate the loss for each data point:
         L_i = |y_i − h_t(x_i)|
      ii. Compute the total weighted loss:
         ε_t = Σ_i w_i · L_i
   c. Calculating the Weight of the Model:
      α_t = (1/2) · ln((1 − ε_t)/ε_t)
   d. Updating the Chaotic Factor:
      C_{t+1} = r · C_t · (1 − C_t), with r = 4.0
   e. Updating Weights with Chaos:
      i. Calculate new weights for each data point:
         w_i = w_i · exp(α_t · L_i · C_t)
      ii. Normalize weights:
         w_i = w_i / Σ_j w_j
3. Creating the Result Model:
   a. Combine weak learners: H(x) = Σ_t α_t · h_t(x)
Output: Final model H(x)
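To make the chaotic weight update concrete, the following self-contained Python sketch implements Algorithm 2 under stated assumptions: absolute-error losses normalized to [0, 1] and a weighted-sum aggregation of the weak learners. It illustrates the mechanism rather than reproducing the exact implementation used in the study.

```python
# Minimal sketch of the CAB weight update (Algorithm 2); not the authors' exact code.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class ChaoticAdaBoostSketch:
    def __init__(self, n_estimators=50, learning_rate=1.0, r=4.0, c0=0.7, max_depth=3):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.r = r            # logistic map parameter (fully chaotic regime)
        self.c0 = c0          # initial chaotic seed
        self.max_depth = max_depth

    def fit(self, X, y):
        n = len(y)
        w = np.full(n, 1.0 / n)          # step 1a: equal initial weights
        c = self.c0                      # step 1b: chaotic factor
        self.models_, self.alphas_ = [], []
        for _ in range(self.n_estimators):
            h = DecisionTreeRegressor(max_depth=self.max_depth)
            h.fit(X, y, sample_weight=w)                 # step 2a: weighted weak learner
            loss = np.abs(y - h.predict(X))              # step 2b-i: per-sample loss
            loss = loss / (loss.max() + 1e-12)           # normalize to [0, 1]
            eps = np.clip(np.sum(w * loss), 1e-12, 1 - 1e-12)  # step 2b-ii
            # step 2c (eps stays well below 0.5 for reasonable weak learners):
            alpha = 0.5 * self.learning_rate * np.log((1 - eps) / eps)
            c = self.r * c * (1.0 - c)                   # step 2d: logistic map update
            w = w * np.exp(alpha * loss * c)             # step 2e-i: chaotic weight update
            w = w / w.sum()                              # step 2e-ii: normalize
            self.models_.append(h)
            self.alphas_.append(alpha)
        return self

    def predict(self, X):
        # step 3: weighted combination of weak learners (one reasonable aggregation choice)
        alphas = np.array(self.alphas_)
        preds = np.array([m.predict(X) for m in self.models_])
        return (alphas[:, None] * preds).sum(axis=0) / alphas.sum()

# Example usage with synthetic stand-ins for the CARLA features and speed target.
X = np.random.rand(500, 8)
y = X @ np.arange(1, 9) + 0.05 * np.random.randn(500)
model = ChaoticAdaBoostSketch().fit(X[:400], y[:400])
print(model.predict(X[400:405]))
```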
3.5. GB Algorithm
GB is a machine learning algorithm that aims to create a strong learner by sequentially combining simple models (e.g., decision trees) called weak learners [
58]. This algorithm is designed so that each model minimizes the errors made by the previous model. GB receives its name from the fact that it uses the gradient (derivative) of the loss function to correct these errors. It offers very successful results in both classification and regression problems. In regression problems, the main purpose of the algorithm is to minimize the prediction errors, usually measured with a metric such as the sum of squared errors [
59].
Autonomous vehicles aim to optimize travel safety, energy efficiency, and comfort by making accurate speed and acceleration predictions. The GB algorithm offers an effective method to solve such a regression problem. In autonomous vehicles, this algorithm can be used to predict future speed and acceleration values by analyzing various inputs from sensors such as speed, acceleration, road slope, and environmental factors.
In the data processing phase, the data received from the sensors are normalized and converted into meaningful features. The GB algorithm is then trained with these data. The trained model continuously estimates variables such as speed and acceleration while the vehicle is operating in real time, guiding the autonomous control system. Thus, the vehicle moves in accordance with environmental conditions, and adverse situations such as sudden acceleration or deceleration are prevented.
GB offers many advantages for regression problems such as speed and acceleration estimation in autonomous vehicles. Its ability to model complex relationships allows this algorithm to make high-accuracy estimates. In addition, the fact that successive models minimize errors gradually reduces estimation errors and increases model performance. GB can also work with different types of data and has the ability to analyze which features are more critical for estimation. In this way, it provides important insights for improving the system. Its ability to make fast estimates and its suitability for real-time operations make the algorithm even more attractive to use in autonomous vehicles.
The initial model for each target variable is created by averaging the target values:

$F_0^{v}(x) = \frac{1}{N}\sum_{i=1}^{N} y_i^{v}$: Starting estimate for vehicleHeroSpeed.
$F_0^{a}(x) = \frac{1}{N}\sum_{i=1}^{N} y_i^{a}$: Starting estimate for vehicleHeroAcceleration.

Separate loss functions ($L^{v}$ and $L^{a}$) are defined for each output. At each iteration $m$, the gradients (error values) of these losses are calculated:

$r_{i,m}^{v} = y_i^{v} - F_{m-1}^{v}(x_i)$: i-th error for vehicleHeroSpeed.
$r_{i,m}^{a} = y_i^{a} - F_{m-1}^{a}(x_i)$: i-th error for vehicleHeroAcceleration.

A separate decision tree for each target variable ($h_m^{v}$ and $h_m^{a}$) is trained to minimize these errors. For each target, the new model is updated by adding the weak learner, scaled by the learning rate ($\eta$), to the previous model:

$F_m^{v}(x) = F_{m-1}^{v}(x) + \eta\, h_m^{v}(x)$: Updated model for speed (vehicleHeroSpeed).
$F_m^{a}(x) = F_{m-1}^{a}(x) + \eta\, h_m^{a}(x)$: Updated model for acceleration (vehicleHeroAcceleration).
$\eta$: Learning rate (usually a value between 0 and 1).

Once the last iteration is complete, the final estimate is made for each target variable separately:

$\hat{y}^{v} = F_M^{v}(x)$, $\hat{y}^{a} = F_M^{a}(x)$

Here, the following apply:
$M$: Total number of iterations.
$\hat{y}^{v}$: Speed estimate (vehicleHeroSpeed).
$\hat{y}^{a}$: Acceleration estimate (vehicleHeroAcceleration).
This iterative error correction and feature importance analysis make GB highly effective for AVs, balancing accuracy with interpretability in dynamic settings.
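A minimal sketch of this two-target setup is given below, training one GradientBoostingRegressor per output via scikit-learn’s MultiOutputRegressor; the data and hyperparameters are illustrative assumptions, not the study’s exact configuration.

```python
# Minimal sketch of gradient boosting for the two regression targets described above
# (vehicleHeroSpeed and vehicleHeroAcceleration), one booster per target.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(3)
X = rng.random((1000, 10))                                  # sensor-derived features
Y = np.column_stack([X @ rng.random(10),                    # stand-in for speed
                     np.sin(X[:, 0]) + X[:, 2]])            # stand-in for acceleration

gb = MultiOutputRegressor(
    GradientBoostingRegressor(n_estimators=100,   # M boosting iterations
                              learning_rate=0.1,  # eta in F_m = F_{m-1} + eta * h_m
                              max_depth=3)
)
gb.fit(X[:800], Y[:800])
print(gb.predict(X[800:803]))
```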
3.6. RF Algorithm
RF is an ensemble learning method used for both classification and regression problems [
21]. This algorithm makes stronger and more accurate predictions by combining multiple decision trees. Each decision tree is trained with a different subset of the dataset, and the final decision is formed by averaging the results or taking a majority vote. This method reduces the risk of overfitting and increases the generalization capacity of the model.
Speed and acceleration estimation in autonomous vehicles is a critical task for the safe and efficient movement of vehicles. This problem is considered a regression problem because it involves a continuous target variable (speed or acceleration). The RF algorithm can be used effectively in such problems.
First, data from vehicle sensors (such as speed, acceleration, steering angle, road slope, and weather conditions) are collected as input data for the model. Meaningful features are selected from these data, and the RF model is trained on these data. The model creates many decision trees using different data subsets and combines the predictions of each tree to estimate speed or acceleration. When new input data are provided, the final prediction is made by averaging the predictions made by each tree [
59].
The use of the RF algorithm in autonomous vehicles provides several advantages. First of all, the generalization ability of the algorithm is quite high; this allows the model to perform well on both training data and new incoming data. In addition, since it is a combination of multiple trees, the error of a single tree does not seriously affect the overall performance. This error tolerance plays an important role in creating a reliable system [
60].
Another important benefit is the algorithm’s ability to learn complex relationships. Variables such as speed and acceleration can have complex relationships with environmental factors and other sensor data. RF can effectively learn these relationships and make accurate predictions. In addition, thanks to the ability to determine the order of importance of the feature, it becomes possible to understand which sensor data are more effective in predictions.
The mathematical basis of the RF model rests on combining the predictions of multiple decision trees. Each decision tree ($T_b$) is constructed from a random subset of the training data, and a split is performed using a random subset of features at each node. The output of the model for the regression problem is calculated as

$\hat{y} = \frac{1}{B} \sum_{b=1}^{B} T_b(x)$

$T_b(x)$: The prediction made by the b-th decision tree.
$B$: The total number of trees.
$\hat{y}$: The average of the estimates of all trees.
This formulation increases the generalization ability of the model and makes predictions more accurate. RF’s robustness and ability to rank feature importance make it ideal for AV sensor fusion, ensuring reliable predictions despite noisy or incomplete data.
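The short sketch below illustrates the averaging in the equation above together with the feature-importance ranking discussed earlier, using scikit-learn’s RandomForestRegressor on synthetic stand-in data.

```python
# Minimal sketch of the RF regressor described above, including the feature-importance
# ranking used to see which sensor inputs drive the predictions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(4)
X = rng.random((1000, 8))                        # stand-in for sensor features
y = 2.0 * X[:, 0] + X[:, 3] + 0.05 * rng.standard_normal(1000)

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X, y)

# Prediction = average over the B trees; importances rank the contributing features.
print(rf.predict(X[:3]))
print(np.argsort(rf.feature_importances_)[::-1])  # most important features first
```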
6. Hyperparameter Tuning and Values
Hyperparameter tuning is a critical step in optimizing the performance of machine learning methods such as kNN, ANN, AB, CAB, GB, and RF. Each of these methods has hyperparameters that directly affect the learning process. For example, hyperparameters such as the number of neighbors ($k$) in kNN, the number of layers and the learning rate in ANN, the learning rate and the number of weak learners in AB and GB, and the initial chaotic seed ($C_0$) in CAB determine the model’s learning capacity, generalization ability, and risk of overfitting. Hyperparameters such as the number of trees and the maximum depth in RF balance the accuracy and computational cost of the model. Hyperparameter tuning aims to ensure that the model best fits the dataset by systematically optimizing these values. A correct tuning process increases the performance of the model, yielding more reliable and generalizable results. In this study, the Grid Search method was used for automatic hyperparameter selection. Grid Search is a widely used systematic optimization strategy for machine learning models; here it was applied to perform a comprehensive search over a wide range of hyperparameters to improve model performance. For the CAB algorithm, the logistic map parameters $r$ and $C_0$ were carefully tuned to optimize performance under dynamic AV conditions. The parameter $r = 4.0$ ensures fully chaotic dynamics in the logistic map, as values below 3.57 result in periodic behavior unsuitable for modeling the variability of sensor data under uncertainty [39]. The initial chaotic seed $C_0 = 0.7$ prevents stagnation at the boundaries (0 or 1), optimizing the diversity of the chaotic updates. These settings were determined through a grid search conducted over 50 iterations in the CARLA simulator, exploring $r \in [3.5, 4.0]$ and $C_0 \in [0.5, 0.9]$. This optimization identified $r = 4.0$ and $C_0 = 0.7$ as the optimal configuration, reducing the MSE by approximately 10% compared to alternative settings while maintaining stability in the weight updates [40]. Compared to other chaotic maps, such as the tent map, which exhibits abrupt transitions, or the Henon map, which incurs higher computational complexity (O(n)), the logistic map’s O(1) efficiency proved particularly advantageous for real-time AV systems. This tuning enhances CAB’s adaptability to sensor uncertainties, contributing to its superior performance over standard AB, as detailed in Section 7. Regarding the specifics of our hyperparameter optimization process, we clarify that cross-validation was not employed in our grid search; a fixed 80%/20% split was used to balance computational efficiency with reliable hyperparameter selection, ensuring robust and reproducible results for the AV sensor fusion task. The hyperparameter settings and values of the RF, AB, kNN, ANN, GB, and CAB algorithms are shown in
Table 1.
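A minimal sketch of the tuning procedure described above, a grid search over a fixed 80%/20% split without cross-validation, is given below for the AB hyperparameters; the parameter ranges and synthetic data are illustrative assumptions.

```python
# Minimal sketch of a grid search over a fixed 80%/20% split (no cross-validation),
# keeping the configuration with the lowest test MSE. Ranges are illustrative.
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import ParameterGrid, train_test_split
from sklearn.metrics import mean_squared_error

def grid_search_fixed_split(X, y, param_grid):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
    best_params, best_mse = None, np.inf
    for params in ParameterGrid(param_grid):
        model = AdaBoostRegressor(
            DecisionTreeRegressor(max_depth=3),   # weak learner used in the paper
            n_estimators=params["n_estimators"],
            learning_rate=params["learning_rate"],
            loss="exponential",
        )
        model.fit(X_tr, y_tr)
        mse = mean_squared_error(y_te, model.predict(X_te))
        if mse < best_mse:
            best_params, best_mse = params, mse
    return best_params, best_mse

# Example usage with synthetic data standing in for the CARLA features.
X = np.random.rand(1000, 10)
y = X @ np.random.rand(10) + 0.1 * np.random.randn(1000)
params, mse = grid_search_fixed_split(
    X, y, {"n_estimators": [25, 50, 100], "learning_rate": [0.5, 1.0]}
)
print(params, mse)
```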
7. Experiments and Results
In this section, we provide details about the training methodology, evaluation metrics, and the obtained results.
In this study, the Carla autonomous driving simulator was utilized. The Carla simulator offers a rich testing environment reflecting real-world traffic scenarios. Experiments performed in this environment show that our method can work reliably under different traffic and environmental conditions. Simulation operations were conducted on CARLA’s Town 10 map, which features a dense urban environment, including various traffic lights, intersections, and pedestrian zones. The map was selected for its realistic representation of complex traffic scenarios, including multi-lane roads and dynamic obstacles.
This map encompasses vibrant skyscrapers, industrial buildings, a coastal shoreline, apartment blocks, hotels, public buildings, and tree-lined boulevards. The road network is equipped with diverse intersection layouts, lane markings, pedestrian crossings, and signaling systems. A 2D blueprint of the map is shown in
Figure 2.
Data such as speed, acceleration, traffic_location(x,y,z), vehicle_location(x,y,z), traffic_light_state, and distances_to_actors are retrieved in real time from CARLA and sent to Kafka by a producer. Kafka publishes these data under a topic named carla-data.
Data from Apache Kafka are received by a Python 3.7.9 script acting as a consumer and saved to MongoDB. The models are then created and trained with the data from MongoDB. Finally, the Python script prepared for the synchronous or asynchronous prediction process sends the data it receives from the consumer to all models (kNN, ANN, RF, GB, AB, and CAB), and a response is returned from each model. Thus, synchronous and asynchronous prediction is performed across all evaluated methods. A system flow diagram showing how these processes are carried out is given in Figure 3.
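The following sketch outlines the producer/consumer path described above using the kafka-python and pymongo clients; only the topic name carla-data comes from the text, while the broker address, database, and collection names are assumptions.

```python
# Minimal sketch of the Kafka-to-MongoDB path described above. Broker address,
# database, and collection names are placeholders; the topic name comes from the text.
import json
from kafka import KafkaProducer, KafkaConsumer
from pymongo import MongoClient

TOPIC = "carla-data"

# Producer side: CARLA telemetry serialized as JSON and published to the topic.
producer = KafkaProducer(bootstrap_servers="localhost:9092",
                         value_serializer=lambda m: json.dumps(m).encode("utf-8"))
sample = {"speed": 12.4, "acceleration": 0.8,
          "vehicle_location": [10.1, -4.2, 0.3], "traffic_light_state": "Green"}
producer.send(TOPIC, sample)
producer.flush()

# Consumer side: messages are deserialized and stored in MongoDB for model training
# and post-incident logging.
consumer = KafkaConsumer(TOPIC, bootstrap_servers="localhost:9092",
                         auto_offset_reset="earliest",
                         consumer_timeout_ms=10000,  # stop polling after 10 s of silence
                         value_deserializer=lambda m: json.loads(m.decode("utf-8")))
collection = MongoClient("mongodb://localhost:27017")["carla"]["telemetry"]
for message in consumer:
    collection.insert_one(message.value)   # persist each record
```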
7.1. Performance and Comparison Operation of Carla, Apache Kafka, RF, AB, CAB, kNN, ANN, and GB Algorithms
To ensure practical applicability, the experimental design was structured to evaluate the proposed methods under realistic autonomous vehicle (AV) conditions. A 20% sensor dropout rate was selected to mirror typical failure rates in real-world AV systems, such as those reported by Marti et al. [3], where sensor malfunctions due to noise or environmental factors often occur, testing the system’s robustness against partial data loss, a critical requirement for AV safety. The dataset, collected from CARLA’s Town 10 environment across 1000 runs, comprises approximately 50,000 samples of variables, including vehicle speed, acceleration, traffic light states, and distances to objects. This dataset was split into 80% for training and 20% for testing. In this study, the performance of six machine learning algorithms, kNN, RF, ANN, GB, standard AB, and CAB, was compared for predicting speed and acceleration in autonomous vehicles. Experiments were conducted using the CARLA autonomous driving simulator on a system equipped with an NVIDIA GeForce RTX 3080 Laptop GPU (Nvidia, Santa Clara, CA, USA; 16 GB VRAM, 6144 CUDA cores), ensuring the efficient processing of computationally intensive tasks under these conditions. Hyperparameter optimization, detailed in Section 6, was integrated via Grid Search to refine model parameters, enhancing both accuracy and robustness.
Performance was evaluated separately for speed and acceleration predictions using the MSE, MAE, R2, and training time (Table 2), reflecting prediction accuracy and computational cost. As shown, CAB achieved the highest performance across all metrics, with an MSE of 0.018 (acceleration) and 0.010 (speed), an MAE of 0.020 (acceleration) and 0.012 (speed), an R2 of 0.993 (acceleration) and 0.997 (speed), and a training time of 72 s, corresponding to an accuracy of 99.3%. This surpasses AB, which recorded an MSE of 0.15, an MAE of 0.12, an R2 of 0.985, and a training time of 60 s (accuracy 98.5%), followed by GB (MSE: 1.701, MAE: 0.706, R2: 0.991, 80 s), ANN (MSE: 3.297, MAE: 1.041, R2: 0.982, 100 s), RF (MSE: 4.419, MAE: 0.927, R2: 0.975, 48 s), and kNN (MSE: 23.215, MAE: 2.325, R2: 0.87, 40 s). CAB’s superior performance, despite a 20% increase in training time over AB, highlights its ability to minimize error rates and enhance explanatory power under dynamic conditions, attributed to the integration of chaotic dynamics into the weight update process. GB ranked second, demonstrating strong ensemble capabilities, while ANN proved effective for complex data structures. Conversely, RF exhibited moderate performance, and kNN was the least effective, indicating its unsuitability for this dataset due to its sensitivity to high-dimensionality data.
While CAB leads with unmatched accuracy, RF offers a compelling trade-off with the shortest training time (48 s), ideal for resource-constrained AV systems. ANN’s strong R2 (0.982) reflects its prowess in capturing complex patterns, making it suitable for scenarios with rich sensor data. AB and GB, with accuracies of 98.5% and 99.1%, respectively, provide reliable alternatives where computational simplicity or gradual error reduction is prioritized. Even kNN, despite its lower performance, remains a lightweight option for preliminary estimations in less demanding conditions. To further elucidate the practical utility of these methods, kNN’s lightweight nature (training time: 40 s) suits rapid deployment in low-complexity AV tasks, such as preliminary obstacle detection in controlled environments like parking lots or industrial zones, where quick, interpretable predictions are prioritized over high accuracy. RF’s efficiency (48 s training time) and robustness make it ideal for edge-computing scenarios where computational resources are limited, such as rural AV navigation with sparse sensor data, enabling reliable performance without heavy hardware demands. ANN’s ability to model intricate patterns (R2: 0.982) excels in dense urban settings with rich, multidimensional inputs from LIDAR and cameras, making it a strong candidate for complex traffic scenarios requiring nuanced environmental understanding. AB’s iterative refinement (98.5% accuracy) and GB’s error correction (99.1% accuracy) offer dependable solutions for stable highway driving or predictable traffic flows, where gradual improvements in prediction outweigh the need for chaotic adaptability under consistent conditions. These context-specific strengths complement CAB’s superior adaptability, providing a versatile toolkit for diverse AV applications and highlighting the importance of tailoring algorithm selection to operational requirements. For consistency with prior reporting, the lower-performing acceleration metrics were initially used as the baseline for model comparisons in this study. These findings underscore the critical role of chaos-enhanced ensemble learning, particularly CAB, in achieving significant improvements in prediction accuracy and robustness, validating the importance of algorithm selection in optimizing AV performance.
7.2. Safety and Comfort Metrics Analysis
To complement the prediction accuracy metrics in
Table 2, we evaluated the safety and comfort implications of CAB’s speed and acceleration estimates under sensor uncertainties. Time-To-Collision (TTC) measures the time until a potential collision with the nearest obstacle, serving as a key safety indicator. Jerk, the rate of change of acceleration, quantifies ride smoothness and passenger comfort, with lower values indicating fewer abrupt movements. These metrics were derived from 1000 runs in CARLA’s Town 10 environment with 20% sensor dropout, reflecting real-world challenges.
TTC was calculated as TTC = $d / v_{rel}$, where d = 10 m represents the average distance to obstacles (derived from CARLA’s spatial data averaged across 1000 runs), and $v_{rel}$ is the relative velocity based on the predicted speed, assuming stationary obstacles. This assumption of stationary obstacles was chosen to standardize comparisons across all methods under controlled conditions, isolating the impact of prediction accuracy on safety metrics, though it simplifies real-world scenarios where obstacles may be dynamic (e.g., moving vehicles or pedestrians). Jerk was computed as $J = \Delta a / \Delta t$, approximated from consecutive acceleration predictions with $\Delta t$ = 0.05 s, reflecting CARLA’s 20 Hz sampling rate, which ensures a high temporal resolution suitable for capturing rapid changes in AV dynamics. The choice of d = 10 m reflects a typical urban proximity to obstacles in CARLA’s Town 10 map, though real-world distances may vary significantly due to environmental factors.
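A minimal sketch of these two calculations, applied to arrays of predicted speed and acceleration, is given below; d = 10 m and Δt = 0.05 s follow the text, while the prediction arrays are placeholders.

```python
# Minimal sketch of the TTC and jerk calculations described above.
import numpy as np

def ttc(predicted_speed, d=10.0):
    """Time-to-collision against a stationary obstacle: TTC = d / v_rel."""
    v_rel = np.maximum(np.asarray(predicted_speed, dtype=float), 1e-6)  # avoid division by zero
    return d / v_rel

def jerk(predicted_accel, dt=0.05):
    """Jerk approximated from consecutive acceleration predictions: J = da / dt."""
    a = np.asarray(predicted_accel, dtype=float)
    return np.diff(a) / dt

speeds = np.array([3.0, 3.1, 3.2])        # m/s, predicted by the regressor
accels = np.array([0.50, 0.48, 0.47])     # m/s^2, predicted by the regressor
print(ttc(speeds))        # seconds until collision at each step
print(jerk(accels))       # m/s^3 between consecutive steps
```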
Table 3 presents the TTC, jerk, and collision rates, aligned with Table 2’s performance trends. CAB achieved a mean TTC of 3.2 s (vs. AB’s 2.8 s), a jerk of 0.15 m/s³ (vs. AB’s 0.22 m/s³), and a collision rate of 0.2% (vs. AB’s 1.5%). Collision avoidance rates were calculated as the percentage of runs (out of 1000) in which the vehicle successfully avoided an obstacle, derived from CARLA’s 20 Hz sensor data streams. As shown in Table 3, CAB achieves the highest collision avoidance rate of 99.8%, followed by AB at 98.5%, GB at 98.3%, ANN at 96.8%, RF at 95.5%, and kNN at 87.0%, further emphasizing CAB’s superior safety performance in dynamic AV scenarios. These metrics, evaluated under a 20% sensor dropout rate, provide a comprehensive assessment of safety and comfort, with statistical significance confirmed via a t-test (p < 0.01), highlighting CAB’s superior robustness and improving both safety and comfort under sensor failures. However, these calculations may overestimate safety in scenarios with moving obstacles or variable sampling rates, underscoring the need for real-world validation to assess generalizability.
To contextualize the TTC metric within ISO 26262 functional safety standards, we evaluated its alignment with automotive safety requirements. ISO 26262, which governs functional safety in road vehicles, emphasizes the importance of ensuring sufficient reaction time to mitigate collision risks, particularly for systems classified under higher ASIL levels (e.g., ASIL C or D for autonomous driving functions). A TTC of 3.2 s for CAB, as reported in
Table 3, exceeds the commonly accepted threshold of 2 s recommended for effective collision avoidance in urban scenarios [
63], providing ample time for the autonomous system to execute evasive maneuvers or braking actions. This aligns with ISO 26262’s emphasis on minimizing risks through timely system responses, reinforcing CAB’s suitability for safety-critical AV applications. In contrast, AB’s TTC of 2.8 s, while still above the threshold, offers a narrower safety margin, highlighting CAB’s superior performance in ensuring functional safety.
To assess the validity of the TTC metric under varying braking system response times, we considered two scenarios: a fast response time of 100 ms (0.1 s) and a slower response time of 500 ms (0.5 s), reflecting typical ranges for AV braking systems. For CAB, with a TTC of 3.2 s, the effective TTC after accounting for the braking response time is 3.1 s (100 ms) and 2.7 s (500 ms). For AB, with a TTC of 2.8 s, the effective TTC reduces to 2.7 s (100 ms) and 2.3 s (500 ms). These effective TTC values remain above the 2 s threshold recommended for safe collision avoidance in urban scenarios [
63], indicating that both CAB and AB maintain sufficient reaction windows even with slower braking responses. However, CAB’s higher effective TTC across both scenarios provides a greater safety margin, particularly under slower braking conditions, further demonstrating its robustness in dynamic AV environments.
7.3. Analysis of Chaotic Weight Updates
To rigorously validate the theoretical innovation of CAB over standard AB, we conducted a comprehensive analysis of their architectural differences and the impact of CAB’s chaotic factor on weight updates using the CARLA dataset.
Figure 4 illustrates the architectural comparison, contrasting AB’s deterministic weight update process (Equation (8)), which relies on fixed error-based adjustments, with CAB’s chaos-enhanced weight updates (Equation (9)), which incorporate a logistic chaotic map to dynamically adapt to sensor uncertainties in autonomous vehicle (AV) environments. This visualization highlights CAB’s ability to introduce controlled variability, enhancing robustness in real-time sensor fusion. The diagram of the comparative models of AB and CAB methods is given in
Figure 4.
To further highlight the algorithmic differences, we analyzed the dynamics of CAB’s chaotic factor ($C_t$) and contrasted them with AB’s deterministic weight updates through phase space trajectory plots. Figure 5 presents the phase space trajectory of $C_t$, computed over 50 training iterations using the logistic map equation (Equation (9)) with the initial chaotic seed specified in Section 3.4. The plot shows the evolution of $C_{t+1}$ against $C_t$, revealing the characteristic chaotic behavior of the logistic map, with values densely distributed across the phase space, indicating high sensitivity to initial conditions and dynamic adaptability. In contrast, Figure 6 illustrates AB’s deterministic weight updates, plotting the normalized weights over the same 50 iterations for a representative data point from the CARLA dataset (50,000 samples). AB’s weights, updated via Equation (8), exhibit a smooth, predictable convergence pattern, lacking the dynamic variability introduced by CAB’s chaotic factor. These visualizations clearly demonstrate how CAB’s chaotic dynamics contribute to its increased entropy (Table 4), enabling greater adaptability to sensor uncertainties, which translates into the superior predictive accuracy (99.3%) reported in Table 2.
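For reproducibility, the short sketch below regenerates the kind of phase-space trajectory shown in Figure 5, iterating the logistic map for 50 steps from the seed used in Section 3.4 and plotting C_{t+1} against C_t with matplotlib (assumed available).

```python
# Minimal sketch of the logistic-map phase-space trajectory described above.
import matplotlib.pyplot as plt

r, c = 4.0, 0.7
trajectory = [c]
for _ in range(50):
    c = r * c * (1.0 - c)      # logistic map update used for CAB's chaotic factor
    trajectory.append(c)

# Plot C_{t+1} against C_t; the points lie on the parabola r*C*(1-C) but are visited
# irregularly, reflecting sensitivity to the initial seed.
plt.scatter(trajectory[:-1], trajectory[1:], s=15)
plt.xlabel("C_t")
plt.ylabel("C_{t+1}")
plt.title("Logistic map phase space (r = 4.0, C0 = 0.7)")
plt.show()
```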
7.4. Sensitivity Analysis of Chaotic Seed C0
To validate the robustness of the initial chaotic seed C0 = 0.7 used in CAB, as specified in Table 1, we conducted sensitivity experiments by evaluating the predictive accuracy of CAB across a range of C0 values (C0 ∈ [0.3, 0.9], specifically 0.3, 0.5, 0.7, and 0.9) using the CARLA dataset, which comprises 50,000 samples after combining the original and supplementary datasets. The experiments adhered to the 80% training and 20% test split described in Section 7.2, consistent with the input features and hyperparameter settings in Table 1, ensuring alignment with the performance metrics reported in Table 2 (CAB’s 99.3% accuracy; MSE of 0.018). The sensitivity analysis was conducted under the same experimental conditions as those reported in Table 2, including feature normalization, data preprocessing, and evaluation protocols, with the only variable being the C0 value, to isolate its impact on model performance.
Table 5 presents the accuracy and standard deviation for each C0 value, calculated on the test set (20% of the dataset) over five independent runs to account for variability. The results show that C0 = 0.7 achieves the highest accuracy (99.3% ± 0.1%), consistent with Table 2, while other values (e.g., C0 = 0.3: accuracy 99.1% ± 0.1%) remain highly competitive, indicating CAB’s robustness across the tested range. Figure 7 illustrates the accuracy fluctuation curves for C0 ∈ [0.3, 0.9], further demonstrating that CAB maintains stable performance with minimal variance (less than 0.2% variation in accuracy), reinforcing its reliability in autonomous vehicle (AV) sensor fusion applications.
The robustness demonstrated in Table 5 aligns with the entropy analysis in Table 4, where CAB (with C0 = 0.7) achieves a higher entropy (7.94 bits) than AB (6.90 bits), indicating greater adaptability to sensor uncertainties, which translates into the high predictive accuracy (99.3%) observed across varying C0 values. These findings validate the choice of C0 = 0.7 and highlight CAB’s insensitivity to moderate variations in the chaotic seed, ensuring consistent performance under the dynamic sensor uncertainties prevalent in AV environments. The robustness of CAB, coupled with its high accuracy (Table 2) and enhanced entropy (Table 4) [64,65], underscores its practical applicability in safety-critical AV systems.
8. Conclusions and Recommendations
This study systematically assesses the efficacy of machine learning methods for speed and acceleration estimation in autonomous vehicles, demonstrating their potential to bolster operational reliability across diverse conditions. Six algorithms, kNN, RF, ANN, GB, AB, and CAB, were trained and rigorously evaluated using a comprehensive dataset of speed, acceleration, and environmental variables derived from the Carla simulator. Performance metrics, including the MSE, MAE, and coefficient of determination (R2), indicate that CAB consistently outperformed all counterparts, achieving an MSE of 0.018, MAE of 0.020, and R2 of 0.993, corresponding to an accuracy of 99.3%. This surpasses AB’s performance (MSE: 0.15, MAE: 0.12, R2: 0.985, accuracy: 98.5%), followed by GB (MSE: 1.701, MAE: 0.706, R2: 0.991), ANN (MSE: 3.297, MAE: 1.041, R2: 0.982), RF (MSE: 4.419, MAE: 0.927, R2: 0.975), and kNN (MSE: 23.215, MAE: 2.325, R2: 0.87). CAB’s superior precision is attributed to its incorporation of a logistic chaotic map, which introduces controlled randomness into the weight update process, thereby enhancing adaptability to sensor failures and environmental uncertainties, crucial challenges in autonomous driving.
CAB’s exceptional performance highlights the advantage of chaos-enhanced ensemble learning in overcoming the limitations of standard AB, which, despite its robust baseline capabilities through iterative error correction, struggles with the dynamic variability inherent in real-world sensor data. GB’s strong results affirm the effectiveness of gradient-based ensemble approaches, while ANN’s competitive accuracy underscores its proficiency in modeling complex, multidimensional relationships. RF delivers reliable, though moderate, performance due to its generalization capacity, whereas kNN’s significantly higher error rates and lower explanatory power indicate its unsuitability for this application, likely due to its sensitivity to high-dimensionality data and lack of adaptive learning. These findings collectively affirm that ensemble methods, particularly CAB, excel in mitigating sensor-related uncertainties, delivering substantial improvements in estimation accuracy essential for real-time decision making in autonomous vehicles.
The proposed architecture, leveraging Apache Kafka for real-time data streaming and MongoDB for robust logging, fulfills critical requirements for effective implementation. Kafka’s high-throughput, low-latency capabilities ensure seamless data flow under normal conditions via asynchronous processing, while the synchronous mode activates during sensor failures to guarantee rapid, precise predictions. This dual-mode operation, combined with secure broadcasting to nearby vehicles and comprehensive data logging, not only enhances operational reliability but also establishes a safety-critical interaction network and a verifiable record for legal scrutiny. Sensor fusion techniques further mitigate the impact of incomplete or erroneous data, reinforcing system resilience.
While these results are compelling, they stem from simulated data, which, though representative of urban driving scenarios as demonstrated in CARLA’s Town 10 environment, may not fully capture real-world complexities such as variable sensor noise, extreme weather conditions, or hardware-induced intermittent failures. Specifically, CARLA’s controlled parameters, such as a fixed 20% sensor dropout rate mirroring typical real-world failure rates [
3], structured obstacle layouts, and static environmental conditions, limit its ability to replicate erratic phenomena like sudden sensor dropouts due to wear, occlusions from dense fog or heavy rain, and unpredictable pedestrian or vehicle movements. These unmodeled dynamics could potentially alter the relative performance of CAB and other methods, as their robustness under such variability remains untested. To mitigate this, expanding validation to include real-world conditions with fluctuating noise levels, diverse environmental stressors, and dynamic traffic patterns is essential to confirm their practical efficacy. Future research will prioritize testing these methods with real-world datasets, such as nuScenes [
66] and Waymo Open Dataset [
67], incorporating variable sensor configurations and weather scenarios to ensure applicability beyond controlled simulations, thereby solidifying CAB’s potential as a cornerstone in AV development.
In conclusion, this study establishes that machine learning methods, notably CAB, are highly effective for critical tasks like speed and acceleration prediction in autonomous vehicles, representing a significant step toward safer, more efficient, and sustainable driving systems. CAB’s chaos-enhanced approach provides a novel solution to sensor uncertainty, outperforming conventional methods and laying a foundation for future innovations. Specifically, kNN could be enhanced by integrating principal component analysis (PCA) to reduce dimensionality and improve scalability for high-dimensional AV sensor data, potentially lowering the MSE in complex scenarios. ANN’s deep learning potential could be amplified by training on larger, multimodal datasets (e.g., nuScenes [
66]) with architectures like CNN-LSTM hybrids, targeting a 5–10% accuracy boost in dynamic environments. AB and GB could adopt hybrid chaotic mechanisms, such as blending logistic maps with gradient updates, to reduce the MSE by an estimated 15–20% under uncertainty. RF’s efficiency for edge computing could be optimized by pruning redundant trees and leveraging lightweight feature selection, cutting the training time by 10–15% while keeping the R2 value above 0.97. These advancements, alongside CAB’s chaos-enhanced framework, could collectively elevate the robustness and versatility of AV prediction systems. Continued refinement, validated through real-world trials, will greatly advance the adoption and reliability of autonomous vehicle technologies.