1. Introduction
Urban traffic noise pollution is an escalating concern in cities around the world, impacting both developed and developing nations. The World Health Organization (WHO) identifies environmental noise as a major public health issue, linking it to a range of adverse psychological and physiological effects [1]. Among various sources, road traffic is the predominant contributor to urban noise exposure, making it a focal point for environmental health research and urban policy interventions [2].
In high-income countries, decades of research have led to the adoption of comprehensive noise mitigation measures, including sound barriers, vehicle emission and noise regulations, and intelligent traffic management systems. For instance, the United Kingdom has implemented noise abatement policies and compensation schemes to address excessive traffic noise near residential areas [3]. Likewise, the International Civil Aviation Organization (ICAO) has successfully enforced global standards to reduce aircraft noise around airports [4]. However, these interventions are not always transferable to developing countries, where differing climatic conditions, urban planning practices, regulatory enforcement capacities, and vehicle fleets complicate the direct application of such solutions.
Kuwait exemplifies a rapidly developing nation marked by accelerated urbanization, increasing private vehicle reliance, and escalating traffic congestion—all of which contribute significantly to rising urban noise levels. Unlike many cities in temperate climates, Kuwait’s hot desert environment promotes extensive use of air-conditioned vehicles and closed building envelopes, altering both exposure to and perception of environmental noise. Despite these distinctive characteristics, research on traffic noise in Kuwait remains scarce, and localized data to support evidence-based interventions is limited.
This study aims to address this gap by investigating traffic noise levels across different road types in Kuwait and identifying the factors contributing to noise pollution. By understanding these factors, the study seeks to provide insights into effective mitigation strategies tailored to the local context, thereby improving urban living conditions and informing future urban planning and policy development.
1.1. Background and Literature Review
A substantial body of research has established the adverse impacts of road traffic noise on human health and well-being. In the European Union, approximately 40% of the population is exposed to daytime noise levels exceeding 55 dB(A), and around 30% experience similar exposure during nighttime hours [5]. These levels are associated with impaired concentration, disrupted sleep, and an elevated risk of cardiovascular and psychological disorders. For example, Pal and Bhattacharya [6] reported increased incidences of headaches, irritability, and fatigue among workers exposed to high traffic noise, adversely affecting productivity. Likewise, studies by Stansfeld and Matheson [7], Stansfeld et al. [8], and Evans et al. [9] demonstrated that chronic exposure to traffic and aircraft noise can impair children’s reading comprehension, attention span, and memory development, particularly in urban and educational settings.
In recent years, growing attention has been paid to traffic noise in developing countries, including those in the Gulf region. In India, for instance, urban noise levels frequently exceed international safety thresholds, highlighting the scale of the problem [10]. Similarly, research in Kuwait has documented persistent traffic noise in both residential and commercial areas [11]. Al-Mutairi et al. [12] found that LAeq values across major urban corridors regularly exceeded standard outdoor limits. Their study, along with others, identified traffic volume, speed, and vehicle type as significant contributors to urban noise levels, highlighting a pattern of rising noise pollution in rapidly motorizing cities with limited enforcement of vehicle and infrastructure standards [13,14,15]. This observation is further corroborated by Al-Masaeid, who emphasizes that vehicle types, particularly heavier vehicles, contribute uniquely to the noise profile along urban arterials [13,14]. The interplay of traffic characteristics (volume, speed, and vehicle type) thus emerges as a multidimensional contributor to urban noise, underscoring the need for comprehensive predictive models and robust environmental impact assessments [15,16,17].
The health implications of traffic noise in these regions are increasingly evident. In Kuwait, Al-Mutairi et al. [11] reported elevated stress levels and coping behaviors, such as residents keeping windows shut even during pleasant weather to avoid exposure. Similarly, in Karachi, Mehdi et al. [18] measured average LAeq levels of 66 dB(A), well above the WHO’s annoyance thresholds, with peak values reaching 101 dB(A), posing substantial risks of hearing damage. A regional study by Al-Ghonamy [19] in Saudi Arabia further emphasizes the urgent need for regulatory reforms, including improved vehicle inspection systems and integrated urban noise management strategies.
Several studies have investigated engineering and policy-based mitigation strategies for urban traffic noise. Al-Mutairi et al. [12] and Jamtahet et al. [20] explored predictive modeling and physical interventions such as noise barriers and low-noise pavements. In Hong Kong, the Environmental Protection Department [21] reported measurable success from retrofitting roads with noise-attenuating materials and barrier systems.
Complementing these efforts, machine learning (ML) has emerged as a powerful analytical tool in traffic noise research. Unlike traditional regression models, which are constrained by linearity and distributional assumptions [22,23], ML algorithms can model complex, nonlinear relationships among diverse variables, including traffic flow, environmental conditions, and temporal dynamics [24,25]. Algorithms such as Support Vector Machine (SVM), Random Forests, and Gradient Boosting have demonstrated superior performance in urban noise prediction, especially in heterogeneous and high-variability environments. Studies indicate that ensemble-based models consistently outperform classical approaches in capturing real-world traffic noise patterns [26,27].
More recently, advancements in interpretable machine learning have addressed key concerns surrounding the opacity of predictive models. Techniques such as SHapley Additive exPlanations (SHAP) offer a systematic framework to quantify the contribution of each input feature to model predictions, providing both local (individual-level) and global (model-wide) insights [28,29,30,31]. SHAP has been successfully applied in environmental modeling to elucidate the roles of variables such as vehicle mix, time of day, and weather conditions in influencing noise levels [31]. This interpretability not only enhances model transparency but also supports evidence-based policymaking aligned with principles of fairness, accountability, and usability in data-driven urban governance.
Despite the progress in this domain, much of the existing literature remains centered on high-income countries. In contrast, cities in developing regions—such as Kuwait—continue to face increasing traffic noise amid limited regulatory oversight and a lack of robust predictive tools. This study aims to bridge this gap by applying a machine learning-based framework, augmented with interpretable analytics, to identify the principal contributors to traffic noise across Kuwait’s urban corridors. The objective is to develop context-specific, data-driven recommendations for noise mitigation and urban planning, thereby enhancing environmental quality and public health outcomes.
1.2. Research Objectives and Contributions
This study aims to develop a comprehensive, data-driven framework for assessing urban traffic noise exposure in residential environments across Kuwait. The primary objective is to quantify equivalent continuous sound levels (LAeq) and identify the key determinants driving their variability, including traffic composition, vehicle speed, meteorological conditions, temporal patterns, and spatial context. By collecting high-resolution field data across various road types, time periods, and seasons, the research evaluates the extent to which observed traffic noise levels align with both national and international environmental standards.
A central goal of this study lies in advancing beyond traditional noise assessment approaches by integrating machine learning (ML) techniques for predictive modeling and applying explainable artificial intelligence (XAI) tools to interpret model outputs. Specifically, the study employs ensemble learning through Bagged Trees in conjunction with SHapley Additive exPlanations (SHAP) to evaluate the relative influence of variables such as road classification, vehicle mix, traffic volumes, and weather conditions on urban noise levels. This approach offers transparent, feature-level insights into the drivers of traffic noise, addressing a key limitation of conventional black-box ML models.
In addition, the study considers local context—covering spatial differences, neighborhood types, and the growing share of modern vehicles built under stricter noise and emission standards. The aim is to produce a clear, locally grounded picture of traffic noise exposure, focusing especially on crowded and noise-sensitive residential areas.
This research offers several important contributions to the study of urban noise and environmental management:
It introduces an interpretable machine learning framework for predicting urban traffic noise, improving accuracy and reliability compared with conventional models.
It builds a detailed, multi-season noise dataset across different road types and neighborhoods in Kuwait, creating a solid reference for local studies and policy updates.
It identifies the main factors shaping noise—such as road type, vehicle mix, and time of day—offering insights for more targeted control measures.
The results can guide planners and transport officials in applying practical steps like traffic-calming, limiting heavy vehicles, and improving site layouts.
The framework also opens doors for future collaborations that connect noise monitoring, smart transport systems, and AI to support sustainable growth in Kuwait and similar cities.
2. Methodology
The methodology adopted in this study was structured into six sequential phases. First, the study area was defined to include 12 monitoring sites across Kuwait, comprising four highways, four major roads, and four collector roads distributed across four residential neighborhoods. Site selection was guided by traffic data, land use characteristics, and accessibility considerations. Preliminary visits were conducted to each location to assess site suitability and ensure that noise monitoring equipment could be safely and effectively installed.
In the second phase, detailed fieldwork procedures were developed, and the necessary official approvals were obtained to facilitate on-site data collection. This phase also included training sessions for field staff to ensure consistency and reliability in data acquisition.
The third phase involved direct noise level monitoring across the selected sites. Measurements were taken for all three road classifications—highways, major roads, and collector roads—under standardized conditions and using calibrated sound level meters to record equivalent continuous sound levels (LAeq) over designated time intervals.
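The equivalent continuous sound level recorded in this phase is an energy average rather than an arithmetic mean of decibel readings. The following is a minimal Python sketch of that calculation, assuming equal-duration samples; the function name `laeq` and the sample readings are illustrative assumptions, not the meter's internal algorithm or data from this study.

```python
import math


def laeq(levels_db):
    """Energy-average equal-duration A-weighted levels, in dB(A).

    LAeq = 10 * log10( mean( 10 ** (L_i / 10) ) )
    """
    if not levels_db:
        raise ValueError("at least one level is required")
    # Convert each level to relative sound energy, average, convert back.
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)


# Hypothetical one-minute readings (not measurements from the study).
minute_levels = [62.0, 65.0, 71.0, 68.0]
print(f"LAeq over the interval: {laeq(minute_levels):.1f} dB(A)")
```

Because the average is taken over energy, short loud events weigh more heavily than they would in an arithmetic mean of the readings.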
In the fourth phase, following data collection, machine learning models were employed to predict traffic noise levels and identify the most influential contributing factors, including traffic volume, vehicle composition, road classification, and environmental conditions. Four modeling techniques were applied: Linear Regression (LR), Support Vector Machines (SVM), Gaussian Process Regression (GPR), and Bagged Trees (BTE). These models were selected to capture both linear and nonlinear relationships among variables, with Bagged Trees serving as the ensemble learning method to enhance predictive performance in complex urban environments. To ensure model transparency and interpretability, SHapley Additive exPlanations (SHAP) were applied, allowing for a detailed analysis of each feature’s contribution to the predicted noise levels. This explainable AI approach provides actionable insights and supports evidence-based strategies for traffic noise mitigation.
The fifth phase focused on the discussion of key findings, placing the results in the context of existing literature and local conditions. Finally, the study concluded with evidence-based recommendations for noise mitigation and urban planning, as well as suggestions for future research.
2.1. Study Area
This study was conducted in Kuwait, which is located in the northeastern Arabian Peninsula and has a population of approximately 4.43 million people [27], with an estimated 2.3 million registered vehicles [28]. In urban areas, the transportation sector is widely regarded as the primary source of environmental noise pollution. To conduct the noise monitoring component of this study, appropriate sites were identified based on the road network illustrated in Figure 1. Four residential neighborhoods were selected to represent a diverse cross-section of urban environments in Kuwait: Ishbiliya, Al Farwaniyah, Zahra, and Mishref. As summarized in Table 1, these neighborhoods were chosen to capture variations in land use, traffic intensity, and road infrastructure.
A field survey was conducted in each neighborhood to identify three representative road types for monitoring: a collector road, a major road, and a highway. This selection yielded a total of 12 monitoring sites across the four neighborhoods, as depicted in Figure 1. These sites served as the basis for data collection to assess traffic noise levels under varying urban and roadway conditions.
2.1.1. Ishbiliya Neighborhood
Ishbiliya neighborhood is a residential area located near Kuwait International Airport. Three road types were monitored simultaneously: a collector road, a major road, and a highway. A four-lane collector road near a roundabout was selected; it includes speed humps in both directions and has a speed limit of 45 kph. The site is situated in a shallow street canyon aligned approximately parallel to the prevailing wind. The major road has three lanes per direction and a speed limit of 80 kph. The 6th Ring Road is a 12-lane highway with a posted speed limit of 120 kph.
2.1.2. Al Farwaniyah Neighborhood
Al Farwaniyah neighborhood is adjacent to Ishbiliya and also located near Kuwait International Airport. It is a residential area with some commercial activities and high-rise buildings. A collector road, major road, and highway were selected for monitoring. The collector road has four lanes and is bordered by commercial establishments and high-rise buildings, featuring a parking area and a U-turn. This site is considered a narrow street canyon roughly parallel to the prevailing wind. The speed limit for the collector road is 45 kph. A four-lane major road with an 80 kph speed limit was also monitored. Another location on the 6th Ring Road was selected, featuring five lanes in each direction and a speed limit of 120 kph.
2.1.3. Zahra Neighborhood
Zahra neighborhood is a residential area located near a major shopping mall. The collector road has two lanes in each direction, is adjacent to an intersection, and includes a speed bump; its speed limit is 45 kph. The major road consists of six lanes in total (three each way), with a posted speed limit of 80 kph. Along the 6th Ring Road, a ten-lane section with a speed limit of 120 kph was monitored.
2.1.4. Mishref Neighborhood
Mishref is a residential area near private colleges and universities. The collector road has two lanes in total, and the major road has three lanes in each direction. The major road is considered a canyon street that runs approximately parallel to the prevailing wind. An eight-lane section of the 6th Ring Road was selected for monitoring. The speed limits for the collector road, major road, and highway are 45 kph, 80 kph, and 120 kph, respectively.
The main monitoring studies were conducted within the selected study area. Fixed-location measurements were simultaneously taken near collector roads, major roads, and highways to represent varying urban conditions. These operations were designed to capture the effects of road traffic noise pollution across different road classifications.
2.2. Instrumentation and Equipment Selection
The study relied on instruments chosen for both accuracy and durability, while staying within the project’s budget. Noise levels were measured using the Bruel & Kjaer 2250-L outdoor sound level meter, a portable device with a detection range of 21.5–140.8 dB(A) [32]. Traffic volumes were obtained through video recordings captured by the Spack Solutions Countcam 2 camera [33], which were later reviewed to extract vehicle counts. To measure vehicle speed, the Decatur Genesis GHD-KPH radar gun was used at randomly selected sites, offering a range of 20–337 kph [34]. Weather conditions were monitored with the Ambient Weather WM-5 handheld meter, which records wind speed (0.64–143 kph), temperature (–15 °C to 50 °C), and relative humidity (0–99%) [35].
2.3. Monitoring Operation
The monitoring operation involved several sites located in four urban neighborhoods in Kuwait. The sites were selected to reflect traffic-related noise pollution in urban settings; two of the neighborhoods were near Kuwait International Airport and the other two were situated further away. Each site featured distinct land uses, ranging from residential to commercial areas. In each neighborhood, monitoring was conducted over five days: one weekday in each of the four seasons and one weekend day. These monitoring operations took place between October 2021 and August 2022.
2.4. Monitoring Procedure
Fieldwork followed a standardized procedure across all sites. Traffic was recorded with a Countcam 2 camera fixed about 3 m above the road on lamp posts or pedestrian bridges. The footage was replayed later, and surveyors prepared minute-by-minute traffic counts by vehicle class and direction. Spot speeds were taken with a Genesis GHD-KPH radar gun, measured randomly on each road type and logged manually.
Noise levels were measured using a sound level meter positioned 2 m from the road edge and 1.5 m above ground. Calibration was carried out at the start of every session with a reference source and repeated at the end to confirm stability. Batteries and memory were checked daily. Weather conditions were tracked in parallel with a WM-5 handheld device, recording temperature, humidity, and wind speed every minute. Time alignment across devices was also verified.
We carried out monitoring in two daily periods: 05:00–09:00 in the morning and 13:00–15:00 at midday. After each run, the data were collected and the same procedure was applied at every site to keep results comparable. Figure 2 gives examples of the setups.
Noise Exceedance Rates
When benchmarked against the WHO daytime guideline of 55 dB(A), the recorded LAeq values were almost universally above the recommended limit. Across all measurements, 98.2% exceeded the threshold. Highways were unsurprisingly the most affected, with every observation above the limit, while collector and major roads still showed very high exceedance rates (98.8% and 95.7%, respectively). A similar trend was seen across time categories, with weekdays (95.7%) and weekends (98.8%) both showing consistently high exceedance levels, as shown in Table 2.
This pattern highlights the scale of the noise problem in Kuwait’s urban road network and provides essential context for the predictive modeling presented later. The near-universal exceedances underline the urgency of incorporating noise considerations into national planning and regulatory frameworks, as traffic noise has become a persistent and widespread exposure in the studied areas.
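The exceedance rates above are simple proportions of readings over a limit. A minimal Python sketch of that calculation follows; the function name `exceedance_rate`, the strict "greater than" comparison, and the sample readings are assumptions made for illustration and are not taken from the study's dataset.

```python
def exceedance_rate(levels_db, limit_db=55.0):
    """Return the percentage of readings strictly above limit_db."""
    if not levels_db:
        raise ValueError("at least one reading is required")
    above = sum(1 for level in levels_db if level > limit_db)
    return 100.0 * above / len(levels_db)


# Hypothetical LAeq readings grouped by road type (illustrative values only).
readings = {
    "collector": [54.2, 58.9, 61.3, 57.0],
    "major": [63.5, 66.1, 64.8, 67.2],
    "highway": [72.4, 75.0, 73.8, 76.1],
}

for road_type, levels in readings.items():
    print(f"{road_type}: {exceedance_rate(levels):.1f}% above 55 dB(A)")
```

The same function can be applied to any grouping of the data, such as weekday versus weekend readings.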
2.5. Model Descriptions
This study employed four predictive modeling techniques to estimate traffic noise levels (LAeq) based on multivariate input data: Linear Regression, Support Vector Machine (SVM), Gaussian Process Regression (GPR), and Bagged Trees. These models were selected to offer a diverse and comprehensive comparison across different modeling paradigms—namely, parametric (Linear Regression), kernel-based (SVM), probabilistic (GPR), and ensemble-based (Bagged Trees) methods. This comparative framework enables a robust evaluation of model performance in capturing the nonlinear associations between traffic noise and contributing factors such as traffic volume, vehicle composition, road classification, and meteorological variables.
2.5.1. Linear Regression (LR)
Linear Regression is a classical statistical technique used to model the relationship between a dependent variable and one or more independent variables by fitting a linear equation to observed data. In this study, a multivariate linear regression model was implemented to evaluate how traffic, environmental, spatial, and temporal variables influence LAeq values. While linear regression is valued for its simplicity, transparency, and interpretability, it relies on assumptions such as linearity, independence of errors, and homoscedasticity. These assumptions may not fully capture the nonlinear complexities inherent in urban traffic noise patterns. Nonetheless, the model served as a reference benchmark to compare the performance of more advanced machine learning approaches.
2.5.2. Support Vector Machine (SVM)
Support Vector Machine regression, specifically Support Vector Regression (SVR), is a kernel-based machine learning method designed to predict continuous outcomes by finding a function that deviates from observed targets by no more than a predefined threshold (ε), while minimizing model complexity. By utilizing kernel functions—commonly radial basis function (RBF) or polynomial kernels—SVR can capture intricate, nonlinear relationships between input features and the target variable. In this study, SVR was employed to model complex interactions among variables such as vehicle type distributions, meteorological conditions, and time-of-day variations. Its robustness in handling high-dimensional spaces and its effectiveness in modeling nonlinearity make SVR particularly suitable for traffic noise analysis, where traditional linear methods often fall short.
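The role of the threshold ε can be illustrated with the ε-insensitive loss, which ignores residuals that fall inside a tube of half-width ε around the prediction. The Python sketch below shows only this loss term, not a full SVR solver; the function name and the sample values are illustrative assumptions.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Per-sample loss: zero inside the epsilon tube, linear outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Hypothetical observed and predicted LAeq values in dB(A).
observed = [68.0, 70.0, 74.0]
predicted = [68.5, 73.0, 73.2]

# With epsilon = 1.0, only the second prediction (3 dB off) is penalized.
print(epsilon_insensitive_loss(observed, predicted, epsilon=1.0))
```

A wider tube tolerates more deviation and tends to give a simpler fitted function, which is why ε is tuned together with the other hyperparameters.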
2.5.3. Gaussian Process Regression (GPR)
Gaussian Process Regression (GPR) is a non-parametric, probabilistic modeling technique grounded in Bayesian inference. It assumes that data are generated from a multivariate Gaussian distribution and defines a prior distribution over functions, which is updated using observed data to obtain a posterior distribution. GPR offers both point predictions and credible intervals, making it valuable for applications where uncertainty quantification is critical. In this study, GPR was utilized to model the complex, noisy nature of urban traffic noise without requiring a predefined functional form. Its flexibility and ability to capture smooth, nonlinear trends in LAeq values, along with predictive uncertainty, make it well-suited for estimating traffic noise levels across varying spatial patterns and time intervals.
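The prior over functions in GPR is specified through a covariance (kernel) function. As an illustration, the Python sketch below evaluates a rational quadratic kernel, one of the kernel families considered in Section 3.1. The parameter names (`sigma_f`, `length_scale`, `alpha`) and their default values are assumptions for this example, not the settings used in the study.

```python
def rational_quadratic(x1, x2, sigma_f=1.0, length_scale=1.0, alpha=1.0):
    """Rational quadratic covariance between two feature vectors.

    k(x1, x2) = sigma_f^2 * (1 + r^2 / (2 * alpha * length_scale^2)) ** (-alpha)
    where r is the Euclidean distance between x1 and x2.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return sigma_f ** 2 * (1.0 + sq_dist / (2.0 * alpha * length_scale ** 2)) ** (-alpha)


# Covariance is highest for identical inputs and decays with distance.
print(rational_quadratic([0.0, 0.0], [0.0, 0.0]))
print(rational_quadratic([0.0, 0.0], [1.0, 1.0]))
print(rational_quadratic([0.0, 0.0], [3.0, 3.0]))
```

Inputs that are close together receive a high covariance, so their predicted noise levels are constrained to be similar; this is how the kernel encodes smoothness.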
2.5.4. Bagged Trees Ensemble (BTE)
Bagged Trees ensemble, short for Bootstrap Aggregated Decision Trees, is an ensemble learning method that enhances predictive performance by averaging the outputs of multiple decision trees trained on different bootstrapped subsets of the original data. Each tree is constructed independently, and the final prediction is obtained by aggregating the outputs (e.g., by averaging in regression tasks). This approach reduces variance and mitigates overfitting, particularly in heterogeneous, noisy datasets. In this study, Bagged Trees proved effective in modeling nonlinear interactions and capturing relationships involving both continuous and categorical features. Its ensemble structure offers robustness against data irregularities and measurement noise, making it especially suitable for traffic noise modeling based on real-world urban environmental conditions.
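The bootstrap-and-average idea can be sketched in a few lines of Python. To keep the example short, each learner here is a one-split regression "stump" on a single feature rather than a full decision tree, so this is a simplified stand-in and not the MATLAB ensemble used in the study. All function names and the sample data are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimizing the squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # exclude max so the right side is non-empty
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: predict the mean everywhere
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagged_stumps(xs, ys, n_learners=30, seed=0):
    """Train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    ensemble = []
    for _ in range(n_learners):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return ensemble


def predict_bagged(ensemble, x):
    """Aggregate by averaging the individual learners' predictions."""
    return sum(predict_stump(s, x) for s in ensemble) / len(ensemble)


# Hypothetical hourly traffic volume (vehicles) and LAeq (dB(A)).
volume = [200, 400, 600, 800, 1000, 1200, 1400, 1600]
noise = [58.0, 60.0, 63.0, 65.0, 70.0, 72.0, 74.0, 75.0]

ensemble = fit_bagged_stumps(volume, noise)
print(f"Predicted LAeq at 900 veh/h: {predict_bagged(ensemble, 900):.1f} dB(A)")
```

Because each bootstrap sample places its split at a slightly different threshold, the averaged prediction is smoother than any single stump, which is the variance-reduction effect described above.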
2.6. Model Interpretability Using SHAP
To ensure transparency and interpretability in the machine learning workflow, this study incorporated SHAP (SHapley Additive exPlanations) as a post hoc interpretability tool. SHAP, grounded in cooperative game theory, assigns each feature a Shapley value that quantifies its contribution to an individual model prediction [28,29,31]. This approach provides consistent, locally accurate, and globally interpretable insights into how input features influence model outputs.
The integration of SHAP was driven by the growing demand for explainable artificial intelligence (XAI), particularly in environmental and transportation domains where model outcomes inform regulatory decisions and urban policy [31]. Unlike traditional feature importance measures, SHAP produces additive and model-agnostic explanations that satisfy key properties such as local accuracy, missingness, and consistency, offering mathematically grounded and context-aware interpretation.
In this study, SHAP was applied to the Bagged Trees model using the TreeExplainer algorithm, which is optimized for tree-based ensemble methods. This enabled the efficient computation of exact SHAP values, addressing the limitations associated with approximate approaches that may be less accurate or computationally intensive. The analysis evaluated the influence of eleven input variables, including spatial, temporal, environmental, and traffic-related factors, on the predicted LAeq values.
SHAP implementation was carried out using MATLAB, and visualizations such as summary plots and dependence plots were generated to support interpretation. These visual tools helped identify key contributing features, explore variable interactions, and reveal potential nonlinear or threshold effects.
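To make the Shapley attribution concrete, the Python sketch below computes exact Shapley values for one prediction by enumerating every feature coalition, with absent features set to a baseline value. This brute-force approach is an illustration only: it is not the tree-specific algorithm used in the study and it scales exponentially with the number of features. The toy model, its coefficients, and the feature values are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance by enumerating all coalitions.

    Features outside a coalition take their value from `baseline`.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_noise_model(features):
    """Hypothetical LAeq model with a speed-by-road-type interaction."""
    speed, volume, is_highway = features
    return 50 + 0.1 * speed + 0.005 * volume + 4.0 * is_highway + 0.02 * speed * is_highway


instance = [100.0, 2000.0, 1.0]   # speed (kph), volume (veh/h), highway flag
reference = [60.0, 1000.0, 0.0]   # baseline values for "absent" features

phi = shapley_values(toy_noise_model, instance, reference)
for name, contribution in zip(["speed", "volume", "is_highway"], phi):
    print(f"{name}: {contribution:+.2f} dB(A)")

# Local accuracy: contributions sum to prediction minus baseline prediction.
print(sum(phi), toy_noise_model(instance) - toy_noise_model(reference))
```

The final line illustrates the local accuracy property mentioned above: the per-feature contributions add up to the difference between the prediction for the instance and the prediction for the baseline.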
2.7. Model Evaluation Metrics
To evaluate the performance and robustness of the predictive models developed in this study, three widely accepted regression metrics were employed: the Coefficient of Determination (R²), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE). These metrics were selected to capture complementary aspects of model accuracy and error characteristics, particularly suited for continuous environmental variables such as LAeq (equivalent continuous sound levels).
The Coefficient of Determination (R²) quantifies the proportion of variance in the observed data that is explained by the model. It typically ranges from 0 to 1, with higher values indicating stronger explanatory power, although it can fall below 0 on held-out data when a model predicts worse than the mean of the observations. An R² value approaching 1 implies that the model accounts for most of the variability in the data, making it especially valuable in environmental modeling, where capturing both spatial and temporal variation is critical.
The Root Mean Square Error (RMSE) measures the square root of the average of squared prediction errors, emphasizing larger deviations due to its quadratic penalty. Expressed in decibels [dB(A)], RMSE provides an interpretable measure of average prediction error magnitude. Its sensitivity to outliers makes it a useful diagnostic for assessing whether a model occasionally produces substantial errors.
The Mean Absolute Error (MAE) represents the average of the absolute differences between predicted and observed values. Also expressed in dB(A), MAE treats all errors equally, offering a robust and intuitive measure of overall predictive accuracy. It is particularly useful in contexts where equal weighting of errors and resistance to outliers are preferred.
All three metrics were computed during both the cross-validation phase (using five-fold cross-validation) and the final testing phase on an independent holdout dataset. This two-stage evaluation approach ensured that model performance was both internally reliable and externally generalizable.
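The three metrics follow directly from their definitions. The Python sketch below is a minimal illustration; the function names and the sample observed and predicted values are assumptions for this example and are not results from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted LAeq values in dB(A).
observed = [60.0, 65.0, 70.0, 75.0]
predicted = [61.0, 64.0, 72.0, 74.0]

print(f"RMSE = {rmse(observed, predicted):.2f} dB(A)")
print(f"MAE  = {mae(observed, predicted):.2f} dB(A)")
print(f"R2   = {r_squared(observed, predicted):.3f}")
```

Since squaring weights large residuals more heavily, RMSE is never smaller than MAE for the same set of predictions; a large gap between the two suggests a few large errors.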
3. Results
3.1. Model Performance Evaluation
In this study, four predictive models were tested for estimating urban traffic noise levels (LAeq) at selected sites across Kuwait: Linear Regression (LR), Support Vector Machine (SVM), Bagged Trees (BTE), and Gaussian Process Regression (GPR). Eleven explanatory variables were used, covering spatial context (Location, RoadType), temporal patterns (PeakOffPeak, WeekendWeekdays, Period), environmental conditions (Temperature, Humidity), and traffic characteristics (TotalTV, LightV, HeavyV, SPEED1). To assess performance, a nested five-fold cross-validation procedure was applied. The dataset was divided into five subsets. In each round, four were used for training and one for validation, and the process rotated until every subset had served as the validation set. The averaged results offered a more reliable estimate of model accuracy and reduced the risk of overfitting. This design also strengthened confidence in the reported performance metrics (R², RMSE, MAE) by limiting the variance introduced by random sampling. Furthermore, hyperparameters for all models were optimized within this framework, as shown in Table 3. In the inner loop, candidate hyperparameter sets were explored through grid search for the SVM and GPR models, and through random subspace sampling for the Bagged Trees Ensemble, while the outer loop provided unbiased estimates of generalization performance. For the SVM, box constraint (C), kernel scale (γ), and epsilon (ε) were tuned within predefined ranges, with the optimal configuration minimizing RMSE. For the GPR, different kernel functions (squared exponential, rational quadratic, and Matern) were evaluated, and the rational quadratic kernel was selected for its balance between accuracy and efficiency. For the Bagged Trees Ensemble, the number of learners and minimum leaf size were varied, with optimal performance achieved at 30 learners and a leaf size of 8. Linear Regression required no tuning beyond standardization. All hyperparameters were fixed to their optimal values during final training, and models were retrained on the full dataset before test set evaluation. All analyses were conducted in MATLAB R2024b, which provided built-in functions for model training, hyperparameter optimization, and cross-validation.
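The fold rotation described above can be sketched as follows. This Python example covers only the outer five-fold split; the inner hyperparameter search is omitted, and the function name, shuffling scheme, and seed are assumptions for illustration rather than the MATLAB partitioning used in the study.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (training, validation) index lists for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# Hypothetical dataset size, chosen only to show uneven fold sizes.
for fold_number, (train_idx, val_idx) in enumerate(k_fold_indices(23), start=1):
    print(f"fold {fold_number}: {len(train_idx)} training, {len(val_idx)} validation")
```

Each observation appears in exactly one validation fold, so averaging the fold-level metrics uses every sample once for evaluation.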
Table 4 and Table 5 present the performance metrics for the four predictive models, namely Linear Regression (LR), Support Vector Machine (SVM), Bagged Trees Ensemble (BTE), and Gaussian Process Regression (GPR), in estimating traffic noise levels (LAeq) on both the validation and testing datasets.
Among the models, Bagged Trees Ensemble (BTE) demonstrated the best overall performance across all evaluation criteria. On the validation set, BTE achieved the lowest Root Mean Square Error (RMSE) of 2.42, Mean Squared Error (MSE) of 5.85, and Mean Absolute Error (MAE) of 1.51, with a high R² value of 0.89. Similar patterns were observed on the testing set, where BTE again outperformed the other models with an RMSE of 2.13, MSE of 4.54, MAE of 1.37, and an R² of 0.91, indicating excellent generalization and predictive reliability, as shown in Figure 3, Figure 4 and Figure 5.
Gaussian Process Regression (GPR) followed closely, showing solid performance with validation RMSE and MAE of 2.49 and 1.59, respectively, and an R² of 0.88. On the testing set, GPR maintained competitive results (RMSE = 2.22, R² = 0.90), making it a viable option for applications that also require predictive uncertainty estimates.
In contrast, Linear Regression and Support Vector Machine (SVM) exhibited noticeably weaker performance. Both models yielded higher error values and lower R² scores (around 0.62–0.63) on both datasets, suggesting limited capacity to capture the nonlinear and complex relationships inherent in urban traffic noise patterns. RMSE values for LR and SVM exceeded 4.4 on the validation set, with MAE values above 4.4, reinforcing their inadequacy for this specific prediction task.
Overall, these results highlight the superiority of ensemble learning techniques, particularly Bagged Trees, for modeling urban traffic noise. BTE’s ability to handle nonlinear interactions, heterogeneous input features, and measurement variability made it the most accurate and robust model in this study. GPR also showed promise, especially where interpretability and uncertainty quantification are prioritized. The relatively poor performance of LR and SVM further underscores the importance of using advanced nonlinear models in traffic noise prediction contexts.
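The idea behind bagging can be illustrated with a short Python sketch. It is a deliberate simplification and not the MATLAB ensemble used in the study: each learner is a one-split regression tree (a stump) on a single hypothetical predictor, whereas the study's trees were grown on all features. The defaults of 30 learners and a minimum leaf size of 8 mirror the tuned values reported earlier; the data and all function names are assumptions made for illustration.

```python
import random

def fit_stump(points, min_leaf):
    """One-split regression tree on 1-D inputs; each leaf keeps >= min_leaf rows."""
    pts = sorted(points)
    n = len(pts)
    overall = sum(y for _, y in pts) / n
    best = (float("inf"), None, overall, overall)
    for cut in range(min_leaf, n - min_leaf + 1):
        if pts[cut - 1][0] == pts[cut][0]:
            continue  # cannot place a threshold between identical x values
        left = [y for _, y in pts[:cut]]
        right = [y for _, y in pts[cut:]]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = sum((y - mean_l) ** 2 for y in left) + sum((y - mean_r) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, (pts[cut - 1][0] + pts[cut][0]) / 2.0, mean_l, mean_r)
    _, threshold, mean_l, mean_r = best
    return threshold, mean_l, mean_r

def predict_stump(stump, x):
    """Predict with one stump; falls back to the overall mean if no split was found."""
    threshold, mean_l, mean_r = stump
    if threshold is None:
        return mean_l
    return mean_l if x <= threshold else mean_r

def fit_bagged(points, n_learners=30, min_leaf=8, seed=0):
    """Fit each stump on a bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    return [
        fit_stump([rng.choice(points) for _ in points], min_leaf)
        for _ in range(n_learners)
    ]

def predict_bagged(ensemble, x):
    """Average the predictions of all learners."""
    return sum(predict_stump(s, x) for s in ensemble) / len(ensemble)

# Hypothetical (speed in km/h, LAeq in dB(A)) pairs, for illustration only.
data_rng = random.Random(2)
points = [(30.0 + 2.0 * i, 60.0 + 0.15 * (30.0 + 2.0 * i) + data_rng.gauss(0.0, 1.0))
          for i in range(40)]

ensemble = fit_bagged(points)
low, high = predict_bagged(ensemble, 35.0), predict_bagged(ensemble, 100.0)
print(f"predicted LAeq at 35 km/h: {low:.1f}, at 100 km/h: {high:.1f}")
```

Each stump places its threshold slightly differently because it sees a different resample, so the averaged prediction varies more smoothly than any single tree. This variance reduction is one plausible reason the ensemble copes well with measurement variability.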
3.2. Correlation Analysis of Predictors and LAeq
To explore the relationships in the dataset, we examined Pearson correlations between LAeq and the main predictors. The analysis showed that noise levels rose consistently with higher traffic activity: vehicle speed (r = 0.65), total flow (r = 0.63), and light vehicles (r = 0.63) all displayed strong positive associations, while heavy vehicles also contributed meaningfully (r = 0.53). In contrast, environmental factors such as humidity (r = 0.11) and temperature (r = –0.07) had little influence. The correlation patterns provide a useful baseline for interpreting the machine learning results presented in the following section.
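The Pearson coefficient used here is the covariance of two variables divided by the product of their standard deviations. The short Python sketch below computes it from first principles; the speed and LAeq values are hypothetical and chosen only to show the calculation, so the resulting coefficient is unrelated to the values reported above.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical paired observations, for illustration only (not study data).
speed_kmh = [40.0, 55.0, 60.0, 80.0, 100.0]
laeq_dba = [61.0, 66.5, 65.0, 72.0, 75.5]

r = pearson_r(speed_kmh, laeq_dba)
print(f"r = {r:.3f}")
```

Pearson's r captures only linear association, which is why a predictor with a modest coefficient can still matter in a nonlinear model.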
3.3. Interpretability Through SHAP Analysis
To enhance the transparency and explainability of the best-performing model, SHAP (SHapley Additive exPlanations) values were used to interpret how each input feature contributed to the model’s predictions, as shown in Figure 6. This form of post hoc, model-agnostic interpretation allows for the decomposition of individual predictions into additive contributions from each feature, making it possible to understand not just what a model predicts but why it predicts it.
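The additive decomposition can be made concrete with a small worked example. The Python sketch below computes exact Shapley values by enumerating every feature subset, which is feasible only for a handful of features; practical SHAP implementations rely on faster approximations or model-specific algorithms, and the source does not state which variant was used. The toy noise model, its coefficients, the feature names, and the baseline are all hypothetical. "Absent" features are represented by substituting their baseline values, which is one common convention among several.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values for one prediction.

    Features outside a coalition take their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi

def toy_noise_model(z):
    """Hypothetical LAeq model with a highway-by-speed interaction term."""
    is_highway, speed_kmh, heavy_per_hour = z
    return (55.0 + 6.0 * is_highway + 0.1 * speed_kmh
            + 0.02 * heavy_per_hour + 0.05 * is_highway * speed_kmh)

x = [1.0, 100.0, 200.0]        # a highway observation (hypothetical)
baseline = [0.0, 50.0, 50.0]   # reference observation (hypothetical)

phi = shapley_values(toy_noise_model, x, baseline)
for name, contribution in zip(["RoadType", "Speed", "HeavyV"], phi):
    print(f"{name}: {contribution:+.2f} dB(A)")
print(f"sum of contributions: {sum(phi):.2f}")
print(f"prediction minus baseline: {toy_noise_model(x) - toy_noise_model(baseline):.2f}")
```

The contributions sum to the difference between the prediction and the baseline prediction, which is the additivity property referred to above. The interaction term is shared between the road type and speed features, illustrating how a single feature's contribution can depend on the values of others.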
The SHAP summary plot revealed that RoadType and Location were the most influential features affecting LAeq predictions. Highways were consistently associated with increased noise levels, as indicated by large positive SHAP values. This reflects the substantial impact of road classification, where high-speed and high-volume traffic corridors generate significantly more noise. Likewise, the Location variable reflected geographic heterogeneity, with areas such as Al Farwaniyah and Zahra exhibiting elevated baseline noise levels, likely due to a mix of land use, traffic congestion, and surrounding built environment.
Traffic composition, represented by TotalTV, LightV, and HeavyV, significantly influenced model predictions. Notably, high volumes of light vehicles were not noise-neutral; on collector and arterial roads, frequent acceleration and braking led to elevated LAeq levels. Heavy vehicles, although fewer in number, had a more pronounced impact—especially on highways—owing to their powerful engines, greater mass, and prolonged acoustic exposure.
Environmental factors such as temperature and humidity had a moderate but measurable influence. High temperatures were often associated with elevated noise predictions, possibly due to increased engine output or changes in pavement-tire interaction. Humidity had a lesser effect but likely modulated acoustic propagation in subtle ways. While not dominant, these variables added interpretive nuance and enhanced model robustness.
Temporal variables such as PeakOffPeak, WeekendWeekdays, and Period demonstrated measurable impacts. Peak-hour and weekday observations were consistently linked to higher noise levels, aligning with Kuwait’s traffic flow patterns. The Period variable further revealed subtle differences between morning and afternoon sessions, particularly when coupled with ambient temperature, highlighting the potential for time-sensitive mitigation strategies.
The variable SPEED1 (average speed) showed dual behavior depending on road type. On lower-speed roads, modest speed increases were sometimes linked to smoother traffic flow and reduced noise. Conversely, on highways, higher speeds intensified aerodynamic and mechanical noise components. These effects were clearly depicted in the SHAP distributions and reinforce the need for context-aware interpretation.
4. Operational and Management Implications
This study’s results are relevant to transportation operations, urban planning, and the management of environmental noise, particularly in fast-growing countries such as Kuwait. Using machine learning models in combination with interpretability methods, the research achieved accurate estimates of traffic noise exposure and, at the same time, revealed the main factors that influence its variation. Such understanding can guide both operational practices and policy choices directed toward improving environmental quality and protecting public health.
Some patterns were especially clear. Road classification and the characteristics of individual sites had a strong influence on predicted noise, suggesting the value of location-specific planning. In certain high-exposure areas, the evidence supports measures such as buffer zones, green strips, or purpose-built barriers. Including predictive models in GIS tools could allow these considerations to be addressed during early planning stages rather than after issues appear.
Traffic composition also played a major role. Although heavy vehicles contributed strongly to overall noise, the data also showed that light vehicles generate comparable levels in congested, stop-and-go conditions. Focusing only on freight traffic would therefore not address the problem fully. Techniques such as adaptive signal timing, speed coordination, or congestion pricing could help keep flows steady and reduce noise spikes.
Environmental conditions, although secondary, still affected how noise propagated. Factoring live weather data into monitoring systems may help agencies develop seasonally responsive mitigation strategies.
The results have clear policy relevance. High exceedance rates across sites suggest that existing rules in Kuwait may not capture everyday traffic conditions. National limits could be revised using local evidence to make them more realistic, while still taking guidance from WHO standards. Zoning rules can also help by keeping schools, hospitals, and homes away from busy roads.
Public awareness matters too. Simple habits—like avoiding long idling, keeping vehicles in good condition, and following speed limits—can help bring noise down. Sharing this knowledge with planners and developers can also lead to designs that better protect communities.
Together, these findings show how interpretable ML tools can guide realistic policies and help cities act early to create quieter, healthier neighborhoods.
5. Conclusions and Future Directions
This research aimed to develop a practical machine learning framework for predicting and explaining traffic noise in Kuwait’s residential and mixed-use neighborhoods. By combining measured LAeq values with environmental, spatial, temporal, and traffic-related variables, the study offers a clear, data-grounded view of the factors shaping noise in a rapidly developing urban setting. Of the four models assessed—Linear Regression, SVM, Gaussian Process Regression, and Bagged Trees—the ensemble method achieved the strongest results, with a test R2 of 0.91 and an RMSE of 2.13 dB(A).
Interpretation was a central part of the analysis. SHAP results pointed most strongly to road classification, site location, and heavy vehicle flow, while also showing that the effects of light vehicle volume and speed depended on context. Temperature, humidity, and peak-hour traffic played smaller but consistent roles. Together, these findings confirm what earlier studies have suggested, yet they also provide more precise estimates that can guide targeted planning and noise control.
The developed framework can be applied in other contexts and understood in practical terms, even where older approaches fail to capture the full picture. The results point toward specific measures—restricting heavy vehicle movement during certain hours, adjusting speeds dynamically, and adding buffer zones in areas with high noise exposure—that could help reduce impacts. More broadly, it adds to ongoing work on data-led environmental management and shows how pairing machine learning with transparent interpretation methods can strengthen the way urban policies are shaped and applied.
Future studies may build on this framework in several ways. Extending monitoring over longer periods would capture patterns that short-term observations might miss. Adding acoustic propagation models could improve the accuracy of spatial predictions, while integrating the framework with GIS-based mapping would help planners identify potential noise hotspots before they emerge. In addition, cross-regional comparisons of model performance would further contextualize the results and benchmark Kuwait’s experience against international findings.
Taken together, such improvements would make noise governance more proactive—allowing cities to spot issues sooner, act with better precision, and steadily work toward creating quieter, healthier urban spaces.
Author Contributions
Conceptualization, J.A. (Jamal Almatawah), M.A., H.M., A.A. and J.A. (Jamal Alhubail); methodology, J.A. (Jamal Almatawah), M.A., H.M., A.A. and J.A. (Jamal Alhubail); software, M.A.; analysis and interpretation of the results, J.A. (Jamal Almatawah), M.A., H.M., A.A. and J.A. (Jamal Alhubail); data curation, J.A. (Jamal Almatawah), M.A., H.M., A.A. and J.A. (Jamal Alhubail); writing—original draft preparation, J.A. (Jamal Almatawah), M.A., H.M., A.A. and J.A. (Jamal Alhubail); writing—review and editing, J.A. (Jamal Almatawah), M.A., H.M., A.A. and J.A. (Jamal Alhubail); visualization, M.A.; supervision, J.A. (Jamal Almatawah), M.A., H.M., A.A. and J.A. (Jamal Alhubail). All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by the Public Authority for Applied Education and Training, Kuwait (Grant No. TS-16-03).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Dataset available on request from the authors.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- How Much Does Environmental Noise Affect Our Health? WHO Updates Methods to Assess Health Risks. Available online: https://www.who.int/europe/news/item/04-08-2024-how-much-does-environmental-noise-affect-our-health--who-updates-methods-to-assess-health-risks (accessed on 27 July 2025).
- Arregi, A.; Vegas, O.; Lertxundi, A.; Silva, A.; Ferreira, I.; Bereziartua, A.; Cruz, M.T.; Lertxundi, N. Road Traffic Noise Exposure and Its Impact on Health: Evidence from Animal and Human Studies—Chronic Stress, Inflammation, and Oxidative Stress as Key Components of the Complex Downstream Pathway Underlying Noise-Induced Non-Auditory Health Effects. Environ. Sci. Pollut. Res. 2024, 31, 46820–46839.
- Noise from Roads, Trains or Planes. Available online: https://www.gov.uk/noise-pollution-road-train-plane/noise-from-roads (accessed on 27 July 2025).
- Aircraft Noise. Available online: https://www.icao.int/environmental-protection/aircraft-noise (accessed on 27 July 2025).
- Noise. Available online: https://www.who.int/europe/news-room/fact-sheets/item/noise (accessed on 27 July 2025).
- Pal, D.; Bhattacharya, D. Effect of Road Traffic Noise Pollution on Human Work Efficiency in Government Offices, Private Organizations, and Commercial Business Centres in Agartala City Using Fuzzy Expert System: A Case Study. Adv. Fuzzy Syst. 2012, 2012, 828593.
- Stansfeld, S.A.; Matheson, M.P. Noise Pollution: Non-Auditory Effects on Health. Br. Med. Bull. 2003, 68, 243–257.
- Stansfeld, S.; Berglund, B.; Clark, C.; Lopez-Barrio, I.; Fischer, P.; Öhrström, E.; Haines, M.; Head, J.; Hygge, S.; van Kamp, I.; et al. Aircraft and Road Traffic Noise and Children’s Cognition and Health: A Cross-National Study. Lancet 2005, 365, 1942–1949.
- Evans, G.W.; Bullinger, M.; Hygge, S. Chronic Noise Exposure and Physiological Response: A Prospective Study of Children Living Under Environmental Stress. Psychol. Sci. 1998, 9, 75–77.
- Chauhan, B.S.; Kumar, S.; Garg, N.; Gautam, C. Evaluation and Analysis of Environmental Noise Levels in NCT of Delhi, India. MAPAN 2023, 38, 409.
- Al-Mutairi, N.Z.; Al-Attar, M.A.; Al-Rukaibi, F.S. Traffic-Generated Noise Pollution: Exposure of Road Users and Populations in Metropolitan Kuwait. Environ. Monit. Assess. 2011, 183, 65–75.
- Al-Mutairi, N.; Al-Rukaibi, F.; Koushki, P. Measurements and Model Calibration of Urban Traffic Noise Pollution. Am. J. Environ. Sci. 2009, 5, 613–617.
- Al-Masaeid, H.; Badandi, S. Modeling of Traffic Noise along Urban Arterials. Jordan J. Civ. Eng. 2024, 18, 334–345.
- Al-Masaeid, H.R.; Hani, Z.F.B. Effect of Pavement Roughness on Arterial Noise Using Different Vehicle Types. Int. J. Pavement Res. Technol. 2024, 17, 1367–1376.
- Patel, R.; Singh, P.K.; Saw, S. Traffic Noise Modeling under Mixed Traffic Condition in Mid-Sized Indian City: A Linear Regression and Neural Network-Based Approach. Int. J. Math. Eng. Manag. Sci. 2024, 9, 411–434.
- Danilevičius, A.; Danilevičienė, I.; Karpenko, M.; Stosiak, M.; Skačkauskas, P. Determination of the Instantaneous Noise Level Using a Discrete Road Traffic Flow Method. Promet Traffic Transp. 2025, 37, 71–85.
- Danilevičius, A.; Karpenko, M.; Křivánek, V. Research on the Noise Pollution from Different Vehicle Categories in the Urban Area. Transport 2023, 38, 1–11.
- Mehdi, M.R.; Kim, M.; Seong, J.C.; Arsalan, M.H. Spatio-Temporal Patterns of Road Traffic Noise Pollution in Karachi, Pakistan. Environ. Int. 2011, 37, 97–104.
- Al-Ghonamy, A.I. Assessment of Traffic Noise Pollution in Al-Khobar, a Typical City in the Kingdom of Saudi Arabia. Sci. J. King Faisal Univ. 2009, 10, 165–180.
- Jamrah, A.; Al-Omari, A.; Sharabi, R. Evaluation of Traffic Noise Pollution in Amman, Jordan. Environ. Monit. Assess. 2006, 120, 499–525.
- Environment Hong Kong. 2022. Available online: https://www.epd.gov.hk/epd/misc/ehk22/en/pdf1/web/Environment_Hong_Kong_2022_EN.pdf (accessed on 25 August 2025).
- Alrumaidhi, M.; Rakha, H.A. Factors Affecting Crash Severity among Elderly Drivers: A Multilevel Ordinal Logistic Regression Approach. Sustainability 2022, 14, 11543.
- Alrumaidhi, M.; Rakha, H.A. An Econometric Analysis to Explore the Temporal Variability of the Factors Affecting Crash Severity Due to COVID-19. Sustainability 2024, 16, 1233.
- Alrumaidhi, M.; Farag, M.M.G.; Rakha, H.A. Comparative Analysis of Parametric and Non-Parametric Data-Driven Models to Predict Road Crash Severity among Elderly Drivers Using Synthetic Resampling Techniques. Sustainability 2023, 15, 9878.
- Alazemi, F.; Alazmi, A.; Alrumaidhi, M.; Molden, N. Predicting Fuel Consumption and Emissions Using GPS-Based Machine Learning Models for Gasoline and Diesel Vehicles. Sustainability 2025, 17, 2395.
- Ali Khalil, M.; Hamad, K.; Shanableh, A. Developing Machine Learning Models to Predict Roadway Traffic Noise: An Opportunity to Escape Conventional Techniques. Transp. Res. Rec. 2019, 2673, 158–172.
- Kim, P.; Ryu, H.; Jeon, J.-J.; Chang, S.I. Statistical Road-Traffic Noise Mapping Based on Elementary Urban Forms in Two Cities of South Korea. Sustainability 2021, 13, 2365.
- Alsumaiei, A.A. Modeling the Onset of Drought Periods Using Explainable Machine Learning Models Enhanced by Bayesian Optimization. J. Hydrol. Eng. 2025, 30, 04025023.
- Alsumaiei, A.A. Interpretable Machine Learning Framework for Managing Shallow Water Table Rise in Urban Aquifers. Hydrol. Res. 2025, 56, 397–418.
- Fu, Q.; Wu, Y.; Zhu, M.; Xia, Y.; Yu, Q.; Liu, Z.; Ma, X.; Yang, R. Identifying Cardiovascular Disease Risk in the U.S. Population Using Environmental Volatile Organic Compounds Exposure: A Machine Learning Predictive Model Based on the SHAP Methodology. Ecotoxicol. Environ. Saf. 2024, 286, 117210.
- Helbich, M.; Hagenauer, J.; Burov, A.; Dzhambov, A.M. Traffic Noise Assessment in Urban Bulgaria Using Explainable Machine Learning. Sustain. Cities Soc. 2025, 120, 106169.
- Brüel & Kjær 2250 Sound Analysis Specifications—BZ 7222: B&K 2250 BZ 7222. Available online: https://www.gracey.co.uk/specifications/bk-2250-so-7222.htm (accessed on 29 July 2025).
- manual_countCAM2. Product Manual. Available online: https://f.hubspotusercontent10.net/hubfs/6192260/manual_countCAM2.pdf (accessed on 1 August 2025).
- Decatur-Genesis-Scout. Available online: https://guelph.ca/wp-content/uploads/Decatur-Genesis-Scout.pdf (accessed on 10 August 2025).
- Ambient Weather WM-5 Handheld Weather Station. Available online: https://ambientweather.com/wm-5-handheld-weather-station (accessed on 29 July 2025).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).