1. Introduction
Road traffic noise is a persistent environmental burden in urban streets, affecting both adjacent buildings and outdoor activity spaces [1,2]. Beyond traffic flow and vehicle composition, the receptor-side acoustic environment is also shaped by sound propagation through the surrounding urban geometry. Reviews and recent studies show that street width, façade configuration, enclosure characteristics, and related spatial-form variables can substantially influence reflection, attenuation, and local exposure patterns [3,4,5]. Receptor locations close to the traffic lanes, including those in bicycle lanes, are therefore important near-road exposure positions for street design [6,7,8,9]. Although the effects of individual morphology factors have been discussed repeatedly, an integrated and interpretable framework for predicting noise from street spatial parameters alone is still rare, particularly for receptors close to the carriageway [1,4,5]. We address this gap by developing and comparing interpretable machine-learning models for the simulated cyclist-side sound pressure level (SPL) under fixed source conditions and by using the Random Forest (RF) model to examine the relative importance, marginal effects, and interactions of key spatial variables.
Existing studies on road-traffic-noise prediction can be grouped into two strands [10,11]. The first comprises empirical and engineering-based frameworks, including nationally or regionally standardized models such as FHWA and CNOSSOS-EU. These frameworks remain essential for regulatory assessment but are mainly driven by source-related descriptors such as traffic volume, vehicle speed, and fleet composition [10]. The second comprises data-driven approaches, where artificial-intelligence and machine-learning applications have grown rapidly [11]. Artificial neural network (ANN) and ensemble-learning models, in particular, can improve predictive performance when sufficient traffic and contextual data are available [12,13,14,15,16,17]. Most of these models, however, are still predominantly source-oriented, while urban-form studies tend to examine a limited set of morphology indicators or focus on block-scale noise distribution rather than receptor-specific exposure in street canyons [4,5,18,19]. Three questions therefore remain open: how far urban street spatial parameters alone can explain the near-road SPL under fixed source conditions, which parameters contribute most, and whether their effects are nonlinear or interactive.
Morphology-oriented research has shown that urban form can reshape traffic-noise distribution across multiple spatial scales. Street-canyon studies have demonstrated that façade configuration and canyon geometry substantially alter roadside exposure patterns [
20], and Lee and Kang highlighted the role of the height-to-width ratio in sound propagation along urban streets [
18]. Subsequent work extended this perspective from individual canyons to broader urban morphologies [
21,
22,
23,
24,
25]. Forssén et al. showed that different urban morphologies can modify façade, sidewalk, and shielded-yard exposure through both direct and indirect propagation paths [
26]. At the city and block scales, Montenegro et al. reported that urban features support street classification in road-traffic-noise estimation [
1], Zhou et al. identified building layout, road organization, and land-use factors as important determinants of traffic-noise distribution in high-density cities [
4], and Yang et al. proposed a spatial-form-based prediction model for residential blocks [
5]. Explainable machine-learning research further suggests that built-environment and road-related predictors often act on traffic noise in non-linear ways [
15]. Even so, most existing evidence concerns area-wide noise mapping, block-scale exposure, or a relatively small set of geometric descriptors; interpretable, receptor-specific prediction frameworks for near-road street sections remain comparatively rare.
Against this background, the present study addresses three questions: (1) Can urban street spatial parameters alone achieve high predictive performance for sound pressure level at bicycle-lane receptor locations under controlled source conditions? (2) Which spatial parameters contribute most to predictive performance? (3) Do these parameters show threshold-like marginal effects or interaction effects?
To answer these questions, 5060 simulation cases derived from real street morphologies were used to train and evaluate four models: Linear Regression, SVR, XGBoost, and RF. Linear Regression provides a transparent linear baseline; SVR and XGBoost serve as nonlinear benchmarks, included to test whether RF performed favorably only relative to a weak linear comparator. The predictive comparison thus rests on all four models. XGBoost achieved the highest predictive accuracy among the tested models. RF reached comparable accuracy and offers a more direct post hoc interpretation framework through permutation importance and partial dependence analysis, and was therefore used for the interpretation of morphology–SPL relationships. Deep-learning models have shown strong capability in adjacent engineering prediction tasks, such as structural damage identification under varying temperatures, bridge-response prediction, missing measurement-data recovery, and structural-response reconstruction [
27,
28,
29,
30]. The present study, however, uses structured tabular morphology variables rather than image sequences or temporal sensor data, so SVR, XGBoost, and RF were considered more suitable for balancing nonlinear predictive capacity, computational efficiency, and post hoc interpretability in this morphology-based screening task.
Compared with our earlier work on the morphological determinants of street sound propagation [
19], this study advances the analysis on four points: (i) the prediction target shifts from general sound-propagation characteristics to the SPL at bicycle-lane receptor locations, focusing on a specific near-road exposure setting; (ii) the analysis is reformulated as a design-oriented surrogate model for rapid comparative assessment, with post-training prediction efficiency explicitly benchmarked against ODEON simulation time; (iii) linear and nonlinear machine-learning models—Linear Regression, SVR, XGBoost, and RF—are compared under both repeated random-split evaluation and road-name-based grouped holdout validation, while the RF model is reserved for interpretable analysis of variable importance, marginal response patterns, and interaction tendencies; and (iv) the outputs are extended from a list of influential parameters to importance rankings, threshold-like marginal responses, and interaction patterns among key street-form variables. The work is thus a receptor-specific and design-support-oriented extension of the earlier analysis rather than a repetition of it.
2. Materials and Methods
2.1. Overall Workflow
This study builds on our earlier morphology-oriented street-acoustics work [
19] and reformulates it as a receptor-specific prediction task. The previously established street-parameter framework was retained, while the prediction target, receptor setting, and modeling approach were redefined for cyclist-side SPL estimation under controlled simulation conditions.
The workflow comprised eight steps. (1) A total of 5060 street sections from 195 streets in Harbin were sampled and their spatial parameters were extracted. (2) A standardized street-acoustic simulation model was set up in ODEON 14.00 Combined for each section under fixed source conditions. (3) The simulated SPL at the bicycle-lane receptor location was used as the prediction target. (4) Four models—Linear Regression, SVR, XGBoost, and RF—were developed and compared with the same input variables and the same train–test partitions. (5) Inter-variable dependence among the 12 morphological predictors was diagnosed using Pearson correlation coefficients and variance inflation factors. (6) A reduced-variable RF sensitivity analysis examined whether comparable performance could be reached with the dominant width-related variables alone or with a smaller subset of morphology descriptors. (7) RF was used for the subsequent interpretation of feature importance and partial dependence patterns, for the reasons given in
Section 2.5. (8) A computation-time comparison evaluated the post-training prediction efficiency of the machine-learning models relative to ODEON 14.00 Combined simulation.
2.2. Street Samples
The dataset contained 5060 street sections from 195 streets in Harbin, China—2652 arterial-street sections, 1058 secondary-trunk-street sections, and 1350 branch-street sections. Each section was 200 m long. Urban expressways and elevated roads were excluded because their cross-sectional configurations differ substantially from the receptor-oriented street setting adopted here. Harbin was selected for the heterogeneity of its urban fabric, which contains a wide range of traditional and modern street forms suitable for model development. It is treated here as a morphologically diverse case city rather than as a representative proxy for all Chinese cities. The street database follows the same city context and sampling logic as our earlier morphology-oriented street-acoustics study, where the representativeness of Harbin’s street system and the rationale for stratified sampling are described in greater detail [
19].
2.3. Spatial Parameters
The 12 input variables were adapted from a previously developed street-morphology parameter system for urban sound-propagation analysis [19]. They were retained because they capture three aspects of street form directly relevant to near-road SPL: width configuration, façade proximity, and enclosure characteristics. Specifically, Wvehicle and Wside describe the effective source-to-receptor and façade-reflection distances, respectively; Bp indicates whether a façade lies directly in front of the receptor; and H, Cs, and Cp describe zonal façade height, cross-sectional enclosure, and plan enclosure, respectively. The detailed derivation and geometric interpretation of these parameters are reported in the previous study [19]; only the definitions needed for the present model are summarized here.
Street width—the distance between the façades on both sides of a street—has been widely used in earlier studies of urban-street sound propagation. In practice, however, streets with the same total width may differ substantially in vehicle-lane and sidewalk widths, leading to different source-to-receptor and façade-reflection distances. Total street width was therefore not used directly. Instead, two width-related variables, vehicle-lane width (
Wvehicle) and sidewalk width (
Wside), were used to characterize the width configuration.
Figure 1 illustrates the street configuration, receptor location, zonal division, and spatial parameters used in the model.
The immediate presence of a façade in front of the receptor may also influence the local SPL. Hall et al. reported that a building façade can raise the sound pressure level at a point 2 m in front of it by a mean of 3.2 ± 0.2 dB (95% CI), except at low frequencies [31]. A binary variable, Bp, was therefore introduced to indicate whether a building façade exists directly in front of the receptor point: Bp = 1 indicates presence and Bp = 0 indicates absence. This variable captures a local reflective condition that may differ even between street sections with otherwise similar geometry.
Because façade-related influences are distance-dependent, each street section was divided into three longitudinal zones according to distance from the bicycle-lane receptor.
Figure 1 also defines these zones along both directions of the street axis. The near zone covers the street portion within 0–30 m of the receptor on both sides, giving a total longitudinal length of 60 m. The mid zone covers the next 30 m band on both sides (30–60 m from the receptor), again 60 m in total length. The far zone covers the following 30 m band on both sides (60–90 m from the receptor), with the same total length of 60 m. This symmetric distance-based division was adopted to represent the expected decay of façade-related reflection effects with increasing distance from the receptor: the near zone captures receptor-proximal façade geometry most likely to influence local SPL and early reflections, the mid zone represents intermediate street-boundary conditions, and the far zone reflects more distant façade-continuity effects. The enclosure-related parameters
H,
Cs, and
Cp were therefore calculated separately for these three zones so that the spatial gradient of morphological influence could be retained.
To facilitate parameter calculation, each street section was centered at the receptor point and divided along the street axis at 3 m intervals. For a 200 m street section, only the 61 middle cross-sections were considered; these span 90 m on either side of the receptor and therefore coincide with the extent of the three zones. On each cross-section, the façade heights on both sides (Hoi and Hni) were obtained, together with the angles between the ground and the lines joining the top of each façade to the center point of the street. When Hoi or Hni was non-zero, a façade was considered present and the corresponding façade length (Poi or Pni) was recorded as 3 m; otherwise, it was recorded as 0. These geometric quantities formed the basis for calculating H, Cs, and Cp.
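The discretization described above can be sketched in a few lines. This is a minimal illustration, not the procedure of Equations (1)–(3): the function names, the rule that a cross-section lying exactly on a 30 m or 60 m boundary belongs to the zone nearer the receptor, and the choice to average both sides with gaps counted as 0 m are all assumptions made for the example.

```python
import numpy as np

STEP_M = 3.0        # spacing between cross-sections (stated in the text)
HALF_SPAN_M = 90.0  # the 61 middle cross-sections cover -90 m ... +90 m

# Signed distance of each cross-section from the receptor along the street axis.
offsets = np.arange(-HALF_SPAN_M, HALF_SPAN_M + STEP_M, STEP_M)

def zone_masks(offsets):
    """Boolean masks for the near (0-30 m), mid (30-60 m) and far (60-90 m) zones.
    Assumption: boundary cross-sections go to the zone nearer the receptor."""
    d = np.abs(offsets)
    return {
        "near": d <= 30.0,
        "mid": (d > 30.0) & (d <= 60.0),
        "far": (d > 60.0) & (d <= 90.0),
    }

def facade_lengths(heights, step=STEP_M):
    """Facade length per cross-section: 3 m where a facade is present, else 0."""
    return np.where(np.asarray(heights, dtype=float) > 0, step, 0.0)

def zonal_mean_height(h_o, h_n, offsets):
    """Illustrative zonal mean facade height over both street sides.
    Assumed reading only; gaps enter the average as 0 m."""
    both = np.stack([np.asarray(h_o, dtype=float), np.asarray(h_n, dtype=float)])
    return {zone: float(both[:, mask].mean()) for zone, mask in zone_masks(offsets).items()}

# Hypothetical section: continuous 12 m facades on one side, 18 m facades on
# the other side with a gap within 9 m of the receptor.
h_o = np.full(offsets.size, 12.0)
h_n = np.where(np.abs(offsets) <= 9.0, 0.0, 18.0)
print(zonal_mean_height(h_o, h_n, offsets))
```

Under the assumed boundary rule, the near zone contains 21 cross-sections and the mid and far zones 20 each, which together account for the 61 cross-sections.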
The mean façade height (
H) represents the average vertical dimension of street façades within each zone. Because façade height often varies substantially along an urban street, a single height descriptor is insufficient.
H was therefore calculated separately for the near, mid, and far zones, as shown in Equations (1)–(3).
The cross-sectional enclosure degree (Cs) and plan enclosure degree (Cp) describe street enclosure from sectional and plan perspectives, respectively, following the framework established in the previous study [19]. Cs reflects how strongly the receptor is enclosed by the surrounding street cross-section, while Cp describes façade continuity in plan view and so indirectly indicates the presence of building gaps. Both parameters were also calculated separately for the near, mid, and far zones. Their calculation procedures are given in Equations (4)–(9).
Together, the adopted variables represent complementary dimensions of urban street morphology:
Wvehicle and
Wside describe width configuration;
Bp captures immediate front-façade presence at the receptor;
H describes average façade height;
Cs describes cross-sectional enclosure; and
Cp describes plan enclosure and façade continuity. Calculating
H,
Cs, and
Cp separately for the near, mid, and far zones additionally accounts for the distance-decaying influence of surrounding street geometry on the cyclist-side SPL.
Table 1 summarizes all input variables.
2.4. Acoustic Simulation Settings
ODEON 14.00 Combined (ODEON A/S, Kgs. Lyngby, Denmark) was used to simulate sound propagation in urban street environments. ODEON has been widely applied and previously validated in street-acoustics research [32,33], and earlier work has reported acceptable agreement between ODEON-based simulation and field measurements in urban-street contexts, supporting its use for morphology-related sound-propagation analysis [18,34,35].
For each sampled section, a 200 m street model was built. A fixed source-power condition was adopted to isolate the contribution of street morphology to the receptor-side SPL and to avoid confounding from the dynamic traffic-source variability. A receptor point representing a cyclist was placed in the bicycle lane at a height of 1.5 m and 1 m from the edge of the vehicle lane. The height approximates the ear height of a cyclist, and the lateral position represents a near-traffic bicycle-lane condition. Ground and façade absorption/scattering coefficients and other ODEON parameters were set according to previous studies. Under these conditions, the simulations capture variation in the cyclist-side SPL attributable to morphology under a fixed source input.
The absorption and scattering coefficients of the ground were both set to 0.1 [36]. The absorption coefficient of façades was set to 0.1 [36], and the scattering coefficient of façades was set to 0.3 [18]. For ODEON simulation parameters, the number of rays was set to 1,000,000, the transition order to 2, the number of reflection rays to 2000, and the impulse response length to 5000 ms [18].
2.5. Model Development and Evaluation
The simulated SPL at the bicycle-lane receptor location was used as the prediction target, with the 12 spatial parameters described above as input variables. To address both linear and nonlinear morphology–SPL relationships, four models were evaluated: Linear Regression, support vector regression (SVR), extreme gradient boosting (XGBoost), and Random Forest (RF). Linear Regression provides a transparent linear baseline; SVR and XGBoost serve as nonlinear benchmarks; RF is the model retained for post hoc interpretation, because it combines nonlinear predictive flexibility with a relatively direct interpretation framework based on permutation importance and partial dependence analysis.
The SPL was retained as a continuous regression target rather than discretized into categorical noise-level classes. Continuous prediction preserves the gradual morphology–SPL variations expected under fixed source-power conditions and allows the subsequent partial dependence analysis to recover threshold-like response ranges from the trained model itself, rather than imposing predefined class boundaries before modeling. A categorical formulation is noted in
Section 6 as a planning-oriented extension once external validation against field measurements becomes possible.
For an initial benchmark, the full dataset was randomly divided into a training set (80%) and a test set (20%), with the same split used for all four models. Hyperparameter optimization for SVR, XGBoost, and RF was carried out on the training subset only, using RandomizedSearchCV with five-fold cross-validation. SVR was implemented with a radial-basis-function kernel inside a standardization pipeline because of its sensitivity to predictor scale. XGBoost and RF were fitted with tree-based ensemble regressors. Linear Regression was fitted directly without hyperparameter tuning. The hyperparameter search spaces for SVR, XGBoost, and RF are reported in
Supplementary Table S2.
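The tuning protocol described above can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not the study's configuration: the search spaces are placeholders (the actual ones are in Supplementary Table S2), the data-generating formula is invented, and XGBoost is left out so that the example depends only on scikit-learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the 12 morphology predictors and the simulated SPL.
rng = np.random.default_rng(0)
X = rng.uniform(size=(300, 12))
y = 70.0 - 6.0 * X[:, 0] - 2.0 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=300)

# One 80/20 split shared by every model.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Placeholder search spaces (assumptions, not the values in Table S2).
searches = {
    # SVR sits inside a standardization pipeline because it is scale-sensitive.
    "SVR": RandomizedSearchCV(
        make_pipeline(StandardScaler(), SVR(kernel="rbf")),
        param_distributions={
            "svr__C": [1, 10, 100],
            "svr__gamma": ["scale", 0.01, 0.1],
            "svr__epsilon": [0.01, 0.1],
        },
        n_iter=5, cv=5, scoring="neg_root_mean_squared_error", random_state=0,
    ),
    "RF": RandomizedSearchCV(
        RandomForestRegressor(random_state=0),
        param_distributions={
            "n_estimators": [50, 100],
            "max_depth": [None, 8],
            "min_samples_leaf": [1, 3],
        },
        n_iter=4, cv=5, scoring="neg_root_mean_squared_error", random_state=0,
    ),
}

# Linear Regression is fitted directly, without tuning.
models = {"Linear Regression": LinearRegression().fit(X_tr, y_tr)}
for name, search in searches.items():
    search.fit(X_tr, y_tr)                # five-fold CV on the training subset only
    models[name] = search.best_estimator_  # already refitted on the full training subset

test_r2 = {name: model.score(X_te, y_te) for name, model in models.items()}
print(test_r2)
```

Because the scaler is part of the pipeline, it is re-estimated inside each cross-validation fold, so no information from validation folds or the test subset enters the standardization.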
To test whether model performance was robust to sample-level data partitioning, repeated random-split evaluation was performed across 20 independent 80/20 train–test partitions. In each repetition, the same training and test subsets were used for all four models, and hyperparameter optimization was conducted on the training subset only; the best estimator was then refitted on the full training subset and evaluated on the corresponding test subset.
As a stricter robustness check, road-name-based grouped holdout validation was additionally performed using road name as the grouping variable [
37,
38]. In each of 20 repetitions, approximately 20% of road-name groups were withheld as the test set, ensuring that all samples from a held-out road were excluded from training. For SVR, XGBoost, and RF, hyperparameter tuning was carried out on the training road groups only, using RandomizedSearchCV with GroupKFold cross-validation. The same grouped partitions were used across all four models.
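A grouped holdout of this kind can be sketched as follows, assuming scikit-learn's `GroupShuffleSplit` for the outer road-level split and `GroupKFold` for the inner tuning folds. The synthetic data, the road-level offset used to mimic within-road similarity, the placeholder search space, and the use of 3 rather than 20 repetitions are all assumptions made to keep the example short.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupKFold, GroupShuffleSplit, RandomizedSearchCV

# Synthetic data: 30 hypothetical roads with 10 sections each.
rng = np.random.default_rng(1)
n_roads, per_road = 30, 10
groups = np.repeat(np.arange(n_roads), per_road)  # road identifier per section
X = rng.uniform(size=(groups.size, 12))
road_effect = rng.normal(scale=0.5, size=n_roads)  # shared within each road
y = 70.0 - 6.0 * X[:, 0] + road_effect[groups] + rng.normal(scale=0.1, size=groups.size)

# Outer loop: withhold about 20% of roads (not of sections) in each repetition.
outer = GroupShuffleSplit(n_splits=3, test_size=0.2, random_state=0)
r2_scores, shared_roads = [], []
for tr, te in outer.split(X, y, groups):
    # Record any road that appears on both sides of the split (should be none).
    shared_roads.append(set(groups[tr]) & set(groups[te]))
    search = RandomizedSearchCV(
        RandomForestRegressor(random_state=0),
        param_distributions={
            "n_estimators": [50, 100],
            "max_depth": [None, 8],
            "min_samples_leaf": [1, 3],
        },
        n_iter=3, cv=GroupKFold(n_splits=5), random_state=0,
    )
    # Inner tuning folds are also split by road, using training roads only.
    search.fit(X[tr], y[tr], groups=groups[tr])
    r2_scores.append(search.score(X[te], y[te]))

print(np.mean(r2_scores), np.std(r2_scores, ddof=1))
```

Passing `groups` to both the outer splitter and the inner search is what keeps sections of the same road from informing hyperparameter selection for that road's own held-out evaluation.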
Model performance was assessed with the coefficient of determination (R2), root mean squared error (RMSE), and mean absolute error (MAE). These metrics were calculated for all four models under the illustrative random split, repeated random-split evaluation, and road-name-based grouped holdout validation. XGBoost was used to indicate the highest achievable predictive performance among the tested models, while RF was retained for the subsequent interpretation of feature importance and marginal response patterns, since it reached comparable predictive accuracy and supports a more direct post hoc interpretation framework.
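For reference, the three metrics can be written out directly. The sketch below uses the standard definitions; the aggregation helper assumes that the "±" values reported for repeated evaluations are sample standard deviations across repetitions, which is an assumption rather than something stated in the text.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """R2, RMSE and MAE for one train-test partition (standard definitions)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": float(np.sqrt(np.mean(resid ** 2))),
        "MAE": float(np.mean(np.abs(resid))),
    }

def summarize(per_split):
    """Mean and sample SD of each metric across repetitions (assumed convention)."""
    keys = per_split[0].keys()
    return {
        k: (float(np.mean([m[k] for m in per_split])),
            float(np.std([m[k] for m in per_split], ddof=1)))
        for k in keys
    }

# Toy values in dB, for illustration only.
example = regression_metrics([60, 62, 64, 66], [61, 62, 63, 66])
print(example)
```

RMSE and MAE carry the unit of the target (dB here), whereas R2 is dimensionless, which is why the three are reported together.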
2.6. Inter-Variable Dependence and Multicollinearity Diagnosis
Because the same 12 morphological predictors were used in all four models, the dependence structure among predictors was examined before model interpretation. Pearson correlation coefficients were calculated for all predictor pairs and between each predictor and simulated SPL. Variance inflation factors (VIFs) were also computed to diagnose multicollinearity in the shared predictor set.
The VIF analysis primarily targets the Linear Regression baseline, whose coefficients are sensitive to multicollinearity. For nonlinear models—SVR, XGBoost, and RF—correlated predictors do not invalidate predictive comparison, but they may introduce redundant information and affect the interpretation of feature-importance rankings. Feature importance in tree-based models is therefore interpreted as model reliance under the observed predictor-dependence structure rather than as an independent causal effect.
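Both diagnostics can be computed from the predictor matrix alone. The sketch below uses the identity that the VIF of predictor j equals the j-th diagonal element of the inverse Pearson correlation matrix, which is equivalent to 1/(1 − R_j²) from regressing predictor j on the others with an intercept. The variable values are synthetic and the column names are illustrative; whether the study used this route or a statistics package is not stated.

```python
import numpy as np

def pearson_matrix(X):
    """Pearson correlation matrix of the columns of X."""
    return np.corrcoef(X, rowvar=False)

def vif(X):
    """Variance inflation factors as the diagonal of the inverse correlation matrix."""
    return np.diag(np.linalg.inv(pearson_matrix(X)))

# Synthetic example: two strongly related zonal variables and one unrelated width.
rng = np.random.default_rng(0)
cs_near = rng.uniform(0.1, 0.6, size=500)
cs_mid = cs_near + rng.normal(scale=0.05, size=500)  # correlated with cs_near
w_side = rng.uniform(0.0, 21.0, size=500)            # independent of both
X = np.column_stack([cs_near, cs_mid, w_side])

for name, value in zip(["Cs(n)", "Cs(m)", "Wside"], vif(X)):
    print(f"{name}: VIF = {value:.2f}")
```

In this toy case the two correlated enclosure columns receive inflated VIFs while the independent width column stays close to 1, mirroring the qualitative pattern that such a diagnosis is meant to reveal.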
2.7. Reduced-Variable Sensitivity Analysis
To examine whether the full 12-variable morphology set was needed, a reduced-variable sensitivity analysis was performed using the RF model. This addresses the possibility that cyclist-side SPL prediction is mainly driven by a small number of dominant width-related predictors. Five nested predictor sets were compared: Wvehicle only; Wvehicle and Wside; Wvehicle, Wside, and Cs(n); Wvehicle, Wside, Cs(n), Cp(n), and H(n); and the full 12-variable morphology set.
The same road-name-based grouped holdout framework used in the main model evaluation was adopted. In each repetition, approximately 20% of road-name groups were withheld as the test set, so that all samples from the same road group were kept within either the training or test subset. For each reduced-variable RF model, hyperparameter optimization was performed on the training road groups only, using RandomizedSearchCV with GroupKFold cross-validation. The same grouped partitions were used across all predictor sets, so that performance differences reflected differences in input-variable composition rather than in data splitting.
This analysis was used to determine whether the full morphology set provided additional predictive or interpretive value beyond the dominant width-related variables. The results are reported in
Supplementary Table S4.
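The nested comparison can be sketched as a loop over predictor subsets that reuses one fixed list of grouped partitions. This is an illustration under several assumptions: the column names are hypothetical labels for the 12 variables, the data are synthetic, only 3 repetitions are run, and per-set hyperparameter tuning is omitted for brevity even though the study tuned each reduced model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical column labels for the 12 morphology variables.
COLUMNS = ["Wvehicle", "Wside", "Bp", "H_n", "H_m", "H_f",
           "Cs_n", "Cs_m", "Cs_f", "Cp_n", "Cp_m", "Cp_f"]

# The five nested predictor sets listed in the text.
NESTED_SETS = {
    "Wvehicle": ["Wvehicle"],
    "+Wside": ["Wvehicle", "Wside"],
    "+Cs(n)": ["Wvehicle", "Wside", "Cs_n"],
    "+Cp(n)+H(n)": ["Wvehicle", "Wside", "Cs_n", "Cp_n", "H_n"],
    "full": COLUMNS,
}

# Synthetic data in which SPL depends on Wvehicle, Wside and Cs_n only.
rng = np.random.default_rng(2)
groups = np.repeat(np.arange(30), 10)  # 30 hypothetical roads, 10 sections each
X = rng.uniform(size=(groups.size, len(COLUMNS)))
col = {name: i for i, name in enumerate(COLUMNS)}
y = (70.0 - 6.0 * X[:, col["Wvehicle"]] - 3.0 * X[:, col["Wside"]]
     + 1.5 * X[:, col["Cs_n"]] + rng.normal(scale=0.1, size=groups.size))

# Materialize the grouped partitions once so every predictor set sees the same splits.
splits = list(GroupShuffleSplit(n_splits=3, test_size=0.2, random_state=0)
              .split(X, y, groups))

results = {}
for label, names in NESTED_SETS.items():
    idx = [col[n] for n in names]
    scores = []
    for tr, te in splits:
        rf = RandomForestRegressor(n_estimators=100, random_state=0)
        rf.fit(X[tr][:, idx], y[tr])
        scores.append(rf.score(X[te][:, idx], y[te]))  # R2 on held-out roads
    results[label] = float(np.mean(scores))

print(results)
```

Fixing the partitions before the loop is the design choice that lets differences between rows be attributed to input-variable composition rather than to data splitting.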
2.8. Model Interpretation
For the reasons given in Section 2.5, RF was used for the subsequent interpretation analysis. Permutation feature importance and partial dependence analysis were applied to examine how the trained RF model used the morphological predictors. Permutation importance quantifies the decrease in predictive accuracy after random reshuffling of a given feature [39,40]. Partial dependence plots (PDPs) describe average marginal response patterns of predicted SPLs across the observed predictor ranges [41,42]. Two-dimensional PDPs were also used to explore whether the influence of selected spatial parameters varied under different vehicle-lane-width conditions [41].
Because several predictors were intercorrelated, the interpretation of these tools was deliberately constrained. Permutation importance is read as RF model reliance under the observed predictor-dependence structure, not as the independent causal contribution of each variable. PDPs likewise summarize average model-based response tendencies, rather than isolated physical effects across all possible variable combinations.
2.9. Computational Efficiency Comparison
To test whether the proposed framework provides a practical computational advantage as a morphology-screening tool, a computation-time comparison was performed between ODEON simulation and machine-learning prediction. The ODEON simulation time was obtained from 93 representative street-section models drawn from the original simulation records. These models covered different street configurations and were simulated using the same ODEON version and acoustic settings described in
Section 2.4.
All timing tests were run on the same workstation, an Intel Core i9-9900K CPU with 64 GB RAM running Windows 10 64-bit. For the machine-learning models, training time and post-training prediction time were measured in Python 3.12.7 on the same dataset and hardware. Training time and prediction time are reported separately because the practical efficiency of a surrogate model mainly concerns rapid prediction after training. Prediction time was measured for all 5060 street sections and converted to an average per-section value. The complete computation-time comparison is reported in
Supplementary Table S5.
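The separation of training time from post-training prediction time can be sketched with `time.perf_counter`. The model, data size, and single-run timing below are assumptions chosen for illustration; the actual measurements and protocol are those reported in Supplementary Table S5.

```python
import time

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the morphology table (size chosen arbitrarily).
rng = np.random.default_rng(4)
X = rng.uniform(size=(1000, 12))
y = 70.0 - 6.0 * X[:, 0] + rng.normal(scale=0.1, size=1000)

model = RandomForestRegressor(n_estimators=100, random_state=0)

# Training time: a one-off cost paid before the surrogate is used.
t0 = time.perf_counter()
model.fit(X, y)
train_s = time.perf_counter() - t0

# Prediction time: measured over all sections, then averaged per section.
t0 = time.perf_counter()
pred = model.predict(X)
predict_s = time.perf_counter() - t0
per_section_ms = 1000.0 * predict_s / len(X)

print(f"training: {train_s:.3f} s, prediction: {per_section_ms:.4f} ms per section")
```

Batch prediction amortizes fixed call overhead over all rows, so a per-section average obtained this way is typically lower than the latency of predicting one section at a time.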
3. Results
3.1. Descriptive Statistics, Preliminary Relationships, and Inter-Variable Dependence
Figure 2 summarizes the distributions of the 12 spatial parameters and the simulated SPL values. Among the width-related variables,
Wside was concentrated mainly within 0–21 m, whereas
Wvehicle was more evenly distributed across a wider range of 7–42 m. For the zonal variables
H,
Cs, and
Cp, distribution patterns were broadly similar across the near, mid, and far zones.
H and
Cs both showed skewed distributions, with
H concentrated mainly between 6 and 24 m and
Cs between 0.1 and 0.6.
Cp was distributed more heavily towards higher values, suggesting that street sections with very low plan-enclosure values were uncommon in the sampled dataset. The simulated SPL values were mainly distributed between 62 and 70 dB.
The correlation matrix in Figure 3 shows that simulated SPL was most strongly correlated with the width-related variables. Both Wvehicle and Wside were negatively correlated with SPL, indicating that larger source–receptor and receptor–façade distances were generally associated with lower simulated noise levels. The correlation was stronger for Wvehicle (r = −0.84, p < 0.01) than for Wside (r = −0.60, p < 0.01), suggesting that vehicle-lane width was the more influential width descriptor at the bivariate level. Among the enclosure-related variables, Cs showed the strongest positive correlations with SPL (r = 0.68–0.72, p < 0.01), while Cp showed weaker positive correlations (r = 0.22–0.35, p < 0.01). Bp had a weak positive correlation with SPL (r = 0.20, p < 0.01), and H showed only very low positive correlations (r = 0.04–0.09, p < 0.01).
Predictor–predictor correlations further showed that the 12 morphological variables were not fully independent. Within the same zone, Cs and Cp were strongly correlated, with correlation coefficients of approximately 0.70–0.71. Moderate correlations were also observed between H and the enclosure-related variables. For the same morphological descriptor, inter-zone correlations were evident, especially for Cs. The zonal variables thus share some common geometric information, even though they still represent distance-specific descriptions of street morphology.
The correlation between the two most influential width-related variables, Wvehicle and Wside, was moderate rather than high (r = 0.33). They should therefore not be treated as redundant measures of a single width factor: Wvehicle mainly represents the source–receptor distance, while Wside mainly represents the receptor–façade reflection distance. The two variables are related components of street-width configuration but describe different acoustic mechanisms.
VIF analysis confirmed that multicollinearity was concentrated among the zonal enclosure and height variables rather than between Wvehicle and Wside (Supplementary Table S3). Cs(n), Cs(m), and Cs(f) showed high VIF values, followed by the H-related variables. Wside and Bp had low VIF values, and Wvehicle showed only moderate multicollinearity. The Linear Regression coefficients should therefore not be read as independent morphological effects, and Linear Regression was retained only as a transparent predictive baseline. For the nonlinear models, the correlated predictor structure does not invalidate model comparison but requires cautious interpretation of feature-importance results.
3.2. Model Performance
Because a comparison against a linear baseline alone would be a weak test of RF, two additional nonlinear models, SVR and XGBoost, were included.
Table 2 summarizes the predictive performance of Linear Regression, SVR, XGBoost, and RF under the illustrative random split, repeated random-split evaluation, and road-name-based grouped holdout validation.
In the illustrative random 80/20 split (Figure 4), all nonlinear models clearly outperformed Linear Regression. The linear baseline reached an R2 of 0.900, an RMSE of 0.895 dB, and an MAE of 0.722 dB. SVR substantially improved the prediction (R2 = 0.990, RMSE = 0.289 dB, MAE = 0.173 dB), and XGBoost achieved the highest accuracy in this split (R2 = 0.997, RMSE = 0.155 dB, MAE = 0.103 dB). RF showed similarly strong performance (R2 = 0.996, RMSE = 0.183 dB, MAE = 0.112 dB). Predicted values from XGBoost and RF aligned closely with the simulated SPL values, while Linear Regression showed greater dispersion around the 1:1 line. The morphology–SPL relationship in this dataset was thus better captured by nonlinear models than by a purely linear one.
Across 20 repeated random 80/20 splits, the same overall ranking held. Linear Regression yielded R2 = 0.897 ± 0.006, RMSE = 0.891 ± 0.019 dB, MAE = 0.718 ± 0.016 dB. SVR improved performance to R2 = 0.989 ± 0.002, RMSE = 0.290 ± 0.031 dB, MAE = 0.181 ± 0.031 dB. XGBoost achieved the highest repeated-split performance (R2 = 0.996 ± 0.001, RMSE = 0.167 ± 0.011 dB, MAE = 0.107 ± 0.004 dB), and RF was comparable but with slightly higher errors (R2 = 0.996 ± 0.001, RMSE = 0.185 ± 0.012 dB, MAE = 0.113 ± 0.004 dB).
Under the stricter road-name-based grouped holdout validation, predictive performance decreased for all models, indicating that ordinary random splitting benefited partly from within-road similarity. The nonlinear models nevertheless retained clear advantages over Linear Regression. Linear Regression yielded R2 = 0.876 ± 0.027, RMSE = 0.951 ± 0.079 dB, MAE = 0.778 ± 0.073 dB. SVR improved grouped-holdout performance to R2 = 0.922 ± 0.018, RMSE = 0.755 ± 0.091 dB, MAE = 0.549 ± 0.084 dB. XGBoost achieved the highest grouped-holdout accuracy (R2 = 0.953 ± 0.018, RMSE = 0.583 ± 0.119 dB, MAE = 0.418 ± 0.082 dB), and RF retained strong performance (R2 = 0.938 ± 0.041, RMSE = 0.662 ± 0.210 dB, MAE = 0.453 ± 0.128 dB).
Across all evaluation settings, the nonlinear models outperformed the linear baseline. XGBoost was the most accurate predictor; RF, with comparable accuracy, is used in the following sections for interpretation.
Because all four models used the same 12 morphological predictors, the performance comparison was conducted under the same predictor-dependence structure identified in
Section 3.1. The lower performance of Linear Regression should therefore be read with care: it may reflect both the limited ability of a purely linear model to capture nonlinear morphology–SPL relationships and the sensitivity of linear coefficient estimates to multicollinearity. Including SVR and XGBoost reduces the risk that RF was compared only with a weak linear baseline, while keeping the subsequent interpretation focused on RF for its balance between predictive performance and post hoc interpretability.
3.3. Reduced-Variable Model Comparison
A reduced-variable RF analysis examined whether comparable performance could be reached with a smaller subset of predictors. The results showed a clear hierarchy in predictor contribution (Supplementary Table S4). The model using Wvehicle alone achieved R2 = 0.724 ± 0.063, RMSE = 1.415 ± 0.113 dB, MAE = 1.107 ± 0.087 dB. The dominant vehicle-lane-width descriptor thus captured an important share of the morphology-related SPL variation but was not sufficient on its own.
Adding Wside substantially improved performance, raising R2 to 0.887 ± 0.043 and reducing RMSE and MAE to 0.898 ± 0.160 dB and 0.631 ± 0.091 dB, respectively. The two width-related variables together represented the primary width-configuration signal. Adding Cs(n) further improved the model (R2 = 0.908 ± 0.031, RMSE = 0.813 ± 0.141 dB), suggesting that local cross-sectional enclosure adds information beyond width configuration alone.
Adding Cp(n) and H(n) produced only marginal further gain, with R2 rising slightly from 0.908 ± 0.031 to 0.910 ± 0.032. The full 12-variable model achieved the best overall RF performance (R2 = 0.938 ± 0.041, RMSE = 0.662 ± 0.210 dB, MAE = 0.453 ± 0.128 dB). Cyclist-side SPL prediction was therefore width-dominated but not purely width-determined. The full variable set is retained as a complete morphology representation, rather than as evidence that all 12 variables contribute equally or independently.
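Values such as R2 = 0.908 ± 0.031 summarize a metric over repeated evaluations. The sketch below shows one way to compute R2, RMSE, and MAE for a single split and then to report a mean ± spread across splits. It assumes the spread is the sample standard deviation, which the text does not state explicitly, and all numbers in it are toy values rather than study results.

```python
import math
from statistics import mean, stdev


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, and MAE for one set of predictions (SPL in dB)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    y_bar = mean(y_true)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)    # total sum of squares
    n = len(y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }


def summarize(values):
    """Mean and sample standard deviation across repeated splits (assumed convention)."""
    return mean(values), stdev(values)


# Toy predictions for a single split.
y_true = [60.0, 62.0, 64.0, 66.0]
y_pred = [61.0, 62.0, 63.0, 66.0]
metrics = regression_metrics(y_true, y_pred)

# Toy R2 values from three hypothetical splits.
r2_mean, r2_sd = summarize([0.90, 0.92, 0.94])
print(metrics, f"R2 = {r2_mean:.3f} ± {r2_sd:.3f}")
```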
3.4. Relative Importance of Spatial Parameters
The reduced-variable analysis showed that width-related variables accounted for the main predictive signal, while the full 12-variable set still gave the best RF performance. To examine how the full RF model used the individual predictors, permutation importance was calculated for the 12-variable RF model (Figure 5).
Permutation importance was unevenly distributed across the 12 predictors. Because the correlation and VIF analyses indicated intercorrelations among several predictors, the importance values are read as the extent to which the trained RF model relied on each variable under the observed predictor-dependence structure, rather than as independent causal effects. Wvehicle ranked first (importance = 1.008 ± 0.0145), followed by Wside and Cs(n). The importance of Wvehicle was approximately four times that of Wside and approximately 30 times that of Cs(n). Wvehicle and Wside together accounted for 93.9% of the total permutation importance, indicating that width-related variables dominated RF prediction of cyclist-side SPL.
Although the importance value of Cs(n) did not exceed 0.05, it remained higher than that of the other non-width variables, suggesting that local cross-sectional enclosure provided secondary information to the RF model. Given the correlations among the zonal variables, this should be read as evidence of additional model reliance on receptor-proximal enclosure information rather than a fully independent effect of Cs(n) alone. For the zonal variables Cs, Cp, and H, permutation importance generally decreased from the near zone to the mid zone and then to the far zone. This distance-decay pattern indicates that morphological features closer to the receptor contributed more to SPL prediction than those farther away. Overall, basic width configuration dominated model performance, while enclosure-related variables played secondary but still interpretable roles, especially in zones closest to the receptor. Given the dominant role of width-related variables and the non-negligible contribution of selected enclosure-related parameters, the marginal effects of individual predictors were further examined using partial dependence analysis.
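Permutation importance, as used above, measures how much a trained model's error grows when one predictor column is shuffled. The following model-agnostic sketch illustrates the idea with the standard library only. The toy model, its coefficients, the feature ranges, and the use of mean squared error as the score are all assumptions made for illustration; they are not the study's RF model or its scoring choice.

```python
import random


def mse(model, X, y):
    """Mean squared error of a callable model over rows X and targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical stand-in for a trained model: [w_vehicle, w_side, cs_n, unused]."""
    w_vehicle, w_side, cs_n, _unused = row
    return 70.0 - 0.2 * w_vehicle - 0.05 * w_side + 0.5 * cs_n


data_rng = random.Random(1)
X = [[data_rng.uniform(5, 50), data_rng.uniform(2, 20),
      data_rng.uniform(0, 1), data_rng.uniform(0, 1)] for _ in range(200)]
y = [toy_model(row) for row in X]

importances = permutation_importance(toy_model, X, y)
shares = [imp / sum(importances) for imp in importances]
width_share = shares[0] + shares[1]  # combined share of the two width features
print([round(s, 4) for s in shares], round(width_share, 4))
```

Normalizing the importances to shares is how a statement such as "two variables account for most of the total importance" can be derived. With correlated predictors, these shares describe model reliance, not independent effects.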
3.5. Model-Based Average Marginal Response Patterns of Spatial Parameters
Partial dependence analysis showed that several spatial parameters exhibited threshold-like average response patterns in predicted SPLs (Figure 6). Among all predictors, the width-related variables showed the strongest and most continuous patterns within the trained model. The PDP for Wvehicle showed four stages. When Wvehicle was below approximately 13 m, increasing it had little effect on predicted SPL. As Wvehicle increased from 13 to 22 m, predicted SPL decreased markedly, by approximately 3.1 dB. From 22 to 50 m, predicted SPL continued to decrease more gradually, by approximately 2.5 dB. Beyond 50 m, additional change was negligible. A similar but weaker negative pattern appeared for Wside, whose influence was concentrated mainly in the lower-width range. The PDPs thus show that width-related parameters dominated the model and that their response patterns were most pronounced within specific value ranges rather than across the entire parameter domain.
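A one-dimensional partial dependence curve of the kind described above can be computed by fixing one predictor at each grid value for every sample, predicting, and averaging. The sketch below shows this with a hypothetical piecewise model whose breakpoints (13 m and 50 m) only loosely mimic a threshold-like response. The model, its slopes, and the sample values are assumptions for illustration and do not reproduce the study's RF curve.

```python
def partial_dependence_1d(model, X, feature, grid):
    """Average prediction when column `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        preds = [model(row[:feature] + [value] + row[feature + 1:]) for row in X]
        curve.append(sum(preds) / len(preds))
    return curve


def toy_model(row):
    """Hypothetical response: flat below 13 m, declining to 50 m, flat beyond."""
    w_vehicle, w_side = row
    w_clipped = min(max(w_vehicle, 13.0), 50.0)
    return 70.0 - 0.1 * (w_clipped - 13.0) - 0.05 * w_side


# Toy sample of [w_vehicle, w_side] rows (metres).
X = [[8.0, 3.0], [15.0, 5.0], [24.0, 8.0], [40.0, 12.0], [55.0, 6.0]]
grid = [5.0, 13.0, 22.0, 50.0, 60.0]
curve = partial_dependence_1d(toy_model, X, feature=0, grid=grid)

for w, spl in zip(grid, curve):
    print(f"w_vehicle = {w:5.1f} m -> mean predicted SPL = {spl:.2f} dB")
```

Because the curve averages over the observed values of the other predictors, it is a model-based average response and should not be read as a causal effect when predictors are correlated.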
Compared with the width-related variables, Cs(n) showed a smaller but still interpretable positive effect. SPL changed little when Cs(n) increased from 0 to approximately 0.22, then increased by about 0.65 dB as Cs(n) rose to approximately 0.46, after which the curve largely flattened. The binary front-façade indicator Bp also showed a distinct marginal effect: compared with Bp = 0, the presence of a façade directly in front of the receptor (Bp = 1) raised SPL by approximately 1.34 dB. Local enclosure and immediate front-façade presence can therefore elevate cyclist-side SPL, but their effects are clearly smaller than those of the width-related variables.
Additional one-dimensional PDPs for Cs(m), Cs(f), H(n), H(m), H(f), Cp(n), Cp(m), and Cp(f) are provided in Supplementary Figure S1. These curves show weaker or more threshold-dependent tendencies than those in Figure 6: the Cs curves generally weakened from the near zone to the far zone, the H curves remained relatively flat across most of their ranges, and the Cp curves became more influential only when plan enclosure approached near-complete continuity.
3.6. Exploratory Interaction Patterns
Given its dominant importance, Wvehicle was used as the conditioning variable in the exploratory interaction analysis. Figure 7 shows the two-dimensional partial dependence plots for Wvehicle in combination with Wside, Cs(n), and Cs(m).
Figure 7a confirms that both Wvehicle and Wside were negatively associated with predicted SPL, consistent with the one-dimensional PDP results. Their joint response pattern, however, was not uniform across the full parameter space. When Wvehicle was below approximately 15 m, the influence of Wside on predicted SPL was concentrated mainly within the range of 5–10 m. When Wvehicle exceeded 15 m, the effective range of Wside broadened, and its influence was concentrated mainly within 5–15 m. The marginal benefit of increasing the sidewalk width thus varied with the vehicle-lane width rather than remaining constant across all street sections.
Figure 7b,c show that the average response patterns of Cs(n) and Cs(m) were also conditioned by Wvehicle. When Wvehicle exceeded approximately 20 m, the marginal patterns of Cs(n) and Cs(m) became broadly similar: in both cases, predicted SPL increased gradually as enclosure rose from 0 to 0.5 and changed little as enclosure rose from 0.5 to 0.8. When Wvehicle was below 20 m, however, the response pattern of Cs(n) was clearly stronger than that of Cs(m). These exploratory interaction patterns suggest that enclosure-related variables matter more under relatively narrow vehicle-lane conditions and that their marginal influence weakens once the primary width condition has improved.
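The two-dimensional partial dependence surfaces discussed above extend the one-dimensional case by fixing two predictors at once and averaging predictions over the remaining ones. The sketch below uses a hypothetical model in which the enclosure effect is stronger when the vehicle-lane width is below 20 m. The model, its coefficients, and the grid values are illustrative assumptions only.

```python
def partial_dependence_2d(model, X, feat_a, grid_a, feat_b, grid_b):
    """Average prediction over X with two features forced to each grid pair."""
    surface = []
    for a in grid_a:
        surface_row = []
        for b in grid_b:
            total = 0.0
            for row in X:
                modified = list(row)   # copy so the original sample is untouched
                modified[feat_a] = a
                modified[feat_b] = b
                total += model(modified)
            surface_row.append(total / len(X))
        surface.append(surface_row)
    return surface


def toy_model(row):
    """Hypothetical interaction: enclosure matters more in narrow streets."""
    w_vehicle, cs_n, w_side = row
    enclosure_gain = 1.5 if w_vehicle < 20.0 else 0.5
    return 70.0 - 0.1 * w_vehicle + enclosure_gain * cs_n - 0.05 * w_side


# Toy sample of [w_vehicle, cs_n, w_side] rows.
X = [[12.0, 0.2, 4.0], [18.0, 0.5, 6.0], [30.0, 0.1, 10.0], [45.0, 0.7, 8.0]]
surface = partial_dependence_2d(toy_model, X, 0, [10.0, 30.0], 1, [0.0, 0.5, 0.8])

narrow_effect = surface[0][-1] - surface[0][0]  # enclosure effect at w_vehicle = 10 m
wide_effect = surface[1][-1] - surface[1][0]    # enclosure effect at w_vehicle = 30 m
print(round(narrow_effect, 3), round(wide_effect, 3))
```

Comparing slices of the surface in this way is one simple means of checking whether the response to one variable depends on the level of another.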
3.7. Computational Efficiency
The computation-time comparison further supports the use of the trained machine-learning models as rapid morphology-screening tools. Across the 93 representative ODEON simulation records, the median simulation time for one street section was 2 min 33 s, with an interquartile range from 2 min 16 s to 3 min 22 s. The mean simulation time was 3 min 26 s, and the maximum recorded time was 19 min 38 s. Using the median ODEON time as the reference, simulating all 5060 street sections would take approximately 215.05 h; using the mean time, this would take approximately 289.56 h.
The trained machine-learning models required substantially less post-training prediction time. XGBoost required 23.24 s for training and 0.013 s to predict SPL values for all 5060 sections. RF required 195.08 s for training and 0.143 s to predict all 5060 sections. The corresponding batch prediction time per section was approximately 0.003 ms for XGBoost and 0.028 ms for RF. Relative to the median ODEON simulation time, this corresponds to a post-training batch-prediction speed-up of approximately 5.96 × 10^7 for XGBoost and 5.42 × 10^6 for RF.
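The arithmetic behind these figures can be retraced from the rounded timings quoted in the text, as in the short sketch below. Variable names are illustrative. Because the inputs are rounded, the recomputed values can differ slightly in the last digit from the reported ones (for example, the RF speed-up comes out near 5.41 × 10^6 here).

```python
N_SECTIONS = 5060

# Median ODEON time per street section: 2 min 33 s.
median_odeon_s = 2 * 60 + 33
total_hours_median = N_SECTIONS * median_odeon_s / 3600

# Batch prediction times for all sections (seconds), as quoted in the text.
xgb_batch_s = 0.013
rf_batch_s = 0.143

# Per-section prediction time in milliseconds.
xgb_per_section_ms = xgb_batch_s / N_SECTIONS * 1000
rf_per_section_ms = rf_batch_s / N_SECTIONS * 1000

# Speed-up relative to the median ODEON time per section.
speedup_xgb = median_odeon_s / (xgb_batch_s / N_SECTIONS)
speedup_rf = median_odeon_s / (rf_batch_s / N_SECTIONS)

print(f"ODEON total (median basis): {total_hours_median:.2f} h")
print(f"Per-section prediction: XGBoost {xgb_per_section_ms:.3f} ms, RF {rf_per_section_ms:.3f} ms")
print(f"Speed-up: XGBoost {speedup_xgb:.2e}, RF {speedup_rf:.2e}")
```

The speed-up counts prediction time only. Training time and the ODEON runs needed to build the training set are excluded.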
These results do not imply that the machine-learning models can replace ODEON simulation or field measurement. Rather, they show that, once an ODEON-generated training dataset is available, the trained models can rapidly approximate ODEON-simulated SPL for additional morphology-based screening cases.
4. Discussion
4.1. Principal Findings in Relation to Previous Studies
Three principal findings emerged from the analysis. First, nonlinear machine-learning models predicted the cyclist-side SPL more accurately than Linear Regression under both random-split and road-name-based grouped holdout validation. Second, the reduced-variable analysis indicated that the morphology–SPL relationship was width-dominated but not purely width-determined: Wvehicle and Wside captured the primary width-configuration signal, while adding Cs(n) and the full 12-variable morphology set further improved RF performance. Third, enclosure-related variables contributed secondary and condition-dependent information, especially for receptor-proximal zones and narrower vehicle-lane-width conditions. Together, these findings suggest that near-road SPL prediction from street morphology is governed primarily by street-width configuration, while façade-related descriptors add more limited and context-dependent information.
The reduced-variable analysis also clarifies how the 12-variable input set should be read. The data do not support a view in which all 12 variables contribute equally or independently. The RF model was primarily driven by width configuration, especially Wvehicle and Wside; yet the two-variable width model did not fully reproduce the performance of the full morphology model. Cs(n) added information on local cross-sectional enclosure, and the full 12-variable set produced a further small improvement under grouped holdout validation. The full variable set is therefore best understood as a complete morphology representation for comparative screening, not as evidence that every individual parameter is indispensable.
The dominant role of width-related variables is broadly consistent with earlier studies showing that street width and canyon configuration substantially influence reflection, attenuation, and local traffic-noise exposure in urban streets [18,19,20,43]. Many of those studies, however, focused either on block-scale distribution patterns or on source-oriented prediction [4,5,10,11]. The present analysis extends this line of work to receptor-specific prediction at bicycle-lane locations under fixed source conditions. The contribution is not to restate that street width matters, but to show that, for the cyclist-side SPL close to the carriageway, width-related descriptors outweigh other morphology variables in predictive performance.
A second result of note is the distance-decay pattern of morphology-related variables and the non-uniform role of enclosure-related variables such as Cs and Cp. The stronger contribution of near-zone variables agrees with previous work showing that receptor-proximal street geometry is particularly influential for local SPL prediction [19,31]. At the same time, the limited but non-negligible role of Cs, together with the threshold-like behavior of Cp in Supplementary Figure S1, shows that enclosure-related parameters are not universally dominant predictors. Their contribution depends on both receptor proximity and the primary width condition of the street, refining earlier morphology-oriented discussions by showing that enclosure effects are configuration-dependent and sensitive to receptor location [19,20].
4.2. Mechanistic Interpretation of Morphology–Noise Relationships
The dominant role of Wvehicle can be explained by its direct control over the propagation distance between the traffic source and the bicycle-lane receptor. Under fixed source-power conditions, increasing the effective source–receptor separation lengthens the propagation path of the direct sound component before it reaches the receptor, and so reduces local SPL more effectively than many other geometric adjustments. Wside, by contrast, mainly affects the distance associated with façade-related reflections rather than the direct source–receptor path. This helps explain why both width-related variables showed negative marginal effects but Wvehicle was markedly more influential than Wside in both the permutation-importance ranking and the partial dependence analysis [18,19,20,34,43]. The threshold-like behavior in the PDPs further indicates that the acoustic benefit of widening is strongest within relatively constrained ranges and then diminishes.
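As a rough point of reference for the distance argument above, idealized free-field geometric spreading gives a level drop of 10·log10(d2/d1) dB for a line source and 20·log10(d2/d1) dB for a point source when the distance grows from d1 to d2. The sketch below evaluates these textbook relations. It is not the ODEON propagation model used in the study, it ignores reflections, ground effects, and air absorption, and the distances are arbitrary examples.

```python
import math


def spreading_attenuation_db(d_near, d_far, source="line"):
    """Level drop in dB from d_near to d_far under ideal free-field spreading."""
    if d_near <= 0 or d_far <= 0:
        raise ValueError("distances must be positive")
    # Cylindrical spreading (line source) vs. spherical spreading (point source).
    factor = {"line": 10.0, "point": 20.0}[source]
    return factor * math.log10(d_far / d_near)


# Doubling the source-receptor distance (example values in metres).
line_drop = spreading_attenuation_db(10.0, 20.0, source="line")
point_drop = spreading_attenuation_db(10.0, 20.0, source="point")
print(f"line source: {line_drop:.2f} dB, point source: {point_drop:.2f} dB")
```

Because the drop depends on the ratio of distances, each additional metre of separation yields less reduction at larger distances. This is qualitatively in line with the diminishing benefit of widening noted above, although street-canyon reflections will modify the actual response.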
The stronger effects of near-zone parameters are also physically plausible. Street geometry close to the receptor is more likely to shape the local balance between direct sound and early façade reflections, whereas features farther away contribute less to SPL at a fixed receptor point. This reading is consistent both with the geometric-reflection principle adopted in the present parameter system and with earlier morphology-oriented studies showing that receptor-proximal street form is especially relevant to local sound propagation [19,20,34,36,43]. The progressive decline in correlation strength and permutation importance from the near zone to the far zone therefore supports a distance-decay interpretation.
The PDPs in Figure 6 and Supplementary Figure S1 also clarify the differing roles of Cs, H, and Bp. Compared with mean façade height, cross-sectional enclosure provides a more integrated description of how strongly the receptor is surrounded by reflective boundaries within the street section, which likely explains why Cs showed clearer and more consistent marginal effects than H, whose influence flattened above a moderate height range. The non-negligible effect of Bp suggests in turn that the immediate presence of a façade directly in front of the receptor alters the local reflective condition more than the average height of façades distributed along the street segment [31,32,36]. For SPL prediction close to the carriageway, in other words, local reflective geometry matters more than height alone.
The behavior of Cp in Supplementary Figure S1, taken together with the interaction results, indicates that enclosure effects are strongly configuration-dependent. Across much of its range, Cp exerted only limited marginal influence, yet its effect increased noticeably as façade continuity approached complete closure. Plan enclosure is therefore unlikely to act as a uniformly strong predictor unless the street boundary becomes nearly continuous and the reflective environment correspondingly more uniform. The two-dimensional PDPs further show that the effects of Cs(n) and Cs(m) were more pronounced under relatively narrow Wvehicle conditions and weakened once the primary width condition had improved. Street width and enclosure should therefore be read together rather than as independent controls [19,20,36,43].
4.3. Planning Implications and Design Relevance
From a planning and design standpoint, width-related variables should be prioritized when comparing the morphology-related cyclist-side SPL under fixed source conditions [43]. The implication that follows from the dominance of Wvehicle and Wside concerns geometric separation, not motor-vehicle lane width as such: widening motor-vehicle lanes in real streets may induce higher traffic volumes or vehicle speeds, which could offset or even reverse the acoustic benefit predicted under fixed source-power conditions. The morphology-based strategy that follows most directly from the present results is therefore to increase the effective source-to-receptor separation, or to increase the separation between the bicycle lane and adjacent reflective boundaries, particularly within the parameter ranges where the PDPs showed the greatest SPL sensitivity [18,19,20,43].
The results also suggest that enclosure-related interventions matter more in relatively narrow streets. Where vehicle-lane width is constrained, reducing local cross-sectional enclosure may provide additional SPL reduction, whereas the benefit of enclosure-related adjustment becomes smaller once the primary width condition has improved. The behavior of Cp in Supplementary Figure S1 likewise suggests that introducing a small interruption in an otherwise continuous street façade may help reduce SPL, while further increasing the size or number of building gaps is unlikely to produce proportionally larger benefits. In design terms, increasing effective separation is the primary morphology-screening criterion, while enclosure modification is best treated as a secondary and context-dependent refinement, particularly for narrow or strongly bounded street sections [19,20,43].
4.4. Interpretation Boundary and Methodological Implications
The framework treats a morphology-based simulation chain as a receptor-specific prediction task for cyclist-side SPLs. Its primary value lies in comparative screening: training on a single-city, fixed-source-power dataset constrains the scope to morphology comparison, not regulatory prediction. The computation-time comparison clarifies this role. ODEON simulation remains necessary for generating physically based acoustic outputs and for building the training dataset, but the trained machine-learning models can approximate the ODEON-simulated morphology–SPL mapping with negligible post-training prediction time. The framework is therefore most useful for early-stage comparison of alternative street-morphology configurations, where many design cases may need to be screened rapidly before detailed acoustic simulation or field validation is undertaken.
Including SVR and XGBoost reduces the concern that RF was compared only with a linear baseline. All four models were nevertheless trained on the same 12 morphological predictors, and the correlation and VIF analyses showed that this predictor set was not fully independent. The four-model comparison should accordingly be read as a predictive-performance comparison under the same correlated morphology-variable structure, not as a basis for inferring independent causal effects of individual predictors.
The same caution applies in different forms to the linear and the tree-based models. For the Linear Regression baseline, multicollinearity directly affects coefficient estimates: the linear model remains useful as a transparent predictive reference, but its coefficients should not be read as independent effects of individual morphological variables, and the relatively lower performance of Linear Regression may reflect both the nonlinear nature of the morphology–SPL relationship and the instability of coefficient estimates under correlated predictors. For tree-based models such as RF and XGBoost, correlated predictors do not generally prevent accurate prediction, but they can influence feature-importance rankings, since importance may be shared or redistributed among correlated variables. The RF permutation-importance results reported here are therefore read as model reliance under the observed predictor-dependence structure rather than as estimates of independent causal contribution.
The same logic applies to the two width-related predictors and to the zonal variables. Wvehicle and Wside were only moderately correlated, so their combined dominance reflects a width-configuration-dominated morphology–SPL relationship rather than an artefact of duplicated predictors: Wvehicle mainly captures source–receptor distance, while Wside captures receptor–façade reflection distance. The zonal H, Cs, and Cp variables are correlated because adjacent zones belong to the same street section; they were retained because they encode distance-specific geometric information and allow the model to test whether receptor-proximal morphology contributes more strongly than morphology farther from the receptor. Their importance pattern therefore reads as distance-sensitive model reliance rather than as fully isolated zone-specific causal effects.
5. Conclusions
Drawing on 5060 ODEON-simulated street sections from 195 streets in Harbin and 12 morphology-related input variables, we compared Linear Regression, SVR, XGBoost, and RF under an illustrative random split, repeated random-split evaluation, and road-name-based grouped holdout validation to predict the cyclist-side simulated SPL under fixed source-power conditions. The nonlinear models substantially outperformed the linear baseline. XGBoost achieved the highest predictive accuracy among the tested models, while RF reached comparable accuracy and was retained for the interpretation of feature importance and marginal response patterns.
Within this simulation framework, street morphology alone accounts for most of the variation in the cyclist-side SPL, with street-width configuration as the dominant predictor. Wvehicle and Wside were the two most influential variables, together accounting for 93.9% of the total permutation importance. Morphological parameters closer to the receptor exerted stronger effects than those in the mid and far zones, underscoring the role of receptor-proximal street geometry in near-road sound propagation. Most predictors showed threshold-like marginal effects, with their acoustic influence weakening once certain values were reached.
Among the enclosure-related variables, Cs remained relevant, especially under relatively narrow Wvehicle conditions, while the effect of Cp was limited across much of its range and became more pronounced only as façade continuity approached complete closure. Width-related parameters should therefore be prioritized in morphology-based screening and design comparison under fixed source conditions, with enclosure-related parameters treated as secondary and configuration-dependent.
Overall, the framework offers a fast, morphology-sensitive screening tool for early-stage comparison of cyclist-side simulated SPL during street planning and design. Its practical value lies in screening morphology alternatives efficiently once an ODEON-generated training dataset is available; it does not replace detailed acoustic simulation, field measurement, or regulatory traffic-noise assessment.
6. Limitations and Future Work
The study has several limitations.
The modeling dataset was derived from simulated street samples in a single city, which may limit transferability to other urban contexts with different street morphologies and building configurations. The acoustic simulations were also performed under a fixed source-power condition, so the model captures the relative effect of street morphology on cyclist-side SPL under controlled conditions rather than the full variability of real traffic noise under changing traffic flow, vehicle composition, and speed. The computational efficiency reported here should accordingly be understood as the efficiency of approximating ODEON-simulated SPL after model training, not as evidence that detailed acoustic simulation or field measurement can be omitted in the final assessment.
Several potentially relevant factors—including meteorological conditions, façade material diversity, and temporal variation in traffic states—were not explicitly incorporated. Although robustness was examined using both repeated random-split evaluation and road-name-based grouped holdout validation, the model was evaluated only within the present single-city simulation framework and was not tested against external measurements, independent datasets, or street samples from other cities.
Several morphological predictors were also correlated, especially the zonal enclosure and height variables. This does not invalidate the predictive comparison among Linear Regression, SVR, XGBoost, and RF, but it limits the interpretation of individual coefficients and feature-importance rankings. The reported RF importance values and PDP-based response patterns should therefore be read as model-based summaries under the observed predictor-dependence structure rather than as independent causal effects.
Future work should validate the framework with field measurements across multiple cities and a wider range of street types. Incorporating dynamic traffic-source descriptors, such as traffic volume, vehicle speed, and heavy-vehicle proportion, would also improve applicability under real operating conditions. Further extensions could integrate façade acoustic properties, meteorological effects, a broader set of urban morphological indicators, and stronger external validation strategies to improve generalizability and practical relevance. Categorical noise-level prediction, for example using low-, medium-, and high-noise classes based on established acoustic thresholds, is a further direction that may reduce sensitivity to absolute simulation errors and improve interpretability for planning use.