Road traffic safety is central to socially resilient and sustainable cities, yet many middle-income countries lack harmonized subnational data on exposure, infrastructure, and enforcement. This study examines whether routinely available demographic composition can serve as a practical structural baseline for provincial traffic accident rates and as a diagnostic layer for richer safety models. Using official province–year data from Türkiye (2008–2019 and 2022–2024; n = 1215), demographic shares by sex, education, and age were treated as compositional inputs and transformed using isometric log-ratio (ILR) methods, with GDP per person included as a scalar covariate. A Tabular Residual Network (ResNet) was trained on the historical panel and evaluated on a post-period calibration/evaluation window (2022–2024), which was used for checkpoint selection and seed screening rather than as an independent held-out test set. Among the evaluated specifications, the ResNet seed-ensemble achieved the strongest performance on the 2022–2024 calibration/evaluation period (R² = 0.5717), outperforming the best single-seed model (R² = 0.5539), a province-specific last-value-carried-forward temporal heuristic based on 2019 values (R² = 0.4779), tree-based tabular benchmarks (Random Forest: R² = 0.1328; XGBoost: R² = 0.0706), and pooled statistical reference models (linear: R² = 0.1375; negative binomial: R² = 0.0686; Poisson: R² = −0.0634). Year-wise diagnostics indicated gradual temporal drift, suggesting that periodic recalibration or the inclusion of additional policy-relevant covariates is needed to preserve calibration. Overall, ILR-based compositional geodemography provides a scalable and interpretable baseline for traffic safety monitoring and prioritization in data-constrained settings.
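
To make the ILR step concrete, the following is a minimal sketch of how a vector of demographic shares can be mapped to unconstrained coordinates and back. It is an illustration under stated assumptions, not the study's implementation: the abstract does not say which ILR basis was used, so pivot (Helmert-type) coordinates are assumed here, and the function names and the example age shares are hypothetical. Zero shares, which need separate treatment before taking logarithms, are not handled.

```python
import math


def closure(parts):
    """Rescale strictly positive parts so they sum to 1."""
    if any(p <= 0 for p in parts):
        raise ValueError("ILR requires strictly positive parts")
    total = sum(parts)
    return [p / total for p in parts]


def ilr(parts):
    """Map a D-part composition to D-1 pivot ILR coordinates.

    Coordinate i contrasts part i with the geometric mean of the
    parts that follow it (assumed basis; others are equally valid).
    """
    logs = [math.log(v) for v in closure(parts)]
    coords = []
    for i in range(len(logs) - 1):
        rest = logs[i + 1:]
        k = len(rest)
        coords.append(math.sqrt(k / (k + 1)) * (logs[i] - sum(rest) / k))
    return coords


def ilr_inverse(coords):
    """Recover the composition (shares summing to 1) from pivot coordinates."""
    d = len(coords) + 1
    clr = [0.0] * d  # centred log-ratio representation
    for j, z in enumerate(coords):
        k = d - 1 - j
        clr[j] += z * math.sqrt(k / (k + 1))
        for m in range(j + 1, d):
            clr[m] -= z / math.sqrt(k * (k + 1))
    return closure([math.exp(c) for c in clr])


# Hypothetical age shares for one province-year (young, working-age, older).
age_shares = [0.24, 0.67, 0.09]
age_coords = ilr(age_shares)        # two unconstrained model inputs
recovered = ilr_inverse(age_coords)  # back to shares summing to 1
```

Because a D-part composition carries only D − 1 degrees of freedom, each demographic block (sex, education, age) contributes one fewer coordinate than it has categories, and the resulting coordinates can be passed to a regression model alongside a scalar covariate such as GDP per person.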