4. Limitations
The present study defines a sensor fusion and signal-processing framework evaluated through simulation-based sensitivity analysis. Empirical validation is defined in the roadmap (
Table 9) and will be addressed in subsequent phases of the research programme. Specifically, Phase 2 will produce the first experimental performance data from instrumented crash-dummy trials, and Phase 3 will characterise acoustic domain shift under real-world recording conditions.
The CNN–BiLSTM classifier is a design specification grounded in the temporal structure of kinematic signals [
15,
16]; no accuracy, AUC, or F1 scores are reported or implied before Phase 1 FEA training. The classifier is explicitly deferred to future work. Training, hyperparameter optimisation, train/validation/test splitting, and full evaluation (confusion matrix, AUC, ECE, external validation) are defined deliverables of Phase 1 (
Table 9) and will be reported in a dedicated subsequent publication. The Viano hepatic stress formula [
19], whilst physically grounded in continuum mechanics, was validated under controlled cadaveric impact conditions; its accuracy across the full range of in situ collision geometries has not been independently established. All kinematic signatures in
Table S4 (Supplementary Materials) are proposals derived from first principles and the validated sensor results of Barua and Hitosugi [
5]; they are hypothesis-generating rather than clinically validated criteria. The sensor platform of Barua and Hitosugi [
5] is a prospective detection system; retrospective reconstruction from existing CCTV footage requires adequate video resolution and frame rate, which cannot be guaranteed across all case types. The Kalman filter’s optimality guarantee holds under Gaussian noise and linear kinematics; non-linear pedestrian motions and missing landmark detections may introduce sub-optimality that the Monte Carlo propagation only partially accounts for.
Temporal synchronisation between the acoustic and kinematic channels in retrospective forensic scenarios requires bounding of the clock offset as follows:
where
c is the speed of sound,
d_sensor is the distance to the recording device, and
δ_drift represents the uncalibrated clock offset. Bounding these variables constitutes a key operational challenge for retrospective application. A practical retrospective synchronisation procedure proceeds in three steps: (i) identify a common impulse event detectable in both the scene recording and the IMU channel (e.g., the acoustic impact transient at t = 0); (ii) cross-correlate the IMU-derived acceleration envelope with the audio energy envelope to estimate the offset τ with sub-frame precision, utilising the structural vibration transient as a common physical ‘time-mark’ visible in both sensors; (iii) apply τ as a shift to all acoustic timestamps and propagate its residual uncertainty (±δ_drift) through Equation (12). For GPS-timestamped CCTV, δ_drift ≤ 10 ms; for unsynchronised dashcam recordings, δ_drift ≤ 50 ms. This procedure will be formally validated and its achievable precision characterised in Phase 3 of the validation programme.
The throw-distance coefficients in
Table 2 are author-derived by fitting
d = k · v^{1.5} to published empirical data. Formal residual analysis, confidence intervals, and independent FEA re-estimation remain reserved for Phase 1. The deterministic values (k = 0.041, 0.033, 0.018) are justified as a Phase 1 parameterisation because the sensitivity analysis (
Section 3.4) demonstrates that velocity reconstruction error is dominated by vehicle-class misassignment, not within-class coefficient precision, and because the values reproduce the mean empirical trajectories of [
1,
2] within the scatter of the original published data. No forensic deployment is proposed until Phase 5 validation is complete and a known error rate is established. No casework use, clinical interpretation, or legal admissibility assessment is supported or permitted by this manuscript until Phase 5 of the validation roadmap has been fully completed and a multi-site known error rate has been formally established.
6. Discussion
6.1. System Performance and Distinguishing Contributions
Three specific contributions distinguish this framework from existing approaches to collision kinematics reconstruction. First, the vehicle-class-parameterised throw distance model (
Table 2, Equation (10)) directly addresses the gap identified by Simms and Wood [
1]. A class-agnostic model produces systematic velocity reconstruction errors of practical consequence—particularly for cab-over trucks, which produce the shortest throw distances relative to impact speed and the most contested reconstruction scenarios. Second, the end-to-end Monte Carlo uncertainty propagation from Kalman posterior covariance through biomechanical signal transformation parameter variability to final reconstruction output provides a complete, stage-resolved uncertainty budget, which, to the authors’ knowledge, is the first such budget proposed for this reconstruction problem. Third, the acoustic spectral feature extraction channel—adapted from the validated MFCC-CNN architecture of Rezaul et al. [
6]—enables bimodal sensor fusion, providing a temporal physiological state dimension from existing scene recordings without requiring new data collection infrastructure.
The sensitivity analysis indicates that systematic parameter errors dominate reconstruction uncertainty, supporting the architectural emphasis on parameterised modelling and noise-controlled differentiation. The HIC noise-sensitivity results (
Table 8) confirm that Savitzky–Golay pre-filtering is theoretically justified for high-frequency noise suppression as a design requirement, not merely a smoothing convenience.
6.2. Relationship to Existing Sensor Fusion and Reconstruction Methods
The present framework complements, rather than replaces, established reconstruction tools. Its practical value is greatest when scene recordings of sufficient quality are available and when specific reconstructive questions—velocity estimation with uncertainty, pre-impact posture classification, and acoustic physiological state detection—are required. The framework is most effective in two specific operational contexts: cases where the absence of a cooperative witness makes objective kinematic reconstruction from scene evidence the only available basis for establishing vehicle speed and pedestrian posture at impact; and cases in which specific aspects of the collision sequence are contested, requiring physics-grounded, auditable outputs with explicit uncertainty bounds.
Recent work by Mohamed and Ahmed [
7,
8,
9] employs computer-vision-based pedestrian detection [
7], multi-camera trajectory reconstruction [
8], and vehicle–pedestrian conflict analysis at signalised intersections [
9] for proactive pedestrian safety assessment. These systems use surveillance cameras, CNN-based pose estimation, multi-camera field-of-view integration, and trajectory transformation to detect conflicts before collision occurs, optimising for detection latency and false-positive rate under deterministic point-estimate outputs. The present framework addresses the inverse temporal problem, which involves retrospective forensic reconstruction from post-impact scene observables. This distinction is methodological, not merely operational. Prospective systems require real-time inference with bounded latency; retrospective reconstruction must optimise for uncertainty quantification and physical interpretability under incomplete, noisy data. The two approaches are therefore complementary rather than competing, serving distinct phases of the traffic-safety lifecycle (pre-collision warning vs. post-collision adjudication), as comparatively summarised in
Table 11.
6.3. Emerging Opportunities and Key Challenges
The present research opens several substantive opportunities for the broader fields of forensic biomechanics, traffic-safety engineering, and legal medicine. Three are particularly salient. First, the vehicle-class-parameterised uncertainty propagation architecture established here provides a replicable template for embedding auditable uncertainty budgets into forensic reconstruction workflows, an approach currently absent from standard practice. As evidence standards in traffic-accident litigation increasingly require explicit error characterisation, a physics-grounded framework that reports velocity estimates with stage-resolved confidence intervals rather than point values represents a timely and practically consequential contribution. Second, the bimodal integration of kinematic and acoustic channels introduces a new dimension of forensic evidence—recoverable from infrastructure already present at most collision scenes (dashcam recordings, roadside CCTV)—without requiring bespoke hardware deployment. This opens a pathway to retrospective physiological state classification in unwitnessed fatality cases where the distinction between sudden collapse and alert pedestrian posture carries direct legal significance, including hit-and-run adjudication and disputed liability scenarios. Third, the modular five-phase validation roadmap, structured to accumulate evidence from FEA simulation through independent multi-site replication, offers a transferable model for phased validation of sensor-fusion frameworks in safety-critical domains more broadly. This structure is itself a methodological contribution that may inform validation design in adjacent fields, such as proactive safety monitoring and clinical biomechanical validation.
Against these opportunities, several interconnected challenges must be acknowledged. The most immediate is the domain-shift problem in acoustic classification, in which MFCC-CNN architectures validated on controlled audio corpora (clean recordings, consistent sampling conditions) may exhibit significant performance degradation when applied to CCTV or dashcam recordings subject to lossy codec compression, ambient noise, and distance attenuation. Phase 3 of the validation roadmap is specifically designed to characterise this degradation, but until those experiments are complete, acoustic classification accuracy in real forensic contexts remains an open empirical question. A related challenge concerns retrospective temporal synchronisation, in which aligning the acoustic and kinematic channels from independently operated recording devices introduces clock-offset uncertainty (δ_drift ≤ 50 ms for unsynchronised dashcam recordings, as noted in
Section 4) that must be formally bounded before bimodal outputs can be treated as jointly reliable. Second, the Kalman filter’s theoretical optimality is contingent on Gaussian noise and linear kinematic assumptions that may not hold for non-linear pedestrian motions, missing landmark detections, or partial scene occlusion—conditions that are common in real collision footage. The Monte Carlo uncertainty propagation partially accounts for this, but the residual gap between simulated and field performance remains to be characterised experimentally. Third, scaling the vehicle-class parameterisation beyond the three classes defined here (sedan, SUV, cab-over) to encompass motorcycles, e-scooters, heavy goods vehicles, and mixed-traffic scenarios will require additional empirical data collection and coefficient re-estimation. The current Phase 1 k-coefficient confidence intervals (
Table 2) are sufficiently tight for proof-of-concept purposes but will need narrowing through Phase 1 FEA calibration before they are suitable for casework. Finally, translating the pathway from a validated reconstruction framework into legally admissible forensic evidence involves regulatory, institutional, and ethical dimensions that lie beyond the scope of the technical validation roadmap, including professional accreditation, standardised reporting templates, jurisdictional admissibility requirements, and known error rate thresholds defined by the receiving legal system. These challenges are not unique to the present framework but are sharpened by the quantitative, probabilistic nature of its outputs, which will require clear expert-witness communication guidelines to be developed in parallel with the technical validation programme. Critically, the five-phase roadmap is itself a structural response to this admissibility challenge; by requiring peer-reviewed validation, multi-site replication, and formally established error rates before any forensic deployment, the framework is designed for admissibility from the ground up rather than having legal requirements retrofitted to an already-finalised technical system.
7. Conclusions
This paper presented the Phase 1 design of a physics-grounded multi-modal sensor fusion framework for the forensic kinematic reconstruction of pedestrian–vehicle collisions. The framework is designed to produce three outputs that collectively address key reconstructive questions: a calibrated impact velocity estimate with a propagated 95% confidence interval [
1,
4]; a kinematic pre-impact posture classification intended to distinguish an upright pedestrian from a lying victim from a medically compromised pedestrian [
4]; and a conceptual acoustic physiological state probability, recoverable from CCTV or dashcam audio once Phase 3 validation is complete [
6]. The central aim—to reframe forensic collision reconstruction as a formal state-estimation problem and provide a principled, uncertainty-aware answer to the question of what happened in the seconds surrounding an impact—motivated each architectural decision in the framework. This connection between aim and architecture is worth making explicit—the Kalman filter was selected because the reconstruction problem is precisely a state-estimation problem under Gaussian noise [
11]; Savitzky–Golay differentiation was selected because the noise-sensitivity of downstream biomechanical power-law transforms (quantified at a greater-than-quarter increase in HIC
15 per 10% noise perturbation;
Table 8) demands noise-optimal pre-filtering as a design requirement, not a cosmetic choice; vehicle-class parameterisation was incorporated because a class-agnostic model produces systematic velocity reconstruction errors (+36.1 km/h for a cab-over scenario) large enough to alter forensic conclusions entirely [
1]; and the acoustic channel was added because pre-impact physiological state—the distinction between a pedestrian who was alert, medically compromised, or already fallen—is recoverable from infrastructure already present at the scene but entirely absent from the post-mortem and scene-measurement evidence on which current reconstruction practice depends.
The originality of this contribution lies in the integration of three capabilities that have not previously been combined in the forensic biomechanics literature: (i) end-to-end Monte Carlo uncertainty propagation from Kalman posterior covariance through stage-resolved biomechanical signal transformations to a final reconstruction output with explicit 95% confidence bounds; (ii) vehicle-class-parameterised throw-distance inversion that directly addresses the reconstruction gap identified by Simms and Wood [
1] and quantifies the forensic cost of class-agnostic modelling; and (iii) a bimodal kinematic–acoustic architecture that extends the validated sensor platform of Barua and Hitosugi [
5] into the retrospective reconstruction domain, introducing physiological state as a temporally resolved forensic evidence dimension. No prior framework in this domain has simultaneously addressed all three. The sensitivity analysis (
Section 3.4) demonstrates that the framework produces bounded, interpretable uncertainty under realistic input variation—a property that existing expert-pattern-recognition approaches to collision reconstruction do not formally provide.
The added value of this work to knowledge and practice is threefold. In terms of knowledge, the stage-wise uncertainty budget (
Table S2) provides, to the authors’ knowledge, the first formally decomposed uncertainty accounting for a pedestrian collision reconstruction pipeline—separating sensor noise contributions (Stage 1, 33.7% of total variance) from biomechanical parameter variability (Stage 2, 64.9%) in a way that reveals where investment in measurement precision will yield the greatest reduction in reconstruction uncertainty. In terms of methodology, the physics-constrained feature representation (32-feature vector,
Table S3, structured by signal-processing origin rather than statistical convenience) enables physically interpretable SHAP attributions in downstream classification—an advance over black-box feature selection that is directly relevant to any evidentiary context requiring an auditable, independently evaluable basis for expert opinion. In terms of practice, the five-phase validation roadmap (
Table 9) provides a structured, evidence-accumulation programme that carries the framework from proof-of-concept simulation to forensic-deployment readiness through independent multi-site replication—ensuring that no operational forensic use is proposed before a known error rate has been formally established. This roadmap directly addresses the practical considerations that underpin forensic admissibility—reproducibility, known error rate, and peer-reviewed validation—as required by standards equivalent to those codified in Daubert and analogous jurisdictional frameworks. The practical implication for forensic investigators and legal professionals is that this framework, once Phase 5 is complete, would offer a physics-justified, auditable, and probabilistically calibrated reconstruction report—a paradigmatic advance over the current practice of presenting velocity estimates as point values without formal uncertainty characterisation.
The broader impact of this research, contingent on completion of the validation roadmap, extends across three communities. For forensic pathologists and accident reconstruction specialists, the framework offers a systematic, physics-grounded basis for addressing specific contested reconstruction questions—disputed vehicle speed, pedestrian posture at impact, and physiological state—with explicitly bounded uncertainty rather than expert intuition alone. For traffic-safety engineers and policymakers, the vehicle-class sensitivity analysis (
Section 3.4) and the bimodal architecture provide a diagnostic basis for understanding how vehicle-front geometry and sensing infrastructure interact to determine reconstruction reliability, informing both vehicle design standards and roadside infrastructure policy. For the sensor fusion and signal-processing community, the framework demonstrates how physics-justified architectural constraints—vehicle-class coefficients, cadaveric biomechanical thresholds, and noise-optimal differentiation—can be directly encoded into a machine learning input representation to produce a system whose outputs are both data-driven and physically interpretable. The five-phase validation roadmap (
Table 9) defines the full empirical programme required to advance this framework from simulation to operational deployment. No forensic deployment is proposed until Phase 5 is complete and a multi-site known error rate has been formally established. Collaboration from specialists in sensor engineering, signal processing, forensic biomechanics, and legal medicine is invited to advance this methodology through its validation programme. The present manuscript reports only theoretical and simulation-based framework behaviour under controlled assumptions and must not be interpreted as evidence of operational forensic accuracy.