Abstract
Newly hired workers in construction, industrialized building production, and other built-environment operations often face elevated safety and ergonomic risk while learning manual tasks. At the same time, many onboarding programs still rely on observation, verbal coaching, and checklist-based sign-off, which can be difficult to standardize across supervisors and sites. This study presents the development and field evaluation of a data-driven training system that integrates markerless motion capture, machine-learning-assisted ergonomic risk scoring, and Lean/continuous-improvement (CI) routines to provide structured coaching during onboarding. A single-site, non-randomized, quasi-experimental sequential-cohort design compared a traditional onboarding cohort with a subsequent app-supported cohort (n = 20 each). Primary outcomes were time to qualification, training cost, and task accuracy. Secondary site-level indicators were safety compliance and musculoskeletal (MSK) injury outcomes. Compared with the traditional cohort, the app-supported cohort reached qualification sooner (5.85 ± 1.50 vs. 18.60 ± 3.50 calendar months), at lower cost per employee (SR 29,250 ± 7,602 vs. SR 93,000 ± 17,348), and with higher task accuracy (88.60 ± 5.70% vs. 60.65 ± 10.60%). Welch’s t-tests showed statistically significant differences in all three primary outcomes (all p < 0.001); however, the standardized effect sizes were very large and should be interpreted cautiously given the modest sample and non-randomized design. Safety compliance (+68%) and MSK injuries (−25%) are reported only as descriptive site-level indicators because denominator and exposure data were not available for inferential analysis. The study contributes a practical intervention model linking ergonomic sensing to coaching cues, auditable training logs, A3 problem solving, and standard work refinement. The findings suggest that this integrated approach is promising for built-environment onboarding, but multi-site studies with stronger comparative designs, individual-level reporting, and fuller algorithmic documentation are needed.
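As an illustration of the Welch comparison and effect-size caveat reported above, the following minimal sketch recomputes the test statistic for time to qualification from the published summary values. It assumes that the "±" values are sample standard deviations and that the effect size is Cohen's d based on the root mean square of the two standard deviations; the abstract states neither, and the function name `welch_from_summary` is hypothetical.

```python
import math


def welch_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t, df, d) for two groups described only by mean, SD, and n.

    Assumes the SDs are sample standard deviations (not stated in the abstract).
    """
    var_a, var_b = sd_a ** 2, sd_b ** 2
    # Squared standard error contributed by each group
    se2_a, se2_b = var_a / n_a, var_b / n_b
    diff = mean_a - mean_b

    # Welch's t statistic: no equal-variance assumption
    t_stat = diff / math.sqrt(se2_a + se2_b)

    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )

    # Cohen's d using the root mean square of the two SDs
    # (an assumed definition, reasonable here because the groups are equal in size)
    d = diff / math.sqrt((var_a + var_b) / 2)
    return t_stat, df, d


# Time to qualification in calendar months: traditional vs. app-supported cohort
t_stat, df, d = welch_from_summary(18.60, 3.50, 20, 5.85, 1.50, 20)
print(f"t = {t_stat:.2f}, df = {df:.1f}, d = {d:.2f}")
```

Under these assumptions, the time-to-qualification comparison yields a t statistic of roughly 15 on about 26 degrees of freedom and a standardized difference of roughly 4.7, which illustrates why the effect sizes are described as very large and warrant cautious interpretation.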