A Hybrid AI-Stochastic Framework for Predicting Dynamic Labor Productivity in Sustainable Repetitive Construction Activities
Abstract
1. Introduction
2. Literature Review
2.1. Review of the Labor Productivity Forecasted Model
2.2. Review of the Integration of the Markov Chain with LSTM
2.3. Research Gap
- Real-Time Adaptability Deficit: Current models rely on static historical data and assumptions, failing to dynamically adapt to evolving site conditions, limiting real-time decision support during active construction.
- Cold-Start Problem: Existing approaches require substantial project-specific data for reliable predictions, creating critical forecasting gaps during early project stages when productivity uncertainties are highest and managerial interventions most impactful.
- Insufficient Proactive Management: Conventional methods provide macro-level, delayed assessments that are inadequate for identifying and mitigating productivity losses as they occur, necessitating fine-grained, instantaneous feedback for data-driven, on-the-spot adjustments.
3. Methodology
3.1. Prepare Data
3.1.1. Data Acquisition and Preprocessing
3.1.2. Perform State Transformation Using Percentile Threshold
3.1.3. Establish Transition Probability Matrix (TPM)
3.2. Design a Hybrid MC-LSTM Model
3.2.1. Time Series Processing with LSTM Layer
3.2.2. Markov State Input and Integration
3.2.3. Output Layers
3.2.4. Numerical Illustration of the Hybrid MC-LSTM Prediction Process
- To predict Day 5, the model uses the productivity data up to Day 4, along with the corresponding state classifications for Days 1 through 4.
- For TPM selection, only 4 weeks of data are available, which is fewer than the rolling-window size of 12, so the model defaults to the prior TPM. The prior TPM was pre-calculated from the entire historical dataset of a project similar to the current one, and it supplies the transition probabilities for the cold-start phase. It is used directly alongside the LSTM output. The prior TPM is shown in Table 3.
- Feature extraction produces two inputs: time-series features and Markov features.
- For the time-series features, the last N_STEPS = 3 average productivity values are extracted. For Day 5, these are the values from Days 2, 3, and 4: [0.135, 0.128, 0.131]. These raw values are then scaled using the minimum and maximum values fitted to the historical training data and reshaped for the LSTM input layer.
- For the Markov features, the state of the last available day (Day 4) determines which row of the chosen TPM is selected. In this example, the state for Day 4 is S1 (Low), so the Markov features are the transition probabilities from S1 (Low) in the prior TPM: [0.80, 0.15, 0.05], as shown in Table 2. This vector is reshaped for the hybrid model’s input.
- The scaled time-series features and Markov features are fed into the trained MC-LSTM model. The LSTM processes the time series, and its output is concatenated with the Markov features. This combined input then passes through dense layers, culminating in a single predicted average productivity value for Day 5.
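
The input-preparation steps above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `select_tpm` and `build_inputs` are hypothetical, the scaler bounds are assumed to be the minimum and maximum of the tabulated 20-day series, and the trained MC-LSTM network itself is not reproduced.

```python
import numpy as np

N_STEPS = 3          # look-back length stated in the text
ROLLING_WINDOW = 12  # rolling-window size stated in the text

# Prior TPM as tabulated in the article (rows: from-state S1..S3, columns: to-state S1..S3).
PRIOR_TPM = np.array([
    [0.800, 0.150, 0.050],
    [0.187, 0.687, 0.125],
    [0.142, 0.428, 0.428],
])

def select_tpm(n_obs, prior_tpm, rolling_tpm=None):
    """Cold-start rule: use the prior TPM until a full rolling window exists."""
    if rolling_tpm is None or n_obs < ROLLING_WINDOW:
        return prior_tpm
    return rolling_tpm

def build_inputs(history, last_state, tpm, y_min, y_max):
    """Return the (time-series, Markov) input pair for one prediction step."""
    window = np.asarray(history[-N_STEPS:], dtype=float)
    scaled = (window - y_min) / (y_max - y_min)     # min-max scaling
    ts_input = scaled.reshape(1, N_STEPS, 1)        # (batch, time steps, features)
    markov_input = tpm[last_state].reshape(1, -1)   # TPM row of the last observed state
    return ts_input, markov_input

# Day 1 from the productivity table; Days 2-4 as quoted in the text.
history = [0.154, 0.135, 0.128, 0.131]
Y_MIN, Y_MAX = 0.109, 0.192  # ASSUMED scaler bounds, not reported in the text

tpm = select_tpm(len(history), PRIOR_TPM)
ts_input, markov_input = build_inputs(history, last_state=0, tpm=tpm,
                                      y_min=Y_MIN, y_max=Y_MAX)
print(ts_input.shape, markov_input)
```

In a dual-input network, these two arrays would be passed to the model together: the first to the LSTM branch and the second to the concatenation step that follows it.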
3.3. Perform Model Validation
4. Results and Discussion
4.1. Description of the Collected Case Studies
4.2. Assessment of the Hybrid Model for Training and Testing Data
4.3. Performance Comparison of MC-LSTM and LSTM Models
4.4. Accuracy Comparison with Previous Study
4.5. Performance of the Hybrid Model
4.6. Sensitivity Analysis
5. Limitations of the Study and Future Work
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. LSTM Gates
Appendix B. Model Implementation and Configuration
Appendix B.1. LSTM Layer and Input Configuration
Appendix B.2. Post-Concatenation Network Architecture
- Dense Layer 1: Contains 25 neurons with ReLU activation (weight matrix W1: 27 × 25, bias b1: 25 × 1), yielding 700 trainable parameters.
- Dense Layer 2 (Output Layer): Contains one neuron with linear activation (W2: 25 × 1, b2: scalar), yielding 26 parameters.
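
The parameter counts quoted for the two dense layers follow from the usual rule for a fully connected layer: one weight per input-output pair plus one bias per output. A short check, using a hypothetical helper `dense_params`:

```python
def dense_params(n_in, n_out):
    """Trainable parameters of a dense layer: weights (n_in * n_out) plus biases (n_out)."""
    return n_in * n_out + n_out

layer1 = dense_params(27, 25)  # 27-element concatenated input -> 25 neurons
layer2 = dense_params(25, 1)   # 25 neurons -> single linear output
print(layer1, layer2, layer1 + layer2)  # 700 26 726
```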
Appendix B.3. Software and Training Specifics
References
1. United Nations Environment Programme. 2019 Global Status Report for Buildings and Construction: Towards a Zero-Emissions, Efficient and Resilient Buildings and Construction Sector. 2019. Available online: https://globalabc.org/sites/default/files/2020-03/GSR2019.pdf (accessed on 30 October 2025).
2. BuiltFront. How to Calculate Labor Cost in Construction (Formula & Easy Guide). 2025. Available online: https://builtfront.com/blog/construction-labor-cost/ (accessed on 4 August 2025).
3. Gonzalez, K. Labor Costs in Construction: A Comprehensive Guide. Available online: https://www.workyard.com/construction-management/construction-labor-costs (accessed on 3 September 2025).
4. Goolsbee, A.; Syverson, C. The Strange and Awful Path of Productivity in the US Construction Sector; National Bureau of Economic Research: Cambridge, MA, USA, 2023.
5. Mischke, J.; Stokvis, K.; Vermeltfoort, K. Delivering on Construction Productivity is no Longer Optional; McKinsey & Company: New York, NY, USA, 2024.
6. Khanh, H.D.; Kim, S.Y. Determining labor productivity diagram in high-rise building using straight-line model. KSCE J. Civ. Eng. 2014, 18, 898–908.
7. Portas, J.; Abourizk, S. Neural network model for estimating construction productivity. J. Constr. Eng. Manag. 1997, 123, 399–410.
8. Heravi, G.; Eslamdoost, E. Applying Artificial Neural Networks for Measuring and Predicting Construction-Labor Productivity. J. Constr. Eng. Manag. 2015, 141, 04015032. Available online: https://ascelibrary.org/doi/full/10.1061/%28ASCE%29CO.1943-7862.0001006?casa_token=gqIKXp9yx7AAAAAA%3AiEGGqWLKZZvIpntGjTrbYq3APdyb3Yd0Jiy57YmIkzBCeywwNc3LcZKu4c_1rqa7B9wEu2yb9vM (accessed on 25 September 2025).
9. Kim, H.; Lee, H.-S.; Park, M.; Ahn, C.R.; Hwang, S. Productivity Forecasting of Newly Added Workers Based on Time-Series Analysis and Site Learning. J. Constr. Eng. Manag. 2015, 141, 05015008.
10. Jacobsen, E.L.; Teizer, J.; Wandahl, S.; Brilakis, I. Probabilistic forecasting of construction labor productivity metrics. J. Inf. Technol. Constr. 2024, 24, 58–83.
11. Nasirzadeh, F.; Nojedehi, P. Dynamic modeling of labor productivity in construction projects. Int. J. Proj. Manag. 2013, 31, 903–911.
12. Rahal, M.; Khoury, H. A Mathematical Model for Quantifying Workers’ Learning Range on Repetitive Construction Projects. Periodica Polytechnica Budapest University of Technology and Economics. In Proceedings of the Creative Construction Conference 2019, Budapest, Hungary, 29 June–2 July 2019; pp. 540–546.
13. Dabirian, S.; Moussazadeh, M.; Khanzadi, M.; Abbaspour, S. Predicting the effects of congestion on labour productivity in construction projects using agent-based modelling. Int. J. Constr. Manag. 2023, 23, 606–618.
14. Ko, Y.; Kim, Y.; Noh, J.; Lee, K.; Shin, D.; Han, S. Methodology for construction-standard-production-rate-based simulation modeling and production-rate data generation. J. Asian Archit. Build. Eng. 2024, 23, 725–739.
15. Shanmuganathan, V.; Suresh, A. Markov enhanced I-LSTM approach for effective anomaly detection for time series sensor data. Int. J. Intell. Netw. 2024, 5, 154–160.
16. Sengupta, A.; Das, A.; Guler, S.I. Hybrid hidden Markov LSTM for short-term traffic flow prediction. arXiv 2023, arXiv:2307.04954.
17. Thomas, H.R.; Mathews, C.T.; Ward, J.G. Learning curve models of construction productivity. J. Constr. Eng. Manag. 1986, 112, 245–258.
18. Mirahadi, F.; Zayed, T. Simulation-based construction productivity forecast using neural-network-driven fuzzy reasoning. Autom. Constr. 2016, 60, 102–115.
19. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
20. Nguyen-Le, D.H.; Tao, Q.B.; Nguyen, V.H.; Abdel-Wahab, M.; Nguyen-Xuan, H. A data-driven approach based on long short-term memory and hidden Markov model for crack propagation prediction. Eng. Fract. Mech. 2020, 235, 107085.
21. Ren, C.; Gu, J.; Tian, S.; Zhou, J.; Shi, S.; Fu, Y. A Short-Term Rolling Prediction-Correction Method for Wind Power Output Based on LSTM and Markov Chain. In Proceedings of the 2021 IEEE IAS Industrial and Commercial Power System Asia, I&CPS Asia 2021, Chengdu, China, 18–21 July 2021; pp. 574–580.
22. Wang, P.; Wang, H.; Zhang, H.; Lu, F.; Wu, S. A hybrid Markov and LSTM model for indoor location prediction. IEEE Access 2019, 7, 185928–185940.
23. Tadayon, M.; Pottie, G. Comparative Analysis of the Hidden Markov Model and LSTM: A Simulative Approach. arXiv 2020, arXiv:2008.03825.
24. Dougherty, J.; Kohavi, R.; Sahami, M. Supervised and unsupervised discretization of continuous features. In Proceedings of the 12th International Conference on Machine Learning, Tahoe City, CA, USA, 9–12 July 1995; pp. 194–202.
25. Bishop, C.M. Neural Networks for Pattern Recognition; Oxford University Press: Oxford, UK, 1995.
26. MathWorks. Rolling-Window Analysis of Time-Series Models. MATLAB Help Center. Available online: https://www.mathworks.com/help/econ/rolling-window-estimation-of-state-space-models.html (accessed on 25 October 2025).
27. Hwang, S.; Liu, L.Y. Proactive project control using productivity data and time series analysis. In Proceedings of the Computing in Civil Engineering, Cancun, Mexico, 12–15 July 2005.
28. Montaño Moreno, J.J.; Palmer Pol, A.; Sesé Abad, A.; Cajal Blasco, B. Using the R-MAPE index as a resistant measure of forecast accuracy. Psicothema 2013, 25, 500–506.
29. Siami-Namini, S.; Tavakoli, N.; Namin, A.S. The performance of LSTM and BiLSTM in forecasting time series. In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 9–12 December 2019; pp. 3285–3292.
30. Song, Y.; Cai, C.; Ma, D.; Li, C. Modelling and forecasting high-frequency data with jumps based on a hybrid nonparametric regression and LSTM model. Expert Syst. Appl. 2024, 237, 121527.
31. Wang, C.; Li, X.; Shi, Y.; Jiang, W.; Song, Q.; Li, X. Load forecasting method based on CNN and extended LSTM. Energy Rep. 2024, 12, 2452–2461.
32. Han, Y.; Du, Z.; Geng, Z.; Fan, J.; Wang, Y. Novel long short-term memory neural network considering virtual data generation for production prediction and energy structure optimization of ethylene production processes. Chem. Eng. Sci. 2023, 267, 118372.
33. Jeong, J.; Jeong, J.; Lee, J.; Kim, D.; Son, J. Learning-driven construction productivity prediction for prefabricated external insulation wall system. Autom. Constr. 2022, 141, 104441.
34. Jarkas, A.M.; Bitar, C.G. Factors affecting construction labor productivity in Kuwait. J. Constr. Eng. Manag. 2012, 138, 811–820.
35. Golnaraghi, S.; Zangenehmadar, Z.; Moselhi, O.; Alkass, S. Application of artificial neural network(s) in predicting formwork labour productivity. Adv. Civ. Eng. 2019, 2019, 5972620.
36. Tsehayae, A.A.; Fayek, A.R. System model for analysing construction labour productivity. Constr. Innov. 2016, 16, 203–228.
37. AbouRizk, S.M.; Halpin, D.W.; Wilson, J.R. Fitting beta distributions based on sample data. J. Constr. Eng. Manag. 1994, 120, 288–305.
38. Heravi, G.; Jafari, A. Cost of Quality Evaluation in Mass-Housing Projects in Developing Countries. J. Constr. Eng. Manag. 2014, 140, 04014004.
39. Graves, A. Generating Sequences With Recurrent Neural Networks. 2014. Available online: http://arxiv.org/abs/1308.0850 (accessed on 25 September 2025).
40. Kalicinsky, C.; Reisch, R.; Knieling, P.; Koppmann, R. Determination of time-varying periodicities in unequally spaced time series of OH* temperatures using a moving Lomb-Scargle periodogram and a fast calculation of the false alarm probabilities. Atmos. Meas. Tech. 2020, 13, 467–477.

| Model | Primary Mechanism | Key Advantage | Limitation | Reference |
|---|---|---|---|---|
| Learning Curve | Empirical/straight-line functions | Captures long-term learning trends in repetitive tasks | Assumes predictable learning; struggles with real-time disruptions/volatility | [17] |
| ANN | Multilayer perceptron | Handles complex, nonlinear factor relationships | Relies on broad historical data; lacks real-time site adaptation | [18] |
| ARIMA/Time-Series | Autoregressive and moving average components | Identifies temporal patterns; good for historical trend analysis | Struggles to adapt to sudden, unforeseen, real-time changes | [9] |
| General LSTM | Recurrent Neural Network (RNN) with gates | Excels at long-term temporal dependencies | Requires substantial time-series data; susceptible to the cold-start problem | [19] |
| HMM-LSTM Hybrids (General) | HMM for sequence/state, LSTM for long-term data | Learns with limited data (HMM); good for complementary features (traffic/sensor data) | Not explicitly tailored or validated for construction productivity, state-based volatility, and cold-start adaptation using Bayesian-smoothed TPM in a dual-input architecture | [16] |
| Proposed MC-LSTM | LSTM + Bayesian-adjusted Markov chain | Dual input captures temporal trend and probabilistic state transition (volatility) | Provides accurate, real-time predictions under data-limited (cold start) conditions | |

| Day | Average Productivity | Day | Average Productivity |
|---|---|---|---|
| 1 | 0.154 | 11 | 0.172 |
| 2 | 0.135 | 12 | 0.192 |
| 3 | 0.128 | 13 | 0.128 |
| 4 | 0.132 | 14 | 0.114 |
| 5 | 0.118 | 15 | 0.125 |
| 6 | 0.109 | 16 | 0.120 |
| 7 | 0.128 | 17 | 0.139 |
| 8 | 0.132 | 18 | 0.128 |
| 9 | 0.139 | 19 | 0.154 |
| 10 | 0.147 | 20 | 0.167 |

| From/To | S1 | S2 | S3 |
|---|---|---|---|
| S1 | 0.8 | 0.15 | 0.05 |
| S2 | 0.187 | 0.687 | 0.125 |
| S3 | 0.142 | 0.428 | 0.428 |
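
A transition matrix of this shape can be estimated from a discretized productivity series. The sketch below is illustrative only: the 33rd/67th percentile thresholds and the additive pseudo-count `alpha` (standing in for the Bayesian adjustment) are assumptions, the function names are hypothetical, and the result is not expected to reproduce the tabulated values, which were computed from a separate historical dataset.

```python
import numpy as np

# The 20-day average productivity series from the table above.
productivity = np.array([
    0.154, 0.135, 0.128, 0.132, 0.118, 0.109, 0.128, 0.132, 0.139, 0.147,
    0.172, 0.192, 0.128, 0.114, 0.125, 0.120, 0.139, 0.128, 0.154, 0.167,
])

def to_states(values, lower_pct=33.0, upper_pct=67.0):
    """Map each value to a state index (0 = S1, 1 = S2, 2 = S3) by percentile thresholds."""
    lo, hi = np.percentile(values, [lower_pct, upper_pct])  # ASSUMED thresholds
    return np.digitize(values, [lo, hi])

def estimate_tpm(states, n_states=3, alpha=1.0):
    """Count day-to-day transitions, add a pseudo-count, and normalize each row."""
    counts = np.full((n_states, n_states), alpha, dtype=float)
    for src, dst in zip(states[:-1], states[1:]):
        counts[src, dst] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

states = to_states(productivity)
tpm = estimate_tpm(states)
print(states)
print(tpm.round(3))
```

The pseudo-count keeps every transition probability strictly positive, which matters when only a few days of data are available and some transitions have not yet been observed.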

| No. | Case Study | Description | Unit Time | Productivity Unit | Reference |
|---|---|---|---|---|---|
| 1 | First case study | The time series comes from the actual project reported by Kim et al. [9], which consists of six high-rise buildings. Due to labor insensitivity, a formwork activity was selected as the case activity. | Work day | Man-hour/m² | [9] |
| 2 | Second case study | The daily productivity of the brick-laying operation at a building construction site in Champaign, Illinois, was collected. | Work day | Square feet/Man-day | [27] |
| 3 | Third case study | The daily productivity of steel erection was collected. | Work day | Man-day/unit | [27] |
| 4 | Fourth case study | Daily productivity of six high-rise buildings from 24 November to 15 December. | Work day | Man-hour/m² | [9] |

| No. | Range | Description |
|---|---|---|
| 1 | MAPE ≤ 0.1 | Excellent |
| 2 | 0.1 < MAPE ≤ 0.2 | Good |
| 3 | 0.2 < MAPE ≤ 0.5 | Acceptable/Reasonable |
| 4 | MAPE > 0.5 | Poor |
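
The rating scale above can be applied programmatically. The following is a minimal sketch with MAPE expressed as a fraction, matching the table; the function names are hypothetical and the predicted values are made up purely for illustration.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.1 means 10%)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)))

def rate_mape(value):
    """Map a MAPE value to the qualitative rating in the table above."""
    if value <= 0.1:
        return "Excellent"
    if value <= 0.2:
        return "Good"
    if value <= 0.5:
        return "Acceptable/Reasonable"
    return "Poor"

actual = [0.118, 0.109, 0.128, 0.132]     # Days 5-8 from the productivity table
predicted = [0.120, 0.115, 0.125, 0.130]  # HYPOTHETICAL forecasts, for illustration only
score = mape(actual, predicted)
print(round(score, 4), rate_mape(score))
```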

| Metric | Case Study 1 | Case Study 2 | Case Study 3 | Case Study 4 | Average |
|---|---|---|---|---|---|
| Training Time (s) | | | | | |
| LSTM | 12.4 | 15.8 | 14.2 | 13.6 | 14.0 |
| MC-LSTM | 14.8 | 18.3 | 16.7 | 15.9 | 16.4 |
| Overhead | +2.4 (19.4%) | +2.5 (15.8%) | +2.5 (17.6%) | +2.3 (16.9%) | +2.4 (17.1%) |
| Inference Time per Prediction (ms) | | | | | |
| LSTM | 3.2 | 3.4 | 3.3 | 3.1 | 3.25 |
| MC-LSTM | 4.1 | 4.3 | 4.2 | 4.0 | 4.15 |
| Overhead | +0.9 (28.1%) | +0.9 (26.5%) | +0.9 (27.3%) | +0.9 (29.0%) | +0.9 (27.7%) |
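
The overhead rows are the absolute difference between the two models and that difference relative to the plain LSTM baseline. A short sketch that recomputes them from the tabulated training times (the helper name `overhead` is hypothetical):

```python
def overhead(baseline, hybrid):
    """Return (absolute difference, difference as a percentage of the baseline)."""
    delta = hybrid - baseline
    return delta, 100.0 * delta / baseline

# (LSTM, MC-LSTM) training times in seconds, from the table above.
training = {
    "Case Study 1": (12.4, 14.8),
    "Case Study 2": (15.8, 18.3),
    "Case Study 3": (14.2, 16.7),
    "Case Study 4": (13.6, 15.9),
}

for name, (lstm, mc_lstm) in training.items():
    delta, pct = overhead(lstm, mc_lstm)
    print(f"{name}: +{delta:.1f} s ({pct:.1f}%)")
```

The same function applies to the inference times when the values are given in milliseconds.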

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Alsanabani, N.; Al-Gahtani, K.; Altuwaim, A.; Bin Mahmoud, A. A Hybrid AI-Stochastic Framework for Predicting Dynamic Labor Productivity in Sustainable Repetitive Construction Activities. Sustainability 2025, 17, 11097. https://doi.org/10.3390/su172411097

