Abstract
Accurate cooling load forecasting is essential for the effective operation of Building Energy Management Systems (BEMSs) and for reducing building-sector carbon emissions. Although Artificial Neural Networks (ANNs), particularly Multi-Layer Perceptrons (MLPs), have shown strong capability in modeling nonlinear thermal dynamics, their reliability in practice is often limited by inappropriate selection of the training algorithm and by poor data quality, including missing values and numerical distortions. To address these limitations, this study empirically investigates how training algorithms and systematic data preprocessing strategies affect the cooling load prediction performance of an MLP model. Ten training algorithms were benchmarked under identical conditions, and the Levenberg–Marquardt (LM) algorithm achieved the lowest prediction error when integrated data preprocessing was applied. In particular, data preprocessing reduced the coefficient of variation of the root mean square error (CvRMSE) from 18.56% to 6.03% during the testing period. Furthermore, the proposed framework mitigated errors in predicting zero load during non-cooling periods and improved prediction accuracy under high-load operating conditions. These results provide practical, quantitative guidance for developing robust data-driven forecasting models applicable to real-time building energy optimization.
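For readers unfamiliar with the headline metric, the following is a minimal sketch of how CvRMSE is commonly computed: the root mean square error divided by the mean of the measured values, expressed as a percentage. The function name `cv_rmse`, the use of n (rather than n minus the number of model parameters) in the denominator, and the sample loads are assumptions made for illustration; they are not taken from the study.

```python
import math


def cv_rmse(measured, predicted):
    """Return CvRMSE in percent: RMSE normalized by the mean measured load.

    Assumption: the RMSE uses n in the denominator. Some guidelines use
    n - p (p = number of model parameters) instead.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    mean_measured = sum(measured) / n
    if mean_measured == 0:
        raise ValueError("mean measured load is zero; CvRMSE is undefined")
    squared_errors = ((m - p) ** 2 for m, p in zip(measured, predicted))
    rmse = math.sqrt(sum(squared_errors) / n)
    return 100.0 * rmse / mean_measured


# Hypothetical hourly cooling loads in kW (illustrative only, not study data).
measured = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 310.0]
print(f"CvRMSE = {cv_rmse(measured, predicted):.2f}%")  # prints "CvRMSE = 5.00%"
```

Because the metric is normalized by the mean measured load, a test period with many zero-load hours has a smaller denominator, so the same absolute error produces a larger CvRMSE.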