Advanced Machine Learning Techniques for Energy Consumption Analysis and Optimization at UBC Campus: Correlations with Meteorological Variables
Abstract
1. Introduction
2. Literature Review and Background
3. Research Methodology
3.1. Data Collection
3.2. Data Pre-Processing
- Time-Dependent Data Frames: Capturing temporal patterns in energy usage.
- Non-Time-Dependent Data Frames: Providing static information about building characteristics.
- Weather Data Frames: Offering environmental context, including meteorological conditions.
- Electricity Energy Data Frames: Detailing power usage across the campus.
- Gas Volume Data Frames: Indicating gas consumption patterns.
- Hot Water Energy Data Frames: Tracking hot water usage.
- Steam Volume Data Frames: Recording steam consumption.
- Water Volume Data Frames: Monitoring general water usage.
- Total Data Frames: Consolidating comprehensive building energy data for each building in the target year 2023.
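As a minimal illustration of how these separate frames could be consolidated into a total frame, consider the following pandas sketch. The column names, units, and daily frequency are hypothetical; the actual schema is not given in this outline.

```python
import pandas as pd

# Hypothetical daily frames for one building in the target year 2023,
# each indexed by timestamp.
idx = pd.date_range("2023-01-01", periods=7, freq="D")
weather = pd.DataFrame({"temperature_c": [2.0, 3.5, 1.0, 4.2, 5.0, 3.8, 2.9]}, index=idx)
electricity = pd.DataFrame({"electricity_kwh": [410, 398, 450, 392, 385, 401, 420]}, index=idx)
hot_water = pd.DataFrame({"hot_water_kw": [120, 118, 135, 110, 105, 112, 125]}, index=idx)

# "Total" frame: align all meters and weather on the shared timestamp index,
# keeping only timestamps present in every frame.
total = pd.concat([weather, electricity, hot_water], axis=1, join="inner")
print(total.shape)  # (7, 3)
```

An inner join drops timestamps missing from any meter, which keeps the consolidated frame free of partially observed rows.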
3.3. Model Testing and Evaluation
- R-squared (R2): This statistic measures the proportion of variance in the dependent variable (energy consumption) explained by the independent variables (features) in the model. It ranges from 0 to 1, with higher values indicating a better fit between the model and the data: a higher R2 means the model accounts for a larger share of the variance in the target variable, reflecting greater explanatory power. Despite its benefits, R2 should be interpreted cautiously and alongside other metrics to avoid overestimating model performance. Similar studies, including [7], have employed R2 to assess a model’s ability to capture the variability in energy consumption.
- Mean Absolute Error (MAE): MAE quantifies the average magnitude of errors between actual and predicted values, providing a clear indication of model accuracy. For instance, similar studies by [3,20] utilized MAE to benchmark model performance in energy forecasting tasks. Lower MAE values signify more accurate predictions, indicating that the model predictions are closer to the actual values. The advantages of MAE are its simplicity and ease of interpretation; however, it does not penalize larger errors more than smaller ones.
- Root Mean Square Error (RMSE): RMSE is the square root of the average squared difference between actual and predicted values. It emphasizes large errors more than MAE does, since squaring penalizes larger discrepancies more heavily. Lower RMSE values indicate better model performance, with smaller deviations from the actual values. Its application in previous works, such as [21], underscores its effectiveness in evaluating model accuracy, although this same squaring makes it sensitive to outliers.
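The three metrics above can be sketched directly from their definitions. The NumPy implementation below is illustrative (the paper does not specify its tooling), with small made-up consumption values:

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Proportion of variance in y_true explained by the predictions."""
    ss_res = np.sum((y_true - y_pred) ** 2)           # residual sum of squares
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Average absolute error; weighs all errors equally."""
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large errors more heavily than MAE."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

actual = np.array([10.0, 12.0, 14.0, 16.0])     # illustrative consumption values
predicted = np.array([11.0, 12.0, 13.0, 17.0])
print(mae(actual, predicted))             # 0.75
print(round(rmse(actual, predicted), 3))  # 0.866
print(r2_score(actual, predicted))        # 0.85
```

Note that RMSE exceeds MAE here (0.866 vs. 0.75) even though every absolute error is at most 1, reflecting RMSE's heavier weighting of the larger errors.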
4. Results
4.1. Correlation between the Parameters
4.2. Electrical Energy
4.3. Hot Water Power
4.4. Gas Volume
4.5. Synthesis of the Predictive Accuracy for the Different Data Sets
4.6. Analysis of Training and Test Loss Graphs
1. Electrical Energy Model
2. Hot Water Model
3. Gas Volume Model
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- United Nations Environment Programme. 2020 Global Status Report for Buildings and Constructions: Towards a Zero-emission, Efficient and Resilient Buildings and Constructions Sector; United Nations Environmental Programme: Nairobi, Kenya, 2020. [Google Scholar]
- Hung, N.T. Data-driven predictive models for daily electricity consumption of academic buildings. AIMS Energy 2020, 8, 783–801. [Google Scholar]
- Han, B.; Zhang, S.; Qin, L.; Wang, X.; Liu, Y.; Li, Z. Comparison of support vector machine, Gaussian process regression and decision tree models for energy consumption prediction of campus buildings. In Proceedings of the 2022 8th International Conference on Hydraulic and Civil Engineering: Deep Space Intelligent Development and Utilization Forum (ICHCE), Xi’an, China, 25–27 November 2022; IEEE: New York, NY, USA; pp. 689–693. [Google Scholar]
- Shahcheraghian, A.; Madani, H.; Ilinca, A. From white to black-box models: A review of simulation tools for building energy management and their application in consulting practices. Energies 2024, 17, 376. [Google Scholar] [CrossRef]
- Ding, Y.; Liu, X. A comparative analysis of data-driven methods in building energy benchmarking. Energy Build. 2020, 209, 109711. [Google Scholar] [CrossRef]
- Amber, K.P.; Aslam, M.W.; Mahmood, A.; Kousar, A.; Younis, M.Y.; Akbar, B.; Chaudhary, G.Q.; Hussain, S.K. Energy consumption forecasting for university sector buildings. Energies 2017, 10, 1579. [Google Scholar] [CrossRef]
- Sadeghian Broujeny, R.; Ben Ayed, S.; Matalah, M. Energy Consumption Forecasting in a University Office by Artificial Intelligence Techniques: An Analysis of the Exogenous Data Effect on the Modeling. Energies 2023, 16, 4065. [Google Scholar] [CrossRef]
- Lei, L.; Chen, W.; Wu, B.; Chen, C.; Liu, W. A building energy consumption prediction model based on rough set theory and deep learning algorithms. Energy Build. 2021, 240, 110886. [Google Scholar] [CrossRef]
- Fan, C.; Xiao, F.; Zhao, Y. A short-term building cooling load prediction method using deep learning algorithms. Appl. Energy 2017, 195, 222–223. [Google Scholar] [CrossRef]
- Chou, J.-S.; Tran, D.-S. Forecasting energy consumption time series using machine learning techniques based on usage patterns of residential householders. Energy 2018, 165, 709–726. [Google Scholar] [CrossRef]
- Jarquin, C.S.S.; Gandelli, A.; Grimaccia, F.; Mussetta, M. Short-Term Probabilistic Load Forecasting in University Buildings by Means of Artificial Neural Networks. Forecasting 2023, 5, 390–404. [Google Scholar] [CrossRef]
- Gelažanskas, L.; Gamage, K.A. Forecasting hot water consumption in residential houses. Energies 2015, 8, 12702–12717. [Google Scholar] [CrossRef]
- Liu, J.; Wang, S.; Wei, N.; Chen, X.; Xie, H.; Wang, J. Natural gas consumption forecasting: A discussion on forecasting history and future challenges. J. Nat. Gas Sci. Eng. 2021, 90, 103930. [Google Scholar] [CrossRef]
- UBC. Building Energy and Water Data. Available online: https://energy.ubc.ca/energy-and-water-data/skyspark/ (accessed on 22 April 2024).
- UBC. SkySpark. Available online: https://energy.ubc.ca/projects/skyspark/ (accessed on 22 April 2024).
- UBC. Building Management Systems (BMS). Available online: https://energy.ubc.ca/ubcs-utility-infrastructure/building-management-systems-bms/ (accessed on 22 April 2024).
- UBC. UBC Address Map. Available online: https://planning.ubc.ca/about-us/campus-maps (accessed on 21 June 2024).
- Brownlee, J. How to Decompose Time Series Data into Trend and Seasonality. Available online: https://machinelearningmastery.com/decompose-time-series-data-trend-seasonality/ (accessed on 25 April 2024).
- Venujkvenk. Exploring Time Series Data: Unveiling Trends, Seasonality, and Residuals. Available online: https://medium.com/@venujkvenk/exploring-time-series-data-unveiling-trends-seasonality-and-residuals-5cace823aff1 (accessed on 25 April 2024).
- Wu, W.; Deng, Q.; Shan, X.; Miao, L.; Wang, R.; Ren, Z. Short-Term Forecasting of Daily Electricity of Different Campus Building Clusters Based on a Combined Forecasting Model. Buildings 2023, 13, 2721. [Google Scholar] [CrossRef]
- Dong, W.; Sun, H.; Li, Z.; Yang, H. Design and optimal scheduling of forecasting-based campus multi-energy complementary energy system. Energy 2024, 309, 133088. [Google Scholar] [CrossRef]
- UBC. Hot Water District Energy System. Available online: https://energy.ubc.ca/ubcs-utility-infrastructure/district-energy-hot-water/ (accessed on 12 July 2024).
Model | Best Parameters |
---|---|
Decision Tree Regressor | {'max_depth': 20, 'min_samples_split': 5, 'min_samples_leaf': 2, 'max_features': 'sqrt'} |
Random Forest Regressor | {'n_estimators': 100, 'max_depth': 20, 'min_samples_split': 2, 'min_samples_leaf': 4, 'max_features': 'auto'} |
Gradient Boosting Regressor | {'n_estimators': 100, 'learning_rate': 0.1, 'max_depth': 5, 'subsample': 0.9} |
AdaBoost Regressor | {'n_estimators': 100, 'learning_rate': 0.1, 'loss': 'linear'} |
Linear Regression | {'fit_intercept': True, 'normalize': False} |
Ridge Regression | {'alpha': 1.0, 'solver': 'auto'} |
Lasso Regression | {'alpha': 0.1, 'selection': 'cyclic'} |
Support Vector Regression | {'C': 1.0, 'epsilon': 0.1, 'kernel': 'rbf', 'degree': 3} |
K-Neighbors Regressor | {'n_neighbors': 5, 'weights': 'uniform', 'p': 2, 'algorithm': 'auto'} |
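Parameter sets of this kind are typically found by an exhaustive cross-validated search. The sketch below uses scikit-learn's GridSearchCV on synthetic data; the feature matrix, grid values, and scoring choice are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the feature matrix (e.g. temperature plus a calendar feature).
rng = np.random.default_rng(0)
X = rng.uniform(-5.0, 30.0, size=(300, 2))
y = 50.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0.0, 1.0, 300)

# Abbreviated grid over the same hyperparameters tuned for the decision tree.
param_grid = {
    "max_depth": [5, 10, 20],
    "min_samples_split": [2, 5],
    "min_samples_leaf": [2, 4],
    "max_features": ["sqrt", None],
}
search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_absolute_error",  # lower MAE is better
    cv=3,
)
search.fit(X, y)
print(search.best_params_)  # best combination found on this synthetic data
```

Each model family would get its own grid; `search.best_params_` then yields a dictionary of the form tabulated above.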
Category | Hyperparameter | Value/Description |
---|---|---|
Layer Structure | First Layer | Dense layer with 128 neurons, 'tanh' activation function |
Layer Structure | Second Layer | Dense layer with 64 neurons, 'tanh' activation function |
Layer Structure | Third Layer | Dense layer with 64 neurons, 'tanh' activation function |
Layer Structure | Fourth Layer | Dense layer with 32 neurons, 'tanh' activation function |
Layer Structure | Output Layer | Dense layer with 1 neuron for regression output |
Activation Function | Activation Function | 'tanh' (used in all hidden layers). The 'tanh' activation function was chosen for its ability to introduce nonlinearity into the model, which is crucial for capturing complex patterns in energy consumption data. It outputs values between −1 and 1, centering the data and helping to address the vanishing gradient problem. Additionally, 'tanh' handles negative inputs effectively, which aligns with the characteristics of our dataset. This choice helps improve the model's learning dynamics and performance. |
Regularization | Dropout Rate | 0.4 (applied after each BatchNormalization layer) |
Regularization | Kernel Regularizer | L2 regularization with factor 0.01 (applied to the weights of the second, third, and fourth Dense layers) |
Optimization | Optimizer | Nadam |
Optimization | Learning Rate | 0.0005 |
Optimization | Learning Rate Scheduler | ReduceLROnPlateau (reduces learning rate by 0.5 if validation loss does not improve for 10 epochs; minimum learning rate 1 × 10⁻⁶) |
Training Configuration | Batch Size | 16 |
Training Configuration | Epochs | 100 |
Callbacks | Custom Callback | TestLossCallback (tracks test loss at the end of each epoch) |
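The tabulated architecture can be sketched in code. The Keras/TensorFlow framework and the input dimension (here 8) are assumptions not stated in the table; everything else follows the hyperparameters above.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

def build_model(n_features=8):  # n_features is a hypothetical input dimension
    l2 = regularizers.l2(0.01)  # applied to the 2nd-4th hidden layers per the table
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        layers.Dense(128, activation="tanh"),
        layers.BatchNormalization(),
        layers.Dropout(0.4),
        layers.Dense(64, activation="tanh", kernel_regularizer=l2),
        layers.BatchNormalization(),
        layers.Dropout(0.4),
        layers.Dense(64, activation="tanh", kernel_regularizer=l2),
        layers.BatchNormalization(),
        layers.Dropout(0.4),
        layers.Dense(32, activation="tanh", kernel_regularizer=l2),
        layers.BatchNormalization(),
        layers.Dropout(0.4),
        layers.Dense(1),  # single-neuron regression output
    ])
    model.compile(optimizer=keras.optimizers.Nadam(learning_rate=5e-4), loss="mse")
    return model

# Learning-rate schedule as described: halve on a 10-epoch validation-loss plateau.
lr_schedule = keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.5, patience=10, min_lr=1e-6)
```

The schedule would then be passed alongside the custom TestLossCallback in the `callbacks` argument of `model.fit`, with `batch_size=16` and `epochs=100`.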
Metric Pair | Correlation | Interpretation |
---|---|---|
Electricity Energy and Hot Water Power | 0.38 (Moderate Positive) | Increases in electricity usage are moderately associated with increases in hot water power consumption. |
Electricity Energy and Gas Volume | 0.25 (Weaker Positive) | Weaker positive relationship; changes in electricity usage have a less pronounced effect on gas volume. |
Electricity Energy and Steam Volume | 0.31 (Moderate Positive) | Moderate positive correlation; notable relationship between electricity usage and steam volume. |
Hot Water Power and Steam Volume | 0.75 (Strong Positive) | Strong positive correlation and significant relationship due to UBC’s medium-temperature hot water system. |
Hot Water Power and Gas Volume | 0.34 (Moderate Positive) | Moderate positive relationship between hot water power consumption and gas volume. |
Gas Volume and Steam Volume | 0.49 (Moderate Positive) | Moderate positive correlation; considerable relationship between gas and steam volumes. |
Seasonal Effects and Temperature | 0.19 (Weak Positive) | Weak positive correlation; mild influence of seasonal variations on temperature fluctuations. |
Temperature and Hot Water Power | −0.69 (Strong Negative) | Strong negative correlation; warmer temperatures are associated with reduced hot water power usage. |
Temperature and Steam Volume | −0.60 (Strong Negative) | Strong negative correlation; warmer temperatures are associated with reduced steam volume. |
Temperature and Electricity Energy | −0.27 (Moderate Negative) | Moderate negative correlation; warmer temperatures are associated with a noticeable reduction in electricity use. |
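Coefficients of this kind are standard pairwise Pearson correlations. A minimal sketch of how such a matrix might be computed follows; the data is synthetic and the column names are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
temp = rng.normal(10.0, 8.0, 365)  # daily mean temperature, degrees C
df = pd.DataFrame({
    "temperature": temp,
    # Heating loads fall as temperature rises, giving negative correlations
    # of the same sign as those tabulated above.
    "hot_water_power": 500.0 - 20.0 * temp + rng.normal(0.0, 80.0, 365),
    "electricity_energy": 300.0 - 3.0 * temp + rng.normal(0.0, 40.0, 365),
})
corr = df.corr(method="pearson")  # symmetric matrix with 1.0 on the diagonal
print(corr.round(2))
```

On real data the frame would simply hold the aligned meter and weather columns from the consolidated building data set.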
Model | Mean Absolute Error (MAE) | Coefficient of Determination (R2) |
---|---|---|
Deep Neural Networks (DNN) | 0.15 | 0.98 |
Decision Tree Regressor | 0.18 | 0.92 |
Random Forest Regressor | 0.2 | 0.96 |
Gradient Boosting Regressor | 1.03 | 0.5 |
AdaBoost Regressor | 1.22 | 0.34 |
Linear Regression | >1 | Lower values |
Ridge Regression | >1 | Lower values |
Lasso Regression | >1 | Lower values |
Support Vector Regression | >1 | Lower values |
K-Neighbors Regressor | >1 | Lower values |
Model | MAE | R2 |
---|---|---|
Neural Network | 3570 | 0.88 |
Decision Tree Regressor | 2038.71 | 0.89 |
Random Forest Regressor | 1852.85 | 0.91 |
Gradient Boosting Regressor | 2660.06 | Moderate |
AdaBoost Regressor | 3986.78 | Moderate |
Linear Regression | ~5298.00 | ~0.51 |
Ridge Regression | ~5298.00 | ~0.51 |
Lasso Regression | ~5298.00 | ~0.51 |
Support Vector Regression | 8222.55 | Lowest |
KNeighbors Regressor | 2413.37 | 0.88 |
Model | MAE | R2 |
---|---|---|
Neural Network | 28.82 | 0.96 |
Random Forest Regressor | 33.51 | 0.91 |
Decision Tree Regressor | 37.17 | 0.86 |
Gradient Boosting Regressor | 57.84 | 0.55 |
AdaBoost Regressor | 67.17 | 0.44 |
Linear Regression | ~82.00 | ~0.18 |
Ridge Regression | ~82.00 | ~0.18 |
Lasso Regression | ~82.00 | ~0.18 |
Support Vector Regression | 88.39 | 0.13 |
KNeighbors Regressor | 54.32 | 0.57 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Shahcheraghian, A.; Ilinca, A. Advanced Machine Learning Techniques for Energy Consumption Analysis and Optimization at UBC Campus: Correlations with Meteorological Variables. Energies 2024, 17, 4714. https://doi.org/10.3390/en17184714