Article

Research on Building Energy Consumption Prediction Based on Improved PSO Fusion LSSVM Model

by Suli Zhang 1,*, Yiting Chang 2, Hui Li 3 and Guanghao You 2
1 School of Computer Technology and Engineering, Changchun Institute of Technology, Changchun 130012, China
2 School of Energy and Power Engineering, Changchun Institute of Technology, Changchun 130012, China
3 School of Information and Control Engineering, Jilin Institute of Chemical Technology, Jilin 132022, China
* Author to whom correspondence should be addressed.
Energies 2024, 17(17), 4329; https://doi.org/10.3390/en17174329
Submission received: 12 July 2024 / Revised: 13 August 2024 / Accepted: 28 August 2024 / Published: 29 August 2024
(This article belongs to the Section G: Energy and Buildings)

Abstract

In urban building management, accurate prediction of building energy consumption is important for realizing energy conservation and improving energy efficiency. Due to the complexity and variability of energy consumption data, existing prediction models face the challenge of difficult parameter selection, which directly affects their accuracy and application. To solve this problem, this study proposes an improved particle swarm optimization algorithm (IPSO) for optimizing the parameters of the least squares support vector machine (LSSVM) and constructs an energy consumption prediction model based on IPSO-LSSVM. The model combines the advantages of LSSVM in nonlinear fitting and generalization ability and uses the IPSO algorithm to adjust the parameters precisely. The sample data characteristics are analyzed, and the model is validated on two different types of building energy consumption datasets. The results show that, compared with traditional baseline models such as back-propagation neural networks (BP) and support vector regression (SVR), the proposed model is more accurate and efficient in parameter selection and significantly reduces the prediction error rate. The improved approach not only increases the accuracy of building energy consumption prediction but also enhances the robustness and adaptability of the model. It thereby provides reliable methodological support for developing more effective energy-saving strategies and optimizing energy use, and offers a new solution for the future management of building energy consumption.

1. Introduction

Currently, major challenges such as the global energy crisis, climate change, and environmental degradation have prompted countries to strengthen their research on energy conservation and emission reduction [1]. The International Energy Agency (IEA) reports that buildings use about 30% to 40% of the world’s energy [2,3,4], and improving building energy efficiency not only reduces the negative impact on the environment but also significantly lowers energy costs and improves economic efficiency [5]. Improving the energy efficiency of buildings is therefore considered a key solution to these challenges. High energy efficiency in buildings is mainly characterized by reduced energy consumption, lower operating costs, and improved comfort and productivity. It is closely linked to the advancement of modern technologies; in particular, the digital transformation of the building industry provides new opportunities to improve energy efficiency. Digital transformation includes the use of advanced technologies and systems to improve building design, construction, and operations. For example, the low-cost Arduino-based transmittance meter (HEAT), which uses non-contact temperature sensors and IoT protocols to record data, performed well in wall and ceiling measurements, underscoring the importance of accurate U-values. This technology not only improves the accuracy of building U-value assessments but also provides strong support for optimizing building energy efficiency management [6,7].
In addition, digital systems such as Building Information Modeling (BIM) can integrate these data to provide more detailed information and optimize methodologies for building management. BIM enables a more comprehensive analysis and optimization of building design and operations by integrating data from multiple sources [8,9]. However, accurately forecasting energy consumption in buildings remains challenging due to the fluctuating nature of operational activities and varying environmental conditions [10]. The development of models that can capture these variations in real time and provide information for building management becomes crucial [11,12]. In this context, building energy consumption forecasting can be divided into ultra-short-term, short-term, medium-term, and long-term predictions according to the time horizon. Among these, short-term predictions emphasize the strong link to daily energy system operation patterns, offering users cost-effective energy-saving strategies and actionable advice [13]. Using the outcomes of these short-term predictions, the operational approach of future energy systems in buildings can be adjusted to ensure more efficient resource allocation [14], which further supports the realization of the development goal of energy conservation and emission reduction.
Researchers have proposed various machine learning models for energy consumption prediction and explored the application of these models in optimizing energy consumption in buildings. Priyadarshini et al. [15], Shao et al. [16], and Rahman et al. [17] applied machine learning methods, including integrated models, support vector regression, and Martens distance, within the domain of energy consumption forecasting. Despite the theoretically powerful predictive capabilities of these techniques, practical applications generally face challenges in data quality, model adaptation, and generalization capabilities. Zhang et al. [18], Alvin B. Culaba et al. [19], and Olu-Ajayi et al. [20] investigated the weighted support vector regression (SVR) model, the integration of K-means with SVR, and the use of deep neural networks for predicting building energy consumption. Although these models show certain advantages in building energy consumption prediction, practical application still faces problems such as long training time, poor regional adaptability, and insufficient model generalization ability. Elbeltagi and Wefki [21] and the studies in [22] and [23] further investigated artificial neural networks (ANNs) based on parametric modeling techniques, occupancy-based deep learning methods, and machine learning techniques for multi-building energy consumption prediction. These works showed the advantages and disadvantages of different methods in terms of accuracy and efficiency, but also exposed limitations such as the lack of real data validation and the limitation of geographic information data. Hosseini and Farad [24] and Dinmohammadi et al. [25] analyzed the performance of decision trees, random forests, K-nearest neighbors, and stacked models in energy consumption prediction, and found that the random forest model is the best in prediction accuracy, but the stacked model has an advantage in accuracy; these studies did not delve into feature selection and dimensionality reduction.
The IPSO method proposed by Li [26] improves the convergence performance and global search efficiency of the algorithm by dynamically adjusting the subpopulation size and combining a concentration adjustment mechanism with adaptive search range adjustment. Sun [27] combines IPSO with the simulated annealing algorithm to enable a collaborative search that effectively prevents IPSO from falling into a local optimum and steers the search toward the global optimum. The IPSO variant proposed by Zhao [28] employs adaptive inertia weights to balance global exploration with local exploitation, which greatly enhances the algorithm’s overall performance and convergence efficiency. In the context of microgrids, the study in [29] discusses PV power generation prediction using LSSVM based on the tunicate swarm algorithm (TSA), which is robust to noisy data and effective in capturing complex patterns but is computationally demanding and requires hyperparameter tuning. Similarly, TSA-MLPNN, also based on the tunicate swarm algorithm, is flexible in dealing with nonlinear relationships and uncertainty, but its computational requirements and overfitting problems are more prominent. The sparse learning machine based on LSSVM proposed by Zhang et al. [30] exploits the sparsity property but is very sensitive to the characteristic parameters and requires fine-tuning to improve the prediction performance, which also increases the prediction time cost.
These studies show the wide application of machine learning in building energy consumption prediction. However, existing studies suffer from insufficient in-depth discussion of data quality, poor model adaptation under new environmental changes, and model generalization under different geographical and seasonal conditions. To fill these research gaps, this study proposes a new approach that integrates data quality, model adaptability, and generalizability. The method introduces the IPSO optimization algorithm into the parameter selection and adjustment process of LSSVM to enhance the generalization ability and learning performance of the LSSVM model and adopts the mean absolute error and the mean square error as the quantitative evaluation index of the model. Meanwhile, the study also compares the performance of baseline models such as Random Forest, Support Vector Regression, and Backpropagation Neural Network with the IPSO-LSSVM model. The results show that the building energy consumption prediction model based on IPSO-LSSVM reduces the error rate and improves the fitting ability of the model.

2. Prediction Model Based on IPSO-LSSVM

2.1. LSSVM Method

LSSVM [31] is an improved machine learning algorithm designed to optimize the performance of traditional support vector machines. Compared with SVM, LSSVM demonstrates superior performance in addressing the issue of high computational complexity. By replacing the inequality constraint in SVM with an equality constraint and transforming the loss function into the sum of squared errors, LSSVM simplifies the quadratic programming problem of SVM to the solution of linear equations, thereby improving solution speed. Additionally, LSSVM has the potential to handle large-scale problems effectively and can address both linear and nonlinear multivariate calibration problems with relatively high accuracy and generalization ability.

2.2. PSO Method

As a swarm intelligence optimization technique, PSO initializes a group of random particles and gradually converges on the best solution through iterative search. The algorithm has been widely studied and applied because of its few adjustable parameters, relatively simple implementation, fast convergence, and high accuracy in population-based optimization. In the PSO algorithm, the position vector of the i-th particle in generation t is denoted as $x_i$, its velocity vector as $v_i$, its historical optimal position as $p_{best}$, and the global optimal position as $g_{best}$. The velocity of a particle at each iteration is influenced by three factors: its velocity in the previous generation $v_{prev,i}$, the particle’s historical optimal value $p_{best,i}$, and the global historical optimal value $g_{best,i}$ [32].
In the entire population, each generation of particles updates its speed and position according to the Formulas (1) and (2), eventually finding the optimal value.
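$$
V_{id}^{t+1} = \omega^{t} V_{id}^{t} + c_1 r_1 \left(P_{id}^{t} - X_{id}^{t}\right) + c_2 r_2 \left(P_{gd}^{t} - X_{id}^{t}\right) \tag{1}
$$
$$
X_{id}^{t+1} = X_{id}^{t} + V_{id}^{t+1} \tag{2}
$$
where $V_{id}^{t}$ represents the velocity vector of the i-th particle in generation t, $\omega^{t}$ is the inertia weight, $c_1$ and $c_2$ are the learning factors, and $r_1$ and $r_2$ are random numbers (typically drawn uniformly from [0, 1]) that add randomness to the search. $X_{id}^{t}$ is the current position vector of particle i in generation t, $P_{id}^{t}$ and $P_{gd}^{t}$ are the particle’s historical optimal position and the global optimal position, and $V_{id}^{t+1}$ is the velocity vector of particle i in generation t + 1.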
The simplified principle of the particle swarm optimization algorithm is shown in Figure 1.
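The update rules in Formulas (1) and (2) can be illustrated with a minimal NumPy sketch. The test function (`sphere`), the swarm size, and the fixed values of `w`, `c1`, and `c2` are illustrative assumptions, not settings taken from this study.

```python
import numpy as np

def pso_step(X, V, P, g, w, c1, c2, rng):
    """One PSO update: Formula (1) for velocity, Formula (2) for position."""
    r1 = rng.random(X.shape)  # random numbers in [0, 1)
    r2 = rng.random(X.shape)
    V_new = w * V + c1 * r1 * (P - X) + c2 * r2 * (g - X)
    X_new = X + V_new
    return X_new, V_new

def sphere(X):
    """Hypothetical objective for the demo: sum of squares, minimum at the origin."""
    return np.sum(X ** 2, axis=1)

rng = np.random.default_rng(0)
n_particles, dim = 20, 2                  # assumed swarm size and dimension
X = rng.uniform(-5.0, 5.0, size=(n_particles, dim))
V = np.zeros_like(X)
P = X.copy()                              # personal best positions
P_val = sphere(P)                         # personal best values
g = P[np.argmin(P_val)].copy()            # global best position
initial_best = P_val.min()

for t in range(100):
    X, V = pso_step(X, V, P, g, w=0.7, c1=1.5, c2=1.5, rng=rng)
    val = sphere(X)
    improved = val < P_val                # particles that beat their own best
    P[improved] = X[improved]
    P_val[improved] = val[improved]
    g = P[np.argmin(P_val)].copy()

final_best = P_val.min()
```

Because personal bests are only overwritten when a particle improves, `final_best` can never exceed `initial_best`; how close it gets to the true optimum depends on the assumed parameter values.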

2.3. IPSO Method

PSO has the advantages of a simple parameter setting and fast convergence speed in function optimization. However, due to the complexity of the problem itself, PSO can easily fall into a local optimum in the later stage of the iterative process, leading to insufficient population diversity. To overcome these limitations while preserving its advantages, researchers have proposed various improved PSO algorithms (see Reference [33]) or integrated PSO with other intelligent algorithms (see References [34,35]). To address the issue of PSO algorithms easily falling into local optima, an Improved Particle Swarm Optimization (IPSO) algorithm is proposed, focusing on improvements in the following two aspects:
  • Convergence factor is introduced
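By adjusting the updating rules of particles or introducing new strategies, the convergence speed and stability of the algorithm can be improved. Specifically, Formula (1) is modified as follows:
$$
V_{id}^{t+1} = \chi \left[ \omega^{t} V_{id}^{t} + c_1 r_1 \left(P_{id}^{t} - X_{id}^{t}\right) + c_2 r_2 \left(P_{gd}^{t} - X_{id}^{t}\right) \right] \tag{3}
$$
where $\chi = \dfrac{2}{\left| 2 - \varphi - \sqrt{\varphi^{2} - 4\varphi} \right|}$, and the convergence factor $\varphi = c_1 + c_2 > 4$.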
Experimental findings indicate that the IPSO algorithm, enhanced with a convergence factor, outperforms the version that relies solely on improved weight. This improvement ensures that in the early stage of optimization, the particle swarm can quickly and effectively converge to the global optimal solution, while maintaining efficiency and diversity when exploring the solution space, effectively avoiding falling into the local optimum problem.
  • Dynamic adjustment of inertia weight and learning constant
In the early stage of the algorithm, a higher inertia weight τ is conducive to enhancing the global search capability, while in the later stage, a lower inertia weight τ is conducive to improving the local search efficiency. Here τ is the inertia weight written as $\omega^{t}$ in Formula (3). Therefore, an inertia weight adjustment scheme based on a parabola is designed:
$$
\tau = \left(\tau_{max} - \tau_{min}\right)\left(\frac{k}{k_{max}}\right)^{2} - \left(\tau_{max} - \tau_{min}\right)\frac{2k}{k_{max}} + \tau_{max} \tag{4}
$$
where $\tau_{max}$ and $\tau_{min}$ are the upper and lower limits of the inertia weight, respectively, usually with values of 0.9 and 0.4; $k$ is the current number of iterations; and $k_{max}$ is the predetermined maximum number of iterations. With this form, $\tau = \tau_{max}$ at $k = 0$ and $\tau$ decreases to $\tau_{min}$ at $k = k_{max}$.
The learning factors C 1 and C 2 influence the ability to recognize the optimal direction and the ability to respond to peer influence, respectively. In the initial stage of the search, increasing C 1 strengthens the global search ability, while decreasing C 2 limits its influence. Conversely, near the end of the optimization process, decreasing C 1 and increasing C 2 helps the particle quickly locate the global optimal solution. Based on this, a method for dynamically adjusting the learning factors is proposed, as shown in Formulas (5) and (6):
$$
C_1 = \left(C_{min} - C_{max}\right)\frac{k}{k_{max}} + C_{max} \tag{5}
$$
$$
C_2 = \left(C_{max} - C_{min}\right)\frac{k}{k_{max}} + C_{min} \tag{6}
$$
These adjustments ensure that the algorithm effectively balances the ability to explore and utilize known information at different stages of optimization, thereby improving the efficiency of global searching and avoiding falling into local optimal solutions.
The core idea of the improved IPSO is to conduct an initial search using the standard PSO strategy while ensuring a uniform distribution of the initial particle population. As the particle population gradually converges early, the distribution of particles in the solution space is readjusted to effectively avoid local optimization and accelerate the convergence rate. This approach enhances performance in parameter selection and adjustment processes.
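The two improvements described in this section can be sketched as three small functions. This is a sketch under stated assumptions: the inertia weight is taken to be a parabola that equals τ max at k = 0 and τ min at k = k max, and the bounds `c_min = 1.55` and `c_max = 2.55` in the usage lines are hypothetical values chosen only so that C 1 + C 2 exceeds 4.

```python
import math

def constriction_factor(c1, c2):
    """Convergence (constriction) coefficient; requires phi = c1 + c2 > 4."""
    phi = c1 + c2
    if phi <= 4.0:
        raise ValueError("phi = c1 + c2 must be greater than 4")
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))

def inertia_weight(k, k_max, tau_max=0.9, tau_min=0.4):
    """Parabolic schedule: tau_max at k = 0, decreasing to tau_min at k = k_max."""
    s = k / k_max
    span = tau_max - tau_min
    return span * s ** 2 - 2.0 * span * s + tau_max

def learning_factors(k, k_max, c_min, c_max):
    """Linear schedules: C1 falls from c_max to c_min, C2 rises from c_min to c_max."""
    s = k / k_max
    c1 = (c_min - c_max) * s + c_max
    c2 = (c_max - c_min) * s + c_min
    return c1, c2

# Usage with assumed bounds (hypothetical values, not from the study).
chi = constriction_factor(2.05, 2.05)
tau_start = inertia_weight(0, 100)
tau_end = inertia_weight(100, 100)
c1_start, c2_start = learning_factors(0, 100, c_min=1.55, c_max=2.55)
c1_end, c2_end = learning_factors(100, 100, c_min=1.55, c_max=2.55)
```

Note that with these linear schedules the sum C 1 + C 2 stays constant at c_min + c_max, so the convergence coefficient only needs to be computed once per run.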

2.4. IPSO-LSSVM Method

The regression principle of LSSVM is as follows: Given a set of training samples, each sample contains the input vector $X_i$ and the corresponding output value $Y_i$. Through the nonlinear mapping function $\varphi(x)$, the input vector is mapped into a high-dimensional space, where linear regression analysis can be performed:
$$
y(x) = \omega^{T} \varphi(x) + b \tag{7}
$$
where $y(x)$ is the predicted output value; $\omega$ is the weight vector; $\varphi(x)$ represents the nonlinear mapping function; and $b$ is the offset. Solving for these parameters translates into solving an optimization problem.
The optimization objective function and constraints of LSSVM are as follows:
$$
\min J(\omega, \xi) = \frac{1}{2}\omega^{T}\omega + \frac{1}{2}\gamma \sum_{i=1}^{n} \xi_i^{2} \tag{8}
$$
$$
y_i = \omega^{T}\varphi(x_i) + b + \xi_i, \quad i = 1, 2, \ldots, n \tag{9}
$$
where $\xi \in R^{n \times 1}$ is the error vector; $\gamma$ is the regularization parameter; $y_i$ is the actual output value of training sample i; $\varphi(x_i)$ is the high-dimensional feature vector of the input $x_i$ obtained by the nonlinear mapping function $\varphi$; and $\xi_i$ is the relaxation variable.
By introducing the Lagrange multiplier factor λ , Equation (8) is converted to:
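$$
L(\omega, \xi, \lambda, b) = J(\omega, \xi) - \sum_{i=1}^{n} \lambda_i \left[ \omega^{T}\varphi(x_i) + b + \xi_i - y_i \right] \tag{10}
$$
Following the KKT optimality conditions, the partial derivatives of $L$ in Equation (10) with respect to the four parameters $\omega$, $\xi$, $\lambda$, and $b$ are set to zero:
$$
\begin{cases}
\dfrac{\partial L}{\partial \omega} = \omega - \sum_{i=1}^{n} \lambda_i \varphi(x_i) = 0 \\
\dfrac{\partial L}{\partial \xi_i} = \gamma \xi_i - \lambda_i = 0 \\
\dfrac{\partial L}{\partial \lambda_i} = y_i - \omega^{T}\varphi(x_i) - b - \xi_i = 0 \\
\dfrac{\partial L}{\partial b} = -\sum_{i=1}^{n} \lambda_i = 0
\end{cases} \tag{11}
$$
where $\gamma$ acts as the penalty factor on the relaxation variables $\xi_i$.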
The linear equations are obtained:
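$$
\begin{bmatrix} 0 & E^{T} \\ E & K + I/\gamma \end{bmatrix}
\begin{bmatrix} b \\ \lambda \end{bmatrix} =
\begin{bmatrix} 0 \\ Y \end{bmatrix} \tag{12}
$$
$$
Y = \left[ y_1, y_2, \ldots, y_n \right]^{T} \tag{13}
$$
where $b$ is the offset, $\lambda = [\lambda_1, \ldots, \lambda_n]^{T}$ is the vector of Lagrange multipliers, $Y$ is the vector of actual output values, $E = [1, \ldots, 1]^{T}$ is the n-dimensional vector of ones, $I$ is the identity matrix, and $K$ is the kernel matrix with entries $K_{ij} = \varphi(x_i)^{T}\varphi(x_j)$.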
By using the kernel function $K(x_i, x_j) = \varphi(x_i)^{T}\varphi(x_j)$ to replace the complex dot product operation in the high-dimensional feature space, the prediction model of LSSVM can be obtained as follows:
$$
y(x) = \sum_{i=1}^{n} \lambda_i K(x, x_i) + b \tag{14}
$$
where λ i is the model parameter.
The LSSVM model structure is shown in Figure 2. Its performance is affected by the configuration of hyperparameters, especially the choice of kernel function and its parameters. The soft margin constant and loss parameter are also key factors. Therefore, when establishing the LSSVM model, it is essential to select and adjust these hyperparameters, especially the regularization parameter γ and the kernel function parameter σ . Only by accurate optimization of these parameters can its predictive performance be fully exploited.
Common kernel functions in SVM include the linear, polynomial, Gaussian radial basis (RBF) [36], and sigmoid kernels. When selecting kernel functions, the RBF kernel is widely favored due to its simple parameters, ease of adjustment, and excellent nonlinear mapping ability. Therefore, in constructing the model in this paper, the RBF function is chosen as the kernel function, and its form is as follows:
$$
K(x_i, x_j) = \exp\left( -\frac{\left\| x_i - x_j \right\|^{2}}{2\sigma^{2}} \right) \tag{15}
$$
In the formula, $x_i$ and $x_j$ are input samples, and $\| x_i - x_j \|$ indicates the Euclidean distance between the input samples.
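The LSSVM training step reduces to solving one linear system for the offset b and the multipliers λ, after which predictions are kernel expansions over the training samples. The following is a minimal NumPy sketch of that idea, not the authors' implementation; the function names, the toy sine data, the values `gamma = 100` and `sigma = 1`, and the Gaussian form with 2σ² in the denominator are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix between rows of A and rows of B (assumed 2*sigma^2 form)."""
    d2 = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the bordered linear system [[0, 1^T], [1, K + I/gamma]] [b; lam] = [0; y]."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]            # offset b, multipliers lam

def lssvm_predict(X_train, b, lam, sigma, X_new):
    """Prediction: y(x) = sum_i lam_i * K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ lam + b

# Toy data (hypothetical): one smooth input-output relationship.
X = np.linspace(0.0, 2.0 * np.pi, 40)[:, None]
y = np.sin(X[:, 0])
gamma, sigma = 100.0, 1.0             # illustrative values only
b, lam = lssvm_fit(X, y, gamma, sigma)
pred = lssvm_predict(X, b, lam, sigma, X)
train_mse = float(np.mean((y - pred) ** 2))
```

The first row of the system enforces that the multipliers sum to zero, and each remaining row states that the fitted value plus the error term λ i / γ reproduces the training target.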
When using LSSVM for prediction, traditional grid search and cross-validation methods are often employed to determine the parameters. However, these methods not only consume considerable time but also tend to find local optimal solutions rather than global ones. To overcome the limitations of traditional optimization methods in tackling complex problems, researchers introduced a series of meta-heuristic algorithms. In this paper, the improved Particle Swarm Optimization (PSO) algorithm is used to fine-tune the regularization parameter γ and the kernel function width coefficient σ of the LSSVM. By enhancing the PSO particle update rule, the Improved PSO (IPSO) algorithm improves the convergence speed and stability, enabling it to identify the optimal parameter combination that minimizes the objective function. This optimization enhances the prediction performance of the LSSVM model, achieving the best possible prediction outcomes.
The objective function of the optimization problem is:
$$
\min f(\gamma, \sigma^{2}) = \sum_{i=1}^{n} \left( y_i - y_{fi} \right)^{2} \tag{16}
$$
where $y_i$ is the actual value and $y_{fi}$ is the predicted value of the model.
The IPSO algorithm optimization process is shown as follows:
  • Begin.
  • Configure the basic parameters of the IPSO algorithm, such as the velocity and position of the particles.
  • The particle swarm’s velocity and position are initialized, and both the initial global and local optimal solutions are identified.
  • Calculate each particle’s fitness value within the swarm.
  • Update particle velocities and positions using the improved formula.
  • Refresh the local (pbest) and global (gbest) optimal solutions within the swarm.
  • Check the termination conditions.
  • Output the global optimal solution (gbest) of the particle swarm.
  • End.
Figure 3 shows the IPSO-LSSVM model architecture:
First, anomaly detection and feature analysis are performed on the data, and the parameters of the IPSO algorithm are initialized. The LSSVM parameters to be tuned correspond to individuals (particles) in the IPSO algorithm. Using these initial parameters, the LSSVM model performs forecasting, and the errors and fitness values are calculated. Subsequently, based on the current optimal position, the velocities and positions of the particles are adjusted to form a new particle swarm. Using the parameters from this new particle swarm, the LSSVM model is employed again for forecasting, and new errors and fitness values are calculated. Next, the new fitness value is compared with the currently recorded optimal solution. If the new fitness value is better, the optimal solution and its corresponding position are updated; otherwise, the current optimal solution is retained. These steps are repeated to continuously update the velocity and position of the particles, generate a new particle swarm, and evaluate its fitness until the termination condition is satisfied. After the iteration terminates, the current optimal solution and its fitness value are output. The optimal solution is then applied to the LSSVM model for energy consumption prediction.
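The workflow above can be sketched end to end as follows. This is an illustrative sketch, not the authors' code: the synthetic data, the search bounds for (γ, σ), the swarm size, the iteration count, the learning-factor bounds, the clipping of positions to the bounds, and the use of a held-out validation split to compute the fitness of Formula (16) are all assumptions.

```python
import numpy as np

def rbf(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_val_sse(params, Xtr, ytr, Xva, yva):
    """Fitness: sum of squared validation errors of an LSSVM with (gamma, sigma)."""
    gamma, sigma = params
    n = len(ytr)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf(Xtr, Xtr, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], ytr)))
    pred = rbf(Xva, Xtr, sigma) @ sol[1:] + sol[0]
    return float(np.sum((yva - pred) ** 2))

def ipso(fitness, lo, hi, n_particles=12, k_max=30, c_min=1.55, c_max=2.55,
         tau_max=0.9, tau_min=0.4, seed=0):
    """IPSO sketch: constriction coefficient plus scheduled inertia and learning factors."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    X = rng.uniform(lo, hi, size=(n_particles, lo.size))   # initial positions
    V = np.zeros_like(X)
    P = X.copy()                                           # personal bests
    P_fit = np.array([fitness(x) for x in X])
    g_idx = P_fit.argmin()
    g, g_fit = P[g_idx].copy(), P_fit[g_idx]               # global best
    phi = c_min + c_max                                    # constant, must exceed 4
    chi = 2.0 / abs(2.0 - phi - np.sqrt(phi ** 2 - 4.0 * phi))
    history = [g_fit]
    for k in range(1, k_max + 1):                          # fixed iteration budget
        s = k / k_max
        tau = (tau_max - tau_min) * s ** 2 - 2.0 * (tau_max - tau_min) * s + tau_max
        c1 = (c_min - c_max) * s + c_max
        c2 = (c_max - c_min) * s + c_min
        r1, r2 = rng.random(X.shape), rng.random(X.shape)
        V = chi * (tau * V + c1 * r1 * (P - X) + c2 * r2 * (g - X))
        X = np.clip(X + V, lo, hi)                         # assumption: keep within bounds
        fit = np.array([fitness(x) for x in X])
        better = fit < P_fit
        P[better], P_fit[better] = X[better], fit[better]
        if P_fit.min() < g_fit:
            g_idx = P_fit.argmin()
            g, g_fit = P[g_idx].copy(), P_fit[g_idx]
        history.append(g_fit)
    return g, g_fit, history

# Synthetic demo data (hypothetical), split into training and validation parts.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 6.0, size=(60, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=60)
Xtr, ytr, Xva, yva = X[:40], y[:40], X[40:], y[40:]

best, best_sse, history = ipso(lambda p: lssvm_val_sse(p, Xtr, ytr, Xva, yva),
                               lo=[0.1, 0.05], hi=[1000.0, 5.0])
```

`best` holds the selected (γ, σ) pair and `history` records the best fitness after each iteration, which is non-increasing by construction. Scoring particles on held-out data rather than on the training samples is a design choice of this sketch to discourage overfitting.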

3. Experimental Analysis

3.1. Experimental Data

3.1.1. Office Building

An office building in the Xiamen area of China was selected for this study. The building is located in a hot-summer and warm-winter climate zone, with a site area of 83,056 m2 and a total floor area of 132,798 m2. The office building has been constructed and is in operation. It is equipped with an efficient building energy monitoring platform that collects detailed energy consumption data online in real time, including power, cooling, and heating energy usage. The data collection process is realized through sensors and data logging equipment, and the data are uploaded in real time to a central database for subsequent analysis.
In order to ensure the reliability and authenticity of the data, the historical energy consumption data provided by the platform were selected as the sample for analysis in this study, which covered the building’s energy consumption information at different time periods, including daily energy consumption records and environmental parameters. The data were cleaned by removing outliers and noise during preprocessing and using standardized methods. These processes ensure the accuracy and consistency of the data. In the research analysis, these high-quality historical data were used to train and validate the energy consumption prediction model, which improved the prediction accuracy of the model and reduced the possible interfering factors.
In this study, IPSO is introduced to adjust the parameters of the radial basis function (RBF) in the LSSVM model. Before optimization, the data are preprocessed, and the parameter with the minimum fitness value is selected as the final optimization parameter. After optimization, the optimal parameter values of the IPSO-LSSVM model based on RBF are γ = 2.263 and σ = 0.205, respectively. The experiment employs the daily energy usage data of office buildings from January 2020 to December 2022, along with five related environmental variables: time, temperature, humidity, precipitation, and wind speed. Through comprehensive study and analysis of 1095 data points over three years, the task of predicting building energy consumption has been accomplished. In this case, some sample example data for office buildings are shown in Table 1, and the energy consumption sample data are shown in Figure 4.

3.1.2. Education Building

In order to verify the generalizability of the energy consumption prediction model in different building types, this study chooses a university in Changchun, China, as a new empirical research object. The campus covers a total area of more than 800,000 m2, including teaching buildings, laboratory buildings, libraries, gymnasiums, swimming pools, and student cultural activity centers, with a total building area of 341,000 m2. The region has four distinct seasons, with shorter and more comfortable summers and long and cold winters. The experiment selected 760 day-by-day historical energy consumption data from April 2018 to April 2020 for validation, in which some sample example data of educational buildings are shown in Table 2, and the sample data are shown in Figure 5.

3.2. Feature Selection

Building energy consumption is influenced by multiple factors such as external climate, architectural layout, envelope structure, functional use, and human operation. For the built public buildings, their planning, structure, and materials have been determined and have less impact on energy consumption. However, external climate conditions and urban microclimates significantly affect the built environment, thereby influencing energy consumption patterns. During the model training process, it was found that weather data had the greatest impact on the building energy consumption prediction results. The significance of this parameter was verified by using weather data from the past three years and comparing the predicted results from different years. In addition, changes in data types (e.g., weekdays, holidays, etc.) significantly affected the internal heat gain, which altered the energy consumption pattern of the air conditioning system. In order to ensure the accuracy of these parameters, a detailed study of the usage pattern of the actual building was conducted and adjusted accordingly in the model.
After screening the variables by stepwise regression, four key features were identified: maximum temperature, weather, humidity, and date type. The other variables can be substituted and explained by these four features, which simplifies the model and highlights the main influencing factors. The data contain two categorical variables, date type and weather conditions, which were converted into numerical form by one-hot encoding, as shown in Table 3. To keep the prediction model accurate and to avoid overfitting, these four features are used as input variables, and the corresponding building energy consumption is used as the output to establish a combined prediction model.
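As an illustration of the encoding step, a categorical variable can be turned into indicator columns as in the sketch below. The category names are hypothetical placeholders; the actual categories and codes are those listed in Table 3.

```python
def one_hot(values, categories):
    """Map each value to a 0/1 indicator row, one column per category."""
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1          # raises KeyError for an unknown category
        rows.append(row)
    return rows

# Hypothetical date-type categories, for illustration only.
date_types = ["workday", "weekend", "holiday"]
encoded = one_hot(["workday", "holiday", "weekend"], date_types)
# encoded == [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
```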

3.3. Evaluation Index

To assess the model’s performance, this paper uses the mean absolute error (MAE), mean squared error (MSE), and coefficient of determination (R2). MAE is the mean absolute difference between predicted and actual values and gives a direct indication of the average prediction error. MSE is the average squared difference between predicted and actual values and penalizes large errors more heavily. R2 measures how well the model explains the variance of the target variable; it usually lies between 0 and 1 (it can be negative when a model fits worse than predicting the mean), and values nearer to 1 indicate better fit and explanatory capability. The mathematical formulas for these assessment criteria are as follows:
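$$
MAE = \frac{1}{n} \sum_{i=1}^{n} \left| Y_i - \hat{Y}_i \right| \tag{17}
$$
$$
MSE = \frac{1}{n} \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^{2} \tag{18}
$$
$$
R^{2} = 1 - \frac{\sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^{2}}{\sum_{i=1}^{n} \left( Y_i - \bar{Y} \right)^{2}} \tag{19}
$$
where n is the number of samples, $Y_i$ is the ith actual value, $\hat{Y}_i$ is the ith predicted value, and $\bar{Y}$ is the average of the actual values.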
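The three criteria can be computed directly from their definitions, as in this short NumPy sketch. The four actual and predicted values are made up solely to show the arithmetic.

```python
import numpy as np

def mae(y, y_hat):
    return float(np.mean(np.abs(y - y_hat)))

def mse(y, y_hat):
    return float(np.mean((y - y_hat) ** 2))

def r2(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)        # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)   # total sum of squares
    return float(1.0 - ss_res / ss_tot)

# Made-up values for illustration only.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.0, 2.5, 4.0])

print(mae(y_true, y_pred))  # 0.25
print(mse(y_true, y_pred))  # 0.125
print(r2(y_true, y_pred))   # 0.9
```

Here the errors are -0.5, 0, 0.5, and 0, so the residual sum of squares is 0.5 against a total sum of squares of 5, which gives R2 = 0.9.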

3.4. Experimental Results and Analysis

3.4.1. Analysis of Results for Office Building

Based on the proposed IPSO-LSSVM modeling approach, the day-by-day historical energy consumption data of this office building from January 2020 to December 2022, totaling 1076 records, were selected for validating the model performance. The combined energy consumption prediction model was constructed by using 70% (approximately 700 sets) of these data as a training set and the remaining 30% (396 sets) as a test set. To evaluate the performance of the combined model more comprehensively, RF, SVR, BP, and LSSVM energy consumption models, as well as the PSO-LSSVM model, were also constructed as controls. Each of these models has its own characteristics and advantages, which helps to provide a comprehensive comparison. RF and SVR are suitable for dealing with complex nonlinear relationships; BP is a back-propagation neural network; LSSVM is suitable for small-sample data; and PSO-LSSVM incorporates particle swarm optimization to enhance the performance of LSSVM. The day-by-day predictions of these six models are compared with the actual values, as shown in Figure 6.
From the fit of the overall trend of the models in Figure 6, the BP, RF, and SVR models are able to follow the real energy consumption trend in most cases; however, in the violently fluctuating intervals with the sample number between 150 and 250, the BP model shows a more obvious hysteresis, while the RF model has a tendency to be overly smoothed, and the predicted values of the SVR model are low and insufficiently fluctuating with respect to the real values. In contrast, the LSSVM, PSO-LSSVM, and IPSO-LSSVM models provide a better fit between the predicted curve and the true curve when dealing with these intervals of severe fluctuations in energy consumption; in particular, the PSO-LSSVM model is optimized by the PSO algorithm for the LSSVM model, which further improves the overall fit over the LSSVM model. The IPSO-LSSVM model, on the other hand, further enhances its ability to handle complex problems through the IPSO optimization algorithm, which further improves its overall fit compared to the PSO-LSSVM model.
In terms of the model’s ability to capture localized fluctuations, the true value shows a significant drop at a sample number of about 60, while the BP model shows only slight fluctuations. The RF model fails to accurately predict the fluctuations in the sample number intervals of 50 to 100 and 200 to 250. Similarly, the SVR model fails to effectively capture the localized fluctuations around sample numbers 75 and 300. Although a gap remains between its predicted and true values, the LSSVM model is more sensitive in capturing the fluctuations than the BP and RF models. Both the PSO-LSSVM and IPSO-LSSVM models show high sensitivity to local nonlinear fluctuations, and the IPSO-LSSVM model is the more accurate of the two in predicting peak and valley values. This indicates that the improved IPSO optimization algorithm has significant advantages in improving the accuracy and stability of energy consumption prediction.
In addition to the overall trend fit and the ability to capture local fluctuations, the models are also compared in terms of quantitative performance metrics. Table 4 lists the prediction errors of the six models on the test sample data, while Figure 7 provides a visualization of the error evaluation metrics.
Table 4 and Figure 7 compare the performance of the office building energy consumption prediction models. The MAE and MSE of the IPSO-LSSVM model are 0.055 and 0.003, which are 20.3% and 40% lower, respectively, than those of the combined PSO-LSSVM model. Compared with the single prediction models, the MAE and MSE of the IPSO-LSSVM model are also significantly lower, indicating a smaller prediction error and better accuracy in building energy consumption prediction. The R² of the IPSO-LSSVM model is 0.940, an improvement of 0.033 over the second-best PSO-LSSVM model. This implies that the IPSO-LSSVM model explains more of the variation in the data and yields predictions that are closer to the actual situation.
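The three metrics compared in Table 4 can be computed as in the following minimal sketch. It assumes the standard definitions of MAE, MSE, and R² (coefficient of determination). The function name `regression_metrics` and the sample values are hypothetical and are not taken from the article's datasets.

```python
from typing import Sequence, Tuple


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (MAE, MSE, R^2) under the standard definitions (an assumption here)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n        # mean absolute error
    mse = sum(e * e for e in errors) / n         # mean squared error
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot
    return mae, mse, r2


# Hypothetical normalized values, for illustration only.
actual = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]
mae, mse, r2 = regression_metrics(actual, predicted)
print(f"MAE={mae:.3f}  MSE={mse:.4f}  R2={r2:.3f}")
```

Lower MAE and MSE and a higher R² indicate a better fit, which is the sense in which the models in Table 4 are ranked.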

3.4.2. Analysis of Results for Science and Education Building

A total of 760 daily historical energy consumption data points from April 2018 to April 2020 were selected for verification. For model verification, 70% (about 500 groups) of the samples were used as the training set, and 30% (about 260 groups) were used as the test set. The comparison between the daily predicted values and the actual values of the six models is shown in Figure 8, and the error is shown in Table 5.
The overall trend fit in Figure 8 shows that the BP and RF models generally capture the overall trend of energy consumption. However, when the actual energy consumption decreases sharply, the BP model predicts a smaller decrease than actually occurs, and the RF model shows significant high-frequency fluctuations between sample numbers 100 and 200. Although the SVR model captures the overall trend better, its predicted values are high and fail to reflect the downward trend in energy consumption after sample number 200. The LSSVM model has a better overall fit but fails to accurately capture the sharp change in actual values after the peak between sample numbers 150 and 200. In contrast, the PSO-LSSVM model significantly improves the overall trend fit relative to the LSSVM model, and its prediction curve is highly consistent with the actual values. The IPSO-LSSVM model performs best of all the models: its overall trend is almost exactly consistent with the actual values, and it captures the peaks in energy consumption very accurately.
In terms of the models' ability to capture localized fluctuations, the BP and RF models reproduce the actual fluctuations less faithfully. For example, between sample numbers 100 and 150, the BP model predicts relatively small fluctuations, while the RF model fails to reflect the smooth changes in actual values during the phases of rapid increase and decrease in energy consumption. The SVR model captures some of the fluctuations in the actual values, but its fluctuations are slightly too large in the peak period of energy consumption around sample number 150. The LSSVM model reacts more slowly in the phase of sharp changes in energy consumption between sample numbers 200 and 250, and its predicted values lag behind the actual values. In contrast, the PSO-LSSVM and IPSO-LSSVM models capture localized fluctuations well; in particular, the IPSO-LSSVM model predicts local fluctuations better in the downward trend after sample number 200.
For a more comprehensive assessment of the actual performance of these models, further reference can be made to the performance data in Table 5 and Figure 9.
The performance of the educational building energy prediction models is compared in Table 5 and Figure 9. The MAE of the IPSO-LSSVM model is 0.055, which is 49.1%, 54.2%, 45%, 40.2%, and 6.8% lower than that of the RF, SVR, BP, LSSVM, and PSO-LSSVM models, respectively. The MSE of the IPSO-LSSVM model is 0.005, which is 76.2%, 75%, 70.6%, 66.7%, and 16.7% lower than that of the RF, SVR, BP, LSSVM, and PSO-LSSVM models, respectively. In addition, the R² of the IPSO-LSSVM model is 0.912, which is 62.3%, 55.4%, 37.9%, 27.4%, and 3.3% higher than that of the RF, SVR, BP, LSSVM, and PSO-LSSVM models, respectively. These comparative results indicate that the IPSO-LSSVM model not only excels in predicting energy consumption for educational buildings but also has a smaller prediction error and is closer to the actual energy consumption data.
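The percentage reductions quoted above follow from the rounded values in Table 5. The sketch below recomputes the MAE and MSE reductions as (baseline − proposed) / baseline. The dictionary `TABLE_5` and the helper names are hypothetical, introduced only for illustration; the numbers are copied from Table 5.

```python
from typing import Dict

# MAE and MSE values copied from Table 5 (education building).
TABLE_5: Dict[str, Dict[str, float]] = {
    "RF": {"MAE": 0.108, "MSE": 0.021},
    "SVR": {"MAE": 0.120, "MSE": 0.020},
    "BP": {"MAE": 0.100, "MSE": 0.017},
    "LSSVM": {"MAE": 0.092, "MSE": 0.015},
    "PSO-LSSVM": {"MAE": 0.059, "MSE": 0.006},
    "IPSO-LSSVM": {"MAE": 0.055, "MSE": 0.005},
}


def relative_reduction(baseline: float, proposed: float) -> float:
    """Percentage by which `proposed` is lower than `baseline`."""
    return (baseline - proposed) / baseline * 100.0


def reductions_vs(metric: str, reference: str = "IPSO-LSSVM") -> Dict[str, float]:
    """Reduction of the reference model's error relative to every other model."""
    ref_value = TABLE_5[reference][metric]
    return {
        model: round(relative_reduction(values[metric], ref_value), 1)
        for model, values in TABLE_5.items()
        if model != reference
    }


for metric in ("MAE", "MSE"):
    print(metric, reductions_vs(metric))
```

Because the inputs are already rounded to three decimals, recomputed percentages can differ in the last digit from values derived from unrounded errors.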

4. Conclusions

This paper presents a building energy prediction method that combines IPSO and LSSVM to improve model accuracy and reliability. The key hypothesis of the study is that globally optimizing the parameters of the LSSVM model significantly enhances its ability to deal with nonlinear features and complex relationships, thus improving prediction accuracy and generalization ability. The effectiveness of the method is verified on two datasets. The results show that the IPSO-LSSVM model performs well in energy consumption prediction, with a significantly lower prediction error rate and a significantly higher fitting ability than the traditional BP and SVR models, and predicts building energy consumption more accurately. The main results of the study are as follows: (1) The global optimization of LSSVM parameters by IPSO significantly reduces the prediction error rate. (2) The IPSO-LSSVM model demonstrates high fitting ability and accurately captures the trend of the energy consumption data. (3) The method demonstrates strong generalization ability on different datasets.
The practical value of the research is mainly reflected in four aspects: (1) Improved prediction accuracy: more accurate predictions from the IPSO-LSSVM model help building managers identify potential energy waste and optimize energy use strategies, thus reducing operational costs and environmental impacts. (2) Improved building energy management: the method provides accurate data support, enabling the operating mode of the energy system to be adjusted based on real-time predictions, which improves energy efficiency and comfort. (3) Enhanced generalization ability: the method shows strong adaptability under different building and environmental conditions, which improves its flexibility and applicability in practice. (4) Technological progress in the building industry: the work demonstrates how to solve the building energy consumption prediction problem using advanced optimization algorithms and machine learning techniques, providing innovative ideas for future energy management systems.
The limitations of the study lie mainly in the following three aspects: (1) The complex optimization process and model structure of the IPSO-LSSVM model make it difficult to interpret intuitively, which restricts its acceptability and trustworthiness in practical applications. (2) Although the model performs well on offline data, it consumes large computational resources in real-time energy consumption prediction, so its computational complexity and response speed require further optimization. (3) The experimental data are mainly from historical datasets, and the model has not been validated on a large scale in real building energy management systems. Therefore, future research should focus on improving the interpretability of the IPSO-LSSVM algorithm in practical applications and on large-scale validation in actual building energy management systems to verify the practicality and effectiveness of the model.

Author Contributions

Conceptualization, S.Z., Y.C., H.L. and G.Y.; methodology, S.Z., Y.C., H.L. and G.Y.; software, S.Z., Y.C., H.L. and G.Y.; validation, S.Z., Y.C., H.L. and G.Y.; investigation, S.Z., Y.C., H.L. and G.Y.; data curation, S.Z., Y.C., H.L. and G.Y.; writing—original draft preparation, S.Z., Y.C., H.L. and G.Y.; writing—review and editing, S.Z., Y.C., H.L. and G.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Jilin Provincial Science and Technology Department Project Green Campus Electric Heating Intelligent Data Analysis Research and Application (project No. 20210203103SF).

Data Availability Statement

The data are not publicly available due to privacy concerns.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
RF: Random Forest
BP: Backpropagation
SVR: Support Vector Regression
PSO: Particle Swarm Optimization
IPSO: Improved Particle Swarm Optimization
LSSVM: Least Squares Support Vector Machine

References

  1. Lu, S.; Liu, Y.; Sun, Y.; Yin, S.; Jiang, X. Indoor thermal environmental evaluation of Chinese green building based on new index OTCP and subjective satisfaction. J. Clean. Prod. 2019, 240, 118151. [Google Scholar] [CrossRef]
  2. Li, Z.; Zhao, Y.; Xia, H.; Xie, S. A multi-objective optimization framework for building performance under climate change. J. Build. Eng. 2023, 80, 107978. [Google Scholar] [CrossRef]
  3. Xu, G.; Wang, W. China’s energy consumption in construction and building sectors: An outlook to 2100. Energy 2020, 195, 117045. [Google Scholar] [CrossRef]
  4. Wu, X.; Li, X.; Qin, Y.; Xu, W.; Liu, Y. Intelligent multiobjective optimization design for NZEBs in China: Four climatic regions. Appl. Energy 2023, 339, 120934. [Google Scholar] [CrossRef]
  5. Sun, Y.; Haghighat, F.; Fung, B.C.M. A review of the state-of-the-art in data-driven approaches for building energy prediction. Energy Build. 2020, 221, 110022. [Google Scholar] [CrossRef]
  6. Mobaraki, B.; Pascual, F.J.C.; Lozano-Galant, F.; Lozano-Galant, J.A.; Soriano, R.P. In situ U-value measurement of building envelopes through continuous low-cost monitoring. Case Stud. Therm. Eng. 2023, 43, 102778. [Google Scholar] [CrossRef]
  7. Mobaraki, B.; Pascual, F.J.C.; García, A.M.; Mascaraque, M.Á.M.; Vázquez, B.F.; Alonso, C. Studying the impacts of test condition and nonoptimal positioning of the sensors on the accuracy of the in-situ U-value measurement. Heliyon 2023, 9, e17282. [Google Scholar] [CrossRef]
  8. Piras, G.; Muzi, F.; Tiburcio, V.A. Digital Management Methodology for Building Production Optimization through Digital Twin and Artificial Intelligence Integration. Buildings 2024, 14, 2110. [Google Scholar] [CrossRef]
  9. Muzi, F.; Marzo, R.; Nardi, F. Digital Information Management in the Built Environment: Data-Driven Approaches for Building Process Optimization. In Technological Imagination in the Green and Digital Transition. CONF.ITECH 2022; Arbizzani, E., Cangelli, E., Clemente, C., Cumo, F., Giofrè, F., Giovenale, A.M., Palme, M., Paris, S., Eds.; The Urban Book Series; Springer: Cham, Switzerland, 2023. [Google Scholar] [CrossRef]
  10. López Gómez, J.; Troncoso Pastoriza, F.; Fariña, E.A.; Oller, P.E.; Álvarez, E.G. Use of a numerical weather prediction model as a meteorological source for the estimation of heating demand in building thermal simulations. Sustain. Cities Soc. 2020, 62, 102403. [Google Scholar] [CrossRef]
  11. Wang, Z.; Xia, L.; Yuan, H.; Srinivasan, R.S.; Song, X. Principles, research status, and prospects of feature engineering for data-driven building energy prediction: A comprehensive review. J. Build. Eng. 2022, 58, 105028. [Google Scholar] [CrossRef]
  12. Li, Y.; Tong, Z.; Tong, S.; Westerdahl, D. A data-driven interval forecasting model for building energy prediction using attention-based LSTM and fuzzy information granulation. Sustain. Cities Soc. 2022, 76, 103481. [Google Scholar] [CrossRef]
  13. Mariano-Hernández, D.; Hernández-Callejo, L.; Solís, M.; Zorita-Lamadrid, A.; Duque-Perez, O.; Gonzalez-Morales, L.; Santos-García, F. A Data-Driven Forecasting Strategy to Predict Continuous Hourly Energy Demand in Smart Buildings. Appl. Sci. 2021, 11, 7886. [Google Scholar] [CrossRef]
  14. Fang, X.; Gong, G.; Li, G.; Chun, L.; Li, W.; Peng, P. A hybrid deep transfer learning strategy for short term cross-building energy prediction. Energy 2021, 215, 119208. [Google Scholar] [CrossRef]
  15. Priyadarshini, I.; Sahu, S.; Kumar, R.; Taniar, D. A machine-learning ensemble model for predicting energy consumption in smart homes. Internet Things 2022, 20, 100636. [Google Scholar] [CrossRef]
  16. Shao, M.; Wang, X.; Bu, Z.; Chen, X.; Wang, Y. Prediction of energy consumption in hotel buildings via support vector machines. Sustain. Cities Soc. 2020, 57, 102128. [Google Scholar] [CrossRef]
  17. Rahman, S.; Rabiul Alam, M.G.; Mahbubur Rahman, M. Deep Learning based Ensemble Method for Household Energy Demand Forecasting of Smart Home. In Proceedings of the International Conference on Computer and Information Technology (ICCIT), Dhaka, Bangladesh, 18–20 December 2019; pp. 1–6. [Google Scholar] [CrossRef]
  18. Zhang, F.; Deb, C.; Lee, S.E.; Yang, J.; Shah, K.W. Time series forecasting for building energy consumption using weighted Support Vector Regression with differential evolution optimization technique. Energy Build. 2016, 126, 94–103. [Google Scholar] [CrossRef]
  19. Culaba, A.B.; Del Rosario, A.J.; Ubando, A.T.; Chang, J.S. Machine learning-based energy consumption clustering and forecasting for mixed-use buildings. Int. J. Energy Res. 2020, 44, 9659–9673. [Google Scholar] [CrossRef]
  20. Olu-Ajayi, R.; Alaka, H.; Sulaimon, I.; Sunmola, F.; Ajayi, S. Building energy consumption prediction for residential buildings using deep learning and other machine learning techniques. J. Build. Eng. 2022, 45, 103406. [Google Scholar] [CrossRef]
  21. Elbeltagi, E.; Wefki, H. Predicting energy consumption for residential buildings using ANN through parametric modelling. Energy Rep. 2021, 7, 2534–2545. [Google Scholar] [CrossRef]
  22. Truong, L.H.M.; Chow, K.H.K.; Luevisadpaibul, R.; Thirunavukkarasu, G.S.; Seyedmahmoudian, M.; Horan, B.; Mekhilef, S.; Stojcevski, A. Accurate Prediction of Hourly Energy Consumption in a Residential Building Based on the Occupancy Rate Using Machine Learning Approaches. Appl. Sci. 2021, 11, 2229. [Google Scholar] [CrossRef]
  23. Pham, A.-D.; Ngo, N.-T.; Ha Truong, T.T.; Huynh, N.-T.; Truong, N.-S. Predicting energy consumption in multiple buildings using machine learning for improving energy efficiency and sustainability. J. Clean. Prod. 2020, 260, 121082. [Google Scholar] [CrossRef]
  24. Hosseini, S.; Fard, R.H. Machine Learning Algorithms for Predicting Electricity Consumption of Buildings. Wirel. Pers. Commun. 2021, 121, 3329–3341. [Google Scholar] [CrossRef]
  25. Dinmohammadi, F.; Han, Y.; Shafiee, M. Predicting Energy Consumption in Residential Buildings Using Advanced Machine Learning Algorithms. Energies 2023, 16, 3748. [Google Scholar] [CrossRef]
  26. Li, H.; Wang, S.; Chen, Q.; Gong, M.; Chen, L. IPSMT: Multi-objective optimization of multipath transmission strategy based on improved immune particle swarm algorithm in wireless sensor networks. Appl. Soft Comput. 2022, 121, 108705. [Google Scholar] [CrossRef]
  27. Sun, J.; Che, Y.; Yang, T.; Zhang, J.; Cai, Y. Location and Capacity Determination Method of Electric Vehicle Charging Station Based on Simulated Annealing Immune Particle Swarm Optimization. Energy Eng. 2023, 120, 367–384. [Google Scholar] [CrossRef]
  28. Zhao, D.; Feng, S.; Cao, Y.; Yu, F.; Guan, Q.; Li, J.; Zhang, G.; Xu, T. Study on the classification method of rice leaf blast levels based on fusion features and adaptive-weight immune particle swarm optimization extreme learning machine algorithm. Front. Plant Sci. 2022, 13, 879668. [Google Scholar] [CrossRef] [PubMed]
  29. Tayab, U.B.; Yang, F.; Metwally, A.S.M.; Lu, J. Solar photovoltaic power forecasting for microgrid energy management system using an ensemble forecasting strategy. Energy Sources Part A Recovery Util. Environ. Eff. 2022, 44, 10045–10070. [Google Scholar] [CrossRef]
  30. Zhang, L.; Li, K.; Du, D.; Guo, Y.; Fei, M.; Yang, Z. A sparse learning machine for real-time SOC estimation of li-ion batteries. IEEE Access 2020, 8, 156165–156176. [Google Scholar] [CrossRef]
  31. Xu, G.; Gao, G.; Hu, M. Detecting spammer on micro-blogs base on fuzzy multi-class SVM. In Proceedings of the 2018 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Zhengzhou, China, 18–20 October 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 24–247. [Google Scholar] [CrossRef]
  32. Chen, B.; Liu, Q.; Chen, H.; Wang, L.; Deng, T.; Zhang, L.; Wu, X. Multiobjective optimization of building energy consumption based on BIM-DB and LSSVM-NSGA-II. J. Clean. Prod. 2021, 294, 126153. [Google Scholar] [CrossRef]
  33. Yu, M.; Liu, B.; Tang, E. Remote Sensing Image Classification Based on Improved PSO Support Vector Machine. Spacecr. Recovery Remote Sens. 2018, 39, 133–140. [Google Scholar] [CrossRef]
  34. Zeng, N.; Hong, Z.; Liu, W. A Switching Delayed PSO Optimized Extreme Learning Machine for Shortern Load Forecasting. Neurocomputing 2017, 240, 175–182. [Google Scholar] [CrossRef]
  35. Li, D.; Li, L.; Zhang, R. Water Quality pH Value Determination for Visible-Near Infrared Spectroscopy Based on SPA and PSO-LSSVM. Laser Optoelectron. Prog. 2023, 60, 390–395. [Google Scholar] [CrossRef]
  36. Xie, L.R.; Wang, B.; Bao, H.Y.; Liang, W.; Maimaitireyimu, A. Super-Short-Term Wind Power Forecasting Based On EEMD-WOA-LSSVM. Acta Energiae Solaris Sin. 2021, 43, 94. [Google Scholar] [CrossRef]
Figure 1. Simplified principle of particle swarm optimization.
Figure 2. LSSVM model structure.
Figure 3. IPSO-LSSVM model architecture.
Figure 4. Sample office building energy consumption data.
Figure 5. Sample of science and education building energy consumption data.
Figure 6. Comparison of energy consumption prediction results of six models in office building: (a) RF; (b) SVR; (c) BP; (d) LSSVM; (e) PSO-LSSVM; (f) IPSO-LSSVM.
Figure 7. Histogram of evaluation indicators of the prediction model.
Figure 8. Comparison of prediction results of six different models in science and education building: (a) RF; (b) SVR; (c) BP; (d) LSSVM; (e) PSO-LSSVM; (f) IPSO-LSSVM.
Figure 9. Histogram of model prediction evaluation indicators.
Table 1. Example of office building data.
| Date | Energy Consumption (kgce) | Maximum Temperature (°C) | Minimum Temperature (°C) | Weather | Humidity (hPa) | Wind Speed (m/s) | Date Type |
|---|---|---|---|---|---|---|---|
| 27 February 2020 | 4894.01 | 9.4 | 6.2 | Rain | 82 | 3 | Monday |
| 14 August 2020 | 6990.29 | 33.4 | 24.7 | Sunny | 74 | 2 | Tuesday |
| 15 April 2021 | 5405.31 | 28.9 | 15.6 | cloudy | 75 | 1 | Thursday |
| 22 January 2022 | 5109.71 | 18.5 | 1.3 | Snow | 39 | 4 | Friday |
| 12 October 2022 | 6047.40 | 32 | 21.5 | Cloudy | 71 | 3 | Saturday |
Table 2. Example of educational building data.
| Date | Energy Consumption (kgce) | Maximum Temperature (°C) | Minimum Temperature (°C) | Weather | Wind Speed (m/s) | Date Type |
|---|---|---|---|---|---|---|
| 29 May 2018 | 744.30 | 18 | 11 | Rain | 3 | Workdays |
| 16 December 2018 | 6725.04 | −11 | −20 | Sunny | 3 | Workdays |
| 22 March 2019 | 3866.29 | 2 | 8 | Snow | 2 | Workdays |
| 8 June 2019 | 783.76 | 23 | 15 | Cloudy | 3 | Holidays |
| 23 November 2019 | 3733.51 | 3 | −5 | Cloudy | 3 | Holidays |
Table 3. Encoding of feature values.
| Form | Original Value | Coded Value |
|---|---|---|
| Office Building Date Type | Monday | 0000001 |
| | Tuesday | 0000010 |
| | Wednesday | 0000100 |
| | Thursday | 0001000 |
| | Friday | 0010000 |
| | Saturday | 0100000 |
| | Sunday | 1000000 |
| Education Building Date Type | Workdays | 01 |
| | Holidays | 10 |
| Weather | Rain | 00001 |
| | Sunny | 00010 |
| | Snow | 00100 |
| | Cloudy | 01000 |
| | Cloudy | 10000 |
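The date type codes in Table 3 are one-hot bit strings in which the first category occupies the rightmost bit. A minimal sketch of that encoding is given below. The function name `one_hot` and the category lists are hypothetical helpers, and the bit ordering is inferred from the table rather than from any code published with the article.

```python
from typing import List

# Category order chosen so that index 0 maps to the rightmost bit, as in Table 3.
WEEKDAYS: List[str] = [
    "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday",
]
DAY_TYPES: List[str] = ["Workdays", "Holidays"]


def one_hot(value: str, categories: List[str]) -> str:
    """Encode `value` as a bit string with exactly one '1'."""
    idx = categories.index(value)  # raises ValueError for an unknown category
    bits = ["0"] * len(categories)
    bits[len(categories) - 1 - idx] = "1"  # count positions from the right
    return "".join(bits)


print(one_hot("Thursday", WEEKDAYS))
print(one_hot("Holidays", DAY_TYPES))
```

One-hot codes avoid implying an order or distance between categories, which an integer label such as Monday = 1, Tuesday = 2 would introduce.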
Table 4. Comparison of error metrics for office building energy prediction models.
| Models | MAE | MSE | R² |
|---|---|---|---|
| RF | 0.115 | 0.020 | 0.556 |
| SVR | 0.115 | 0.017 | 0.616 |
| BP | 0.110 | 0.017 | 0.621 |
| LSSVM | 0.069 | 0.009 | 0.779 |
| PSO-LSSVM | 0.069 | 0.005 | 0.907 |
| IPSO-LSSVM | 0.055 | 0.003 | 0.940 |
Table 5. Comparison of error metrics for education building energy prediction models.
| Models | MAE | MSE | R² |
|---|---|---|---|
| RF | 0.108 | 0.021 | 0.562 |
| SVR | 0.120 | 0.020 | 0.587 |
| BP | 0.100 | 0.017 | 0.661 |
| LSSVM | 0.092 | 0.015 | 0.716 |
| PSO-LSSVM | 0.059 | 0.006 | 0.883 |
| IPSO-LSSVM | 0.055 | 0.005 | 0.912 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Zhang, S.; Chang, Y.; Li, H.; You, G. Research on Building Energy Consumption Prediction Based on Improved PSO Fusion LSSVM Model. Energies 2024, 17, 4329. https://doi.org/10.3390/en17174329
