Short-Term Occupancy Forecasting for a Smart Home Using Optimized Weight Updates Based on GA and PSO Algorithms for an LSTM Network

In this work, we provide a smart home occupancy prediction technique based on environmental variables such as CO2, noise, and relative temperature via our machine learning method and forecasting strategy. The proposed algorithms enhance the energy management system through the optimal use of the electric heating system. The Long Short-Term Memory (LSTM) neural network is a deep learning architecture for time series prediction that has shown promising results in recent years. To improve the performance of the LSTM algorithm, particularly for autocorrelation prediction, we focus on optimizing weight updates using the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). The performance of the proposed methods is evaluated using real, available datasets. Test results reveal that the GA-based and PSO-based models forecast the parameters with higher prediction fidelity than the plain LSTM network. All experimental predictions reached correlation coefficients between 99.16% and 99.97%, which supports the efficiency of the proposed approaches.


Introduction
One of the most efficient ways to save energy is to reduce a building's heating and cooling load, which is mostly caused by heat transfer through its envelope. Smart buildings are required to provide permanent, healthy, and comfortable indoor environments, independent of exterior weather conditions [1,2]. Indeed, the major part of the energy in such buildings is used by Heating, Ventilation, and Air Conditioning (HVAC) systems, which have a significant influence on both home comfort and the environment. Therefore, the management of these systems in residential structures should be tackled in order to increase energy efficiency through improved energy planning [3]. One of the most essential features of smart buildings is their ability to self-control the systems used to maintain the comfort of the indoor atmosphere while also minimizing energy use. Because HVAC systems are the primary source of energy consumption in buildings, intelligent HVAC system control is a current trend in research studies, and it requires occupancy information to be included in the control process [4]. Moreover, the rise of smart buildings, as well as the pressing need to reduce energy use, has rekindled interest in building energy demand prediction. Intelligent controls are a solution for optimizing power consumption in buildings without reducing interior comfort [5]. For example, in [6], a Model Predictive Control (MPC) scheme is developed to obtain a hybrid HVAC control that saves energy while maintaining thermal comfort. Building energy consumption prediction strives to achieve various goals, such as evaluating the impact of energy-saving interventions and estimating energy demands based on regular requirements. It can anticipate the fluctuations in power consumption caused by certain events at specific times that may modify the systems' customary energy usage [7]. Furthermore, detailed and extensive studies have concluded that occupant behavior is one of the most significant elements affecting energy use in residential structures. Occupant behavior includes activities such as turning lights on and off, switching heating and cooling systems on and off, and regulating the temperature.
Previous research has shown that various occupant demands and behaviors necessitate specific technological solutions, which may induce or change behavior patterns, and that occupant behavior affects the flexibility and deployment of technologies. However, the lack of comprehensive knowledge of occupant behaviors in residential buildings leads to misunderstanding and inaccurate decisions in both technical design and policy making [8]. The context of our research is energy efficiency. In recent years, energy efficiency has been achieved by improving the thermal performance of the building envelope's insulation layer. Current research strategies aim to continuously adjust the comfort conditions to the living situation, as well as to ensure greater energy supervision and management within smart buildings. To achieve this, it is important to automatically characterize the activities of the building's residents. The significant challenge in today's technical design for smart buildings is understanding customer behaviors [9]. Our occupancy prediction approach is intended to support energy savings in a smart building environment. Ambient intelligence is an important prerequisite for improving human quality of life.
The rest of this work is structured as follows: Section 2 explains the technique employed in this project. Firstly, it offers the overall framework of the LSTM forecasting model. Next, it presents, step by step, the implementation procedure of the suggested technique; it includes descriptions of database processing, the parameters, and the assessment indicators. Section 3 features experimental details, as well as an analysis of the results. Finally, Section 4 provides some conclusions and future works.

Related Works
Building energy consumption is influenced by thermal insulation, heating, ventilation, air conditioning, lighting, and occupants' behaviors [10]. Characterising human activity has become an increasingly prominent application of machine learning in many disciplines. Indeed, for the past two decades, researchers from several application fields have investigated activity recognition by developing a variety of methodologies and techniques for each of these key tasks. The prediction of human behaviour represents a key challenge, and many approaches have already been proposed in the industrial, medical, home care, and energy efficiency domains, among others [11]. For example, in [12], an end-to-end technique for forecasting multi-zone interior temperatures using an LSTM-based sequence-to-sequence model was introduced. The goal of this prediction is to improve the building's energy efficiency while maintaining occupant comfort. The authors in [13] proposed simple XGBoost machine learning methods to predict the interior room temperature, relative humidity, and CO2 concentration in a commercial structure. The proposed technique presents a practical option because it does not require a large data set for training. Additionally, these models eliminate the need for multiple sensors, which create sophisticated and expensive networks. In [14], a short-term load consumption forecasting approach for non-residential buildings, using artificial occupancy attributes and based on Support Vector Machines (SVM), was developed. However, the determination of human behaviour in this work is imprecise. The authors in [15] present a load forecasting model for office buildings based on artificial intelligence and regression analysis to effectively extract the cooling and heating load characteristics. However, the model assumes that the building's internal disturbances are steady. In [16], an optimal deep learning LSTM model for forecasting electricity consumption utilizing feature selection and a Genetic Algorithm (GA) is implemented. The goal of this technique is to determine the optimal time delays and number of layers for the LSTM architecture, as well as to minimize overfitting, resulting in more accurate and consistent forecasting.

Furthermore, machine learning approaches based on Artificial Neural Networks (ANNs) have recently been widely used to forecast the thermal behavior of modern buildings for modeling HVAC systems. As an example, in [17], four comparative models have been developed and refined to forecast the inside temperature of a public building. These proposed techniques can be adapted to various scenarios. However, the adoption of an online technique such as OMLP (Online MultiLayer Perceptron) may be affected by outliers. The authors in [18] also propose a non-linear autoregressive neural network methodology for forecasting interior air temperature in the short and medium terms. Realistic artificial temperature data are used to train the proposed model. The goal of this strategy is to make up for the lack of real-world data collected by sensors in energy experiments. Thus, an improved technique integrating real-time information and addressing possible noise or missing data is necessary to prove the reliability of the proposed strategy in real scenarios.

Differently from previous research solutions, which typically rely on a basic LSTM model, we designed an optimised architecture exploiting GA and PSO algorithms to update the weights and select the values that give the best prediction precision and reduce model overfitting. These two methods (PSO and GA) were chosen due to their good reputation in the literature, and they add a stochastic component to the neural network that resulted in better performance. We compared our results with the LSTM method, which is widely regarded as one of the strongest neural approaches for time series forecasting, as reported in previous LSTM-based works. As an example, Ref. [19] introduces comprehensive comparative studies that include several deep learning methods used in forecasting extra-short-term Plug-in Electric Vehicle (PEV) charging loads, such as ANN, RNN, LSTM, gated recurrent units (GRU), and bi-directional long short-term memory (Bi-LSTM). Among these approaches, the LSTM model outperforms the others and gives satisfactory results.

Data Description
One year of data was collected from a smart home between 1 January 2018 and 31 December 2018 with a resolution of 10 min. Each room of the house was equipped with several sensors, covering the set points of the room temperature, CO2 concentration, pressure, noise, lighting, and occupancy. The values of these factors varied depending on the room; for example, the concentration of CO2 in the living room differed from that in the office or the kitchen. Moreover, the CO2 variable does not have a direct relationship with the interior temperature. However, because CO2 is a strong predictor of room occupancy, it may have an impact on the indoor temperature during the cold season. The variations in the CO2, the noise, and the temperature are given in Figures 1-3, respectively.

Data Pre-Processing
The prediction of building energy use based on an occupant behavior assessment is a multivariate time series problem in which sensors create data that may contain uncertainty, redundancy, missing values, non-uniform time intervals, noise, and so on. Traditional machine learning techniques struggle to reliably anticipate power usage due to unpredictable trend components and seasonal patterns. The collection of suitable data contributes to efficiently addressing prediction challenges. As a result, several considerations should be made [20]. Numerous techniques have been proposed to obtain meaningful inferences and insights; nevertheless, these solutions are still in the early phases of development. Therefore, current research is focusing on improving the procedures for processing and cleaning the collected data in order to produce accurate predictions [21].

Missing Values
Many real-world datasets include missing values for various reasons. Training a model using a dataset that has a large number of missing values can have a considerable influence on the machine learning model's quality. To prevent information leakage, missing data were interpolated using the Exponential Moving Average (EMA). This method is described in [22].
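As an illustration of this kind of gap filling, the following minimal Python sketch replaces each missing sample with the EMA of the samples observed before it, so that no future information is used. It is not the exact procedure of [22]: the function name `ema_fill`, the smoothing factor `alpha`, and the sample readings are all assumptions made for the example.

```python
import math


def ema_fill(series, alpha=0.5):
    """Fill gaps (None or NaN) with the EMA of the samples observed so far.

    alpha is a hypothetical smoothing factor; the source does not state one.
    """
    filled = []
    ema = None  # running EMA over observed (non-missing) samples only
    for value in series:
        is_missing = value is None or (
            isinstance(value, float) and math.isnan(value)
        )
        if is_missing:
            # Causal fill: only past observations are used.
            # A gap at the very start of the series stays None.
            filled.append(ema)
            continue
        ema = value if ema is None else alpha * value + (1.0 - alpha) * ema
        filled.append(value)
    return filled


# Made-up CO2 readings (ppm) at a 10 min resolution with one gap.
co2_ppm = [400.0, 410.0, None, 430.0]
print(ema_fill(co2_ppm))
```

Because the fill value depends only on earlier samples, the same function can be applied to the training and test periods without leaking later measurements into earlier ones.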

Normalisation
The data for a sequence prediction problem generally need to be rescaled, for example to the range [−1, +1], when training a neural network such as a long short-term memory recurrent neural network. When a network is fit on unscaled data, large inputs can slow down the learning and convergence of that network and, in some cases, prevent the network from effectively learning the problem. The Z-score is used for the normalisation, and the formula is given as [23]:

$$z_i = \frac{x_i - \mu}{\sigma}$$

where:

$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mu\right)^2}$$

and n is the number of time periods.
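The Z-score transformation can be sketched in a few lines of Python. This is a minimal illustration that assumes the population standard deviation (division by n); the function name `zscore` and the sample values are hypothetical. Note that Z-scores are centred on zero with unit variance but are not strictly bounded, so individual values may fall outside [−1, +1].

```python
import math


def zscore(values):
    """Return (normalised values, mean, standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation (divides by n); this choice is an assumption.
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        raise ValueError("constant series cannot be Z-score normalised")
    return [(v - mean) / std for v in values], mean, std


# Made-up temperature readings in degrees Celsius.
normalised, mu, sigma = zscore([19.0, 20.0, 21.0, 22.0, 23.0])
print(mu, sigma)
print(normalised)
```

In practice, the mean and standard deviation would usually be computed on the training portion only and then reused for the validation and test portions, and the returned `mean` and `std` allow forecasts to be mapped back to physical units.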

Modeling Approaches
The main aim of this research is to investigate the performance of various occupancy forecasting strategies to identify the most accurate ones. We choose three distinct deep learning methods: GA-LSTM and PSO-LSTM as optimiser-based models, and LSTM as a simple deep learning technique.

LSTM Architecture
Recurrent Neural Networks (RNNs) struggle with learning long-term dependencies. LSTM-based models are an extension of RNNs that address the vanishing and exploding gradient problems of RNNs and perform better than RNNs on longer sequences. LSTM models basically expand the memory of RNNs so that they can learn and maintain long-term input dependencies. This expanded memory can retain data for a longer amount of time, and the network can read, write, and delete information from it. The LSTM memory is referred to as a "gate" structure because it has the power to decide whether to keep or discard memory information [24,25]. A gate is a way of transferring information selectively; it consists of a sigmoid neural network layer and a pointwise multiplication operation. The LSTM process and its mathematical representation consist mostly of the four phases listed below [26]:

1. Deciding to remove useless information:

$$f_t = \sigma\left(w_f \cdot [h_{t-1}, x_t] + b_f\right)$$

where $f_t$ represents the forget gate and $\sigma$ is the sigmoid activation function, defined as:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

The forget gate uses this function to decide what information should be removed from the LSTM's memory. This decision mainly depends on the previous hidden layer output $h_{t-1}$ and the input $x_t$. The output $f_t$ takes a value between 0 and 1, where 0 means fully discarding the learned value and 1 means preserving the entire value. $w_f$ is the weight matrix of the forget gate, while $b_f$ is its bias term.

2. Updating information:

$$i_t = \sigma\left(w_i \cdot [h_{t-1}, x_t] + b_i\right)$$

$$\tilde{c}_t = \tanh\left(w_c \cdot [h_{t-1}, x_t] + b_c\right)$$

in which $i_t$ is the input gate, which denotes whether the value needs to be updated or not, and $\tilde{c}_t$ designates a vector of new candidate values that will be added into the LSTM memory. The sigmoid layer determines which values require updating, and the tanh layer generates the vector of new candidate values.

3. Updating the cell status:

$$c_t = f_t * c_{t-1} + i_t * \tilde{c}_t$$

where $c_t$ and $c_{t-1}$ represent the current and previous memory states, respectively. This phase updates the previous cell state: the old value is multiplied by $f_t$, which deletes the information to be forgotten, and $i_t * \tilde{c}_t$ is added to obtain the new cell state.

4. Outputting information:

$$o_t = \sigma\left(w_o \cdot [h_{t-1}, x_t] + b_o\right)$$

$$h_t = o_t * \tanh(c_t)$$

where $o_t$ is the output gate and $h_t$ is the current hidden layer output, whose components take values between −1 and 1. This step defines the final result. First, a sigmoid layer, represented by $o_t$, selects which part of the cell state will be output. The cell state is then processed by the tanh activation function and multiplied by the sigmoid layer output to create the output.
A typical LSTM network is shown in Figure 4. LSTM layers are composed of memory blocks rather than neurons. These memory blocks are interconnected across the layers, and each block may contain one or more recurrently connected memory elements or cells. As indicated in this figure (yellow shaded area), the flow of information is managed by three types of gates: the forget gate ($f_t$), the input gate ($i_t$), and the output gate ($o_t$).
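The four phases described above (forgetting, updating information, updating the cell state, and outputting) can be traced in a single forward step of one LSTM layer. The following is a minimal pure-Python sketch, not the implementation used in this work; the parameter dictionary layout (`w_f`, `b_f`, and so on), the helper names, and the toy sizes are assumptions chosen for readability.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(weights, bias, vector):
    """Return weights @ vector + bias using plain Python lists."""
    return [
        sum(w * v for w, v in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of an LSTM layer; returns (h_t, c_t)."""
    concat = h_prev + x_t  # the concatenation [h_{t-1}, x_t]
    # Phase 1: forget gate decides what to drop from the memory.
    f_t = [sigmoid(a) for a in affine(params["w_f"], params["b_f"], concat)]
    # Phase 2: input gate and candidate values.
    i_t = [sigmoid(a) for a in affine(params["w_i"], params["b_i"], concat)]
    c_hat = [math.tanh(a) for a in affine(params["w_c"], params["b_c"], concat)]
    # Phase 3: new cell state, c_t = f_t * c_{t-1} + i_t * c_hat.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    # Phase 4: output gate and hidden output, h_t = o_t * tanh(c_t).
    o_t = [sigmoid(a) for a in affine(params["w_o"], params["b_o"], concat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy layer with z = 2 cells and x = 1 input; all parameters set to zero,
# so every gate equals 0.5 and the candidate vector is zero.
z, x = 2, 1
zero_params = {}
for gate in ("f", "i", "c", "o"):
    zero_params["w_" + gate] = [[0.0] * (z + x) for _ in range(z)]
    zero_params["b_" + gate] = [0.0] * z

h_t, c_t = lstm_step([1.0], [0.0, 0.0], [1.0, -2.0], zero_params)
print(c_t)  # half of the previous cell state is kept
print(h_t)
```

With all parameters at zero, the forget gate keeps exactly half of the previous cell state, which makes the role of $f_t$ easy to check by hand.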

LSTM Model Settings and Optimisation
Optimizing an LSTM model entails establishing a set of model parameters that yields the best model performance. The number of units and hidden layers, the optimiser, the activation function, the batch size, and the learning rate are typical examples of such elements. The choice of a suitable algorithm is therefore critical to success in addressing any type of optimisation issue. Wolpert and Macready demonstrated this in their "no free lunch" theorem, which states that no single method is best for every type of optimisation issue. As a result, the basic idea is to select an effective optimisation approach to solve the optimisation problem at hand with less computational effort and a greater rate of convergence [27].

Genetic Algorithm (GA)
Genetic algorithms (GAs) have been around for over four decades. GAs are heuristic search algorithms that provide answers to optimisation and search problems. The name "GA" is derived from the biological terminology of natural selection, crossover, and mutation. In essence, GAs simulate natural evolutionary processes [28]. The literature provides many instances of using GAs in the analysis and optimisation of various elements from many sectors, such as energy systems. Moreover, a GA can be used for the optimisation of ANN predictions or for the optimisation of the ANN architecture [29]. GAs provide a general and global optimisation process. Since the GA is a global search technique, it is less vulnerable to the flaws of local search methods such as back-propagation. The GA may be used to design the network's architecture as well as its weights. There have been various attempts to use GAs to determine the link weights of a fixed-architecture network, as well as to determine the architecture and the link weights together.
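To make the selection, crossover, and mutation steps concrete, the sketch below evolves a flat weight vector with a simple GA. It is an illustrative outline rather than the configuration used in this work: the tournament selection, one-point crossover, Gaussian mutation, elitism, and all hyperparameter values are assumptions, and the toy fitness function stands in for the validation error of an LSTM evaluated with the candidate weights.

```python
import random


def genetic_search(fitness, n_weights, pop_size=30, generations=50,
                   crossover_rate=0.8, mutation_rate=0.1,
                   mutation_scale=0.1, seed=0):
    """Minimise fitness over weight vectors with a simple genetic algorithm."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
        for _ in range(pop_size)
    ]
    best = min(population, key=fitness)

    def tournament(pop):
        # Selection: pick two individuals at random and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) < fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            if n_weights > 1 and rng.random() < crossover_rate:
                # One-point crossover between the two parents.
                cut = rng.randrange(1, n_weights)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: perturb some genes with Gaussian noise.
            child = [
                w + rng.gauss(0.0, mutation_scale)
                if rng.random() < mutation_rate else w
                for w in child
            ]
            children.append(child)
        population = children
        best = min(population, key=fitness)
    return best


def toy_fitness(weights):
    """Stand-in for a validation loss; its minimum is at all weights = 0.5."""
    return sum((w - 0.5) ** 2 for w in weights)


best_weights = genetic_search(toy_fitness, n_weights=4)
print(best_weights, toy_fitness(best_weights))
```

Because of the elitism step, the best fitness found never gets worse from one generation to the next, which is a convenient property when each fitness evaluation is as expensive as scoring a neural network.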

Particle Swarm Optimization (PSO)
The particle swarm optimisation (PSO) method is a swarm-based stochastic optimisation approach introduced by Eberhart and Kennedy (1995). This technique replicates the social behavior of birds inside a flock trying to reach a food source. A swarm of birds approaches its food goal using a combination of personal and communal experience. The birds constantly update their positions based on their own best position as well as the best position of the entire swarm, and regroup to form an ideal configuration [30]. This nature-inspired method is becoming increasingly popular due to its reliability and easy implementation. In addition, classical neural networks do not operate well when forecasting parameters within short intervals. Moreover, because of their dependability, hybrid ANNs based on particle swarm optimisation have been frequently advocated in the literature. The PSO method, like the GA, is used as an optimisation technique within neural networks to optimise ANN forecasts or the ANN architecture (the number of layers, neurons, etc.) [31]. Thus, we use this algorithm to optimise the weights.
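The position update driven by each particle's personal best and the swarm's global best can be sketched as follows. This is a generic textbook form of PSO given for illustration only: the inertia weight, the acceleration coefficients `c1` and `c2`, the swarm size, and the toy fitness function (a stand-in for the LSTM validation error) are assumptions, not values reported in this work.

```python
import random


def pso_search(fitness, n_weights, n_particles=20, iterations=100,
               inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise fitness with a basic particle swarm; returns (best, value)."""
    rng = random.Random(seed)
    pos = [
        [rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
        for _ in range(n_particles)
    ]
    vel = [[0.0] * n_weights for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # personal best positions
    pbest_val = [fitness(p) for p in pos]    # personal best scores
    g = min(range(n_particles), key=lambda k: pbest_val[k])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best of the swarm

    for _ in range(iterations):
        for k in range(n_particles):
            for d in range(n_weights):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends inertia, personal and communal experience.
                vel[k][d] = (inertia * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                pos[k][d] += vel[k][d]
            value = fitness(pos[k])
            if value < pbest_val[k]:
                pbest[k], pbest_val[k] = pos[k][:], value
                if value < gbest_val:
                    gbest, gbest_val = pos[k][:], value
    return gbest, gbest_val


def toy_fitness(weights):
    """Stand-in for a validation loss; its minimum is at all weights = 0.5."""
    return sum((w - 0.5) ** 2 for w in weights)


best_weights, best_value = pso_search(toy_fitness, n_weights=4)
print(best_weights, best_value)
```

To apply the same loop to a network, the weight vector would be reshaped into the layer's weight matrices and the fitness function would return the forecasting error obtained with those weights.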

LSTM Network Parameters
The network's trainable parameters, known as the trainable weights, influence the network's complexity. They are represented in LSTMs via connections between the input, hidden, and output layers, as well as internal connections. The following formula is used to calculate the Number of Trainable Weights (NTW) of a neural network with x inputs, y outputs, and z LSTM cells in the hidden layer:

$$NTW = 4xz + 4z^2 + 4z + yz + y$$

where:
- $4xz$: the connection weights between the input layer and the hidden layer;
- $4z^2$: the hidden layer's recursive weights;
- $4z$: the hidden layer's bias;
- $yz$: the connection weights between the hidden layer and the output layer;
- $y$: the output layer's bias.
Choosing ideal neural network settings can frequently make the difference between mediocre and peak performance. However, there is limited information in the literature on the selection of the neural network parameters x, y, and z; it requires the expertise of professionals.
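As a quick worked example of the NTW count, the short Python function below adds up the five terms listed above. The hidden sizes of 60 and 100 cells correspond to the settings used later for training, while the single input and single output (x = 1, y = 1) are assumptions made only for the illustration.

```python
def trainable_weights(x, y, z):
    """Number of trainable weights for x inputs, y outputs, z LSTM cells."""
    input_to_hidden = 4 * x * z    # four gates, each fed by the inputs
    recurrent = 4 * z * z          # four gates, each fed by the hidden state
    hidden_bias = 4 * z            # one bias per gate and cell
    hidden_to_output = y * z       # dense output layer weights
    output_bias = y                # dense output layer bias
    return (input_to_hidden + recurrent + hidden_bias
            + hidden_to_output + output_bias)


# Assumed univariate setting: one input and one output per time step.
print(trainable_weights(x=1, y=1, z=60))   # 14941
print(trainable_weights(x=1, y=1, z=100))  # 40901
```

The recurrent term grows quadratically with z, so it dominates the count and therefore the size of the search space that the GA or PSO has to explore.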

Train-Validation-Test dataset
The one-year target variables were divided into three datasets: the first served as the training set, the second served as the test set, and, depending on the length of the output sequence, random samples drawn from the last part served as the validation set. For the validation, we use cross-validation, which is a popular data resampling approach for estimating the true prediction error of models and tuning model parameters. This technique evaluates the generalization capabilities of prediction models and helps prevent over-fitting. It is the process of generating numerous train-test splits from the training data, which are then used to adjust the model [32]. k-fold cross-validation is similar to repeated random sub-sampling, but the sampling is performed in such a manner that no two test sets overlap. The available learning set is divided into k disjoint subsets of approximately equal size. Each time, one of the k subsets is used as the validation/test batch, while the remaining (k−1) subsets are combined to form the training set. The overall performance of the model is calculated by averaging the error estimates over all k trials. Each sample is placed in a validation/test set exactly once and in the training set (k−1) times [33]. Figure 5 illustrates this process, which is a popular evaluation mechanism in machine learning.

We train the LSTM with various architectures for the 12-h forecasting of environmental parameters such as CO2, noise, and temperature. The window size of the input and output sequences is determined by the time scale of the chosen parameter prediction. We apply the ADAM optimiser, which is one of the optimisation methods employed in deep learning. The learning rate is initially set to 0.01 and is reduced after every 50 epochs. We train the LSTM with 60, 60, and 100 hidden units for the forecasting of the CO2, the noise, and the temperature, respectively. The validation and training results of each parameter are illustrated in Figures 6-8.
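The k-fold procedure described above can be sketched with index lists only. The function below is a minimal illustration and not the exact splitting code of this work; it uses contiguous folds without shuffling, which is an assumption that keeps neighbouring time steps together, and the name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k disjoint validation folds.

    Returns a list of (train_indices, validation_indices) pairs.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # Fold sizes differ by at most one sample.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits = []
    start = 0
    for size in sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        splits.append((train, validation))
        start += size
    return splits


# Ten samples and three folds give validation folds of sizes 4, 3 and 3.
for train_idx, val_idx in k_fold_indices(10, 3):
    print("train:", train_idx, "validation:", val_idx)
```

The model error would then be computed on each validation fold after training on the matching training indices, and the k error values averaged to obtain the cross-validation estimate.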

Evaluation Metrics
This study uses the Root Mean Square Error (RMSE) as the loss function, and the RMSE, the Mean Absolute Error (MAE), and the Correlation Coefficient (CC) as performance measures. The RMSE and MAE measure the departure of the forecasted values from the actual data and indicate the prediction's overall inaccuracy, while the CC measures how closely the forecasts follow the real values. The corresponding definition of each indicator is given as follows [34]:

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \tilde{y}_i\right)^2}$$

$$MAE = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \tilde{y}_i\right|$$

$$CC = \frac{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)\left(\tilde{y}_i - \bar{p}\right)}{\sqrt{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2 \sum_{i=1}^{N}\left(\tilde{y}_i - \bar{p}\right)^2}}$$

where $y_i$ and $\tilde{y}_i$ represent the real value and the forecasted value at time step i, N denotes the total number of time steps, and $\bar{y}$ and $\bar{p}$ are the averages of the real values and the forecasted values, respectively. The smaller the values of RMSE and MAE, the smaller the deviation of the forecasts from the actual values. A value of CC closer to 1 indicates a more accurate prediction.
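The three indicators can be computed directly from their definitions. The following short Python sketch uses only the standard library; the function names and the sample values are chosen for illustration, and the correlation is returned as a fraction, so it would be multiplied by 100 to obtain the percentages quoted in the results.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between real and forecasted values."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error between real and forecasted values."""
    n = len(y_true)
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n


def correlation(y_true, y_pred):
    """Pearson correlation coefficient between real and forecasted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    mean_p = sum(y_pred) / n
    num = sum((y - mean_y) * (p - mean_p) for y, p in zip(y_true, y_pred))
    den = math.sqrt(sum((y - mean_y) ** 2 for y in y_true)
                    * sum((p - mean_p) ** 2 for p in y_pred))
    if den == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return num / den


# Made-up real and forecasted values.
real = [1.0, 2.0, 3.0, 4.0]
forecast = [1.0, 2.0, 3.0, 5.0]
print(rmse(real, forecast))  # 0.5
print(mae(real, forecast))   # 0.25
print(correlation(real, forecast))
```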

Parameters Forecasting
In this research, we forecast the environmental characteristics of a smart house outfitted with various types of sensors. The fundamental architecture of the LSTM networks is fixed; each LSTM unit has a vector input of n values, including the current value of the specified parameter (CO2, noise, or temperature) at time t = 0 as well as its past values. We create three neural networks with different designs, each one adapted to the predicted parameter. These neural networks produce a forecast for the next 10-min step. We can cover the full period of the required horizon by repeating the process and selecting the appropriate parameters for these models.
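One way to read the repetition described above is as a recursive multi-step forecast, in which each 10-min prediction is fed back as an input for the next step. The sketch below illustrates that reading under stated assumptions: `one_step_model` is a hypothetical stand-in for a trained network, and the window length and the toy drift model are made up for the example.

```python
def recursive_forecast(one_step_model, history, window, horizon_steps):
    """Forecast horizon_steps values by feeding predictions back as inputs."""
    if len(history) < window:
        raise ValueError("history is shorter than the input window")
    buffer = list(history[-window:])  # the most recent `window` observations
    forecast = []
    for _ in range(horizon_steps):
        next_value = one_step_model(buffer)  # predict one 10 min step ahead
        forecast.append(next_value)
        buffer = buffer[1:] + [next_value]   # slide the window forward
    return forecast


# A 12 h horizon at a 10 min resolution corresponds to 72 steps.
STEPS_12H = 12 * 60 // 10


def toy_model(window_values):
    """Hypothetical stand-in for a trained network: last value plus a drift."""
    return window_values[-1] + 1.0


print(recursive_forecast(toy_model, [0.0, 1.0, 2.0], window=3, horizon_steps=3))
print(len(recursive_forecast(toy_model, [0.0, 1.0, 2.0], 3, STEPS_12H)))
```

A known limitation of this scheme is that errors can accumulate over the horizon, since later steps depend on earlier predictions rather than on measurements.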

CO2 Forecasting
In the first experiment, we give the CO2 prediction of a house for 12 h. Figures 9-11 show the predicted results obtained by the LSTM, the GA-LSTM, and the PSO-LSTM algorithms, respectively. As shown, the predicted results are close to the real data values and the RMSE of each technique is quite low, which supports the forecasting performance of the suggested strategies.

Noise Forecasting
The second experiment illustrates the noise prediction results for 12 h. Figures 12-14 show the findings, together with the error rate, for the LSTM, the GA-LSTM, and the PSO-LSTM models. Each model's predicted curve retains the shape of the real data curve.

Temperature Forecasting
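The third experiment shows the temperature forecasting results for 12 h. The corresponding figures depict the results with the RMSE value of the LSTM, the GA-LSTM, and the PSO-LSTM approaches. Likewise, each model's predicted curve appears to keep the form of the real data curve.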

Analysis of Results
This work assesses the performance of the suggested model from two angles: precision and running time. Tables 1-3 provide the various performance measures for testing predictions on the studied building.
We can see that the implemented approaches produce very good results, and the predicted findings are precise and dependable.
Tables 1-3 reveal that the two error metrics, RMSE and MAE, have small values. The predictions are fairly close to, and representative of, the real data. The correlation coefficient (CC) is also very close to 1, which confirms the high precision of the forecasting strategies. As indicated in the tables and figures of forecasting results, the simple LSTM model without optimisation gives the worst results compared with the GA-LSTM and the PSO-LSTM techniques. The experimental results of the CO2 prediction show that the GA-LSTM outperforms the PSO-LSTM and the LSTM models, with RMSEs of 0.0135, 0.0185, and 0.0281 and CCs of 99.80%, 99.62%, and 99.16% for GA-LSTM, PSO-LSTM, and LSTM, respectively. For noise and temperature prediction, the PSO-LSTM outperforms the GA-LSTM in terms of RMSE and CC. Overall, we have shown that the proposed optimisation techniques (GA-LSTM and PSO-LSTM networks) can successfully extract relevant information from noisy human behavior data.
The statistical analysis of the obtained results shows that the proposed model tuned by the two evolutionary metaheuristic search algorithms (GA and PSO) provides more precise results than the benchmark LSTM model, whose parameters were established through limited experience and a limited number of experiments.

Conclusions
In this work, we have proposed two optimised metaheuristic algorithms based on the LSTM architecture for dealing with occupancy forecasting in the context of smart buildings. The GA-LSTM and PSO-LSTM models give very satisfactory prediction results with a high level of precision and reliability compared with the LSTM forecasting results. The choice of these two methods (PSO and GA) is based on their reputation in the literature. A comparison shows that applying the two metaheuristic algorithms (GA and PSO) to the configuration of occupancy forecasting yields an optimised LSTM model that performs better than the benchmark basic LSTM model. The predicted values have been used to check the presence of residents and then control real electrical consumption. This was carried out to show that the optimised LSTM can decrease power consumption, improve security, and maintain comfort for the occupants. A potential field for future research would be to perform thermal parameter forecasting, using recurrent neural networks, for various building types such as hospitals, hotels, and public establishments. It would be worthwhile to investigate whether a recurrent neural network can maintain such a high accuracy when forecasting thermal features and room occupancy rates in a smart building. Future studies will also focus on the deployment and integration of various hybrid optimisation algorithms in recurrent neural networks such as the LSTM model in order to select the best architecture, weights, and learning rate, and thus achieve greater energy savings in the building energy management system. Our findings provide a foundation for future research aimed at a more accurate assessment of building occupancy. They also provide a basis for occupancy prediction, which might be used to enhance our context-driven approaches for managing active building systems such as the HVAC, lighting, and shading systems. Again, a forecasting model for thermal characteristics and room occupancy rates with a low estimation error would help energy producers in making operational, tactical, and strategic decisions. Finally, better building load forecasting allows the implementation of the real-time management of smart buildings.

Figure 6. Training and validation of the CO2 data.

Figure 8. Training and validation of the temperature data.

Table 1. Performance criteria of the CO2 prediction.

Table 2. Performance criteria of the noise prediction.

Table 3. Performance criteria of the temperature prediction.