Multi-State Load Demand Forecasting Using Hybridized Support Vector Regression Integrated with Optimal Design of Off-Grid Energy Systems—A Metaheuristic Approach

The prediction accuracy of support vector regression (SVR) is highly influenced by a kernel function. However, its performance suffers on large datasets, and this could be attributed to the computational limitations of kernel learning. To tackle this problem, this paper combines SVR with the emerging Harris hawks optimization (HHO) and particle swarm optimization (PSO) algorithms to form two hybrid SVR algorithms, SVR-HHO and SVR-PSO. Both the two proposed algorithms and traditional SVR were applied to load forecasting in four different states of Nigeria. The correlation coefficient (R), coefficient of determination (R²), mean square error (MSE), root mean square error (RMSE), and mean absolute percentage error (MAPE) were used as indicators to evaluate the prediction accuracy of the algorithms. The results reveal that there is an increase in performance for both SVR-HHO and SVR-PSO over traditional SVR. SVR-HHO has the highest R² values of 0.9951, 0.8963, 0.9951, and 0.9313, the lowest MSE values of 0.0002, 0.0070, 0.0002, and 0.0080, and the lowest MAPE values of 0.1311, 0.1452, 0.0599, and 0.1817, respectively, for Kano, Abuja, Niger, and Lagos State. The results of SVR-HHO also prove more advantageous over SVR-PSO in all the states concerning load forecasting skills. This paper also designed a hybrid renewable energy system (HRES) that consists of solar photovoltaic (PV) panels, wind turbines, and batteries. As inputs, the system used solar radiation, temperature, wind speed, and the predicted load demands by SVR-HHO in all the states. The system was optimized by using the PSO algorithm to obtain the optimal configuration of the HRES that will satisfy all constraints at the minimum cost.
Lagos, respectively, in the verification phase. The results lead to the conclusion that for both calibration and verification, SVR-HHO, SVR-PSO, and SVR-M3 are capable of capturing the complex non-linear patterns between the load demand variables.
the lowest MSE and MAPE values in all four states. The results show that both SVR-HHO and SVR-PSO are capable of predicting multi-state load demands. For optimal sizing, the PSO algorithm was used based on the predicted load demands by SVR-HHO in all four states to determine the best sizes and combinations of three generation systems (PV/wind/battery, PV/battery, and wind/battery systems), and the results were compared with the results found by GA. The simulation results reveal that, economically, the wind/battery system is the best system for meeting the load requirements of Kano. For Abuja and Niger, the best system for supplying the load demand is the PV/battery system, while the results of optimal sizing for Lagos show that the PV/wind/battery hybrid system is the best system for meeting the load demand requirements.


Introduction
Electricity is paramount for the socio-economic development of every nation [1]. It affects every part of human life such as education, entertainment, healthcare, and transport. However, this vital commodity remains a luxury for a significant portion of the world's population. A total of 1.1 billion people, representing 14% of the global population, had no access to electricity in 2016, according to the International Energy Agency [2]. The lack of electricity affects developing countries the most, especially those in sub-Saharan Africa (SSA) and Asia. Consequently, the lack of electricity has restricted the development of these countries. Most of the countries in these regions show disparities in electrification rates between urban and rural areas [3]. Such is the case in Nigeria, a sovereign country in West Africa.
With a population of over 190 million, Nigeria has a peak load demand of 17,700 MW [4]. In 2017, the country recorded an all-time peak electricity generation of 5074 MW generated
(iii) design a hybrid renewable energy system (HRES) that consists of photovoltaics, wind turbines, and batteries; (iv) apply PSO based on the SVR predictor model with the highest prediction accuracy, in order to determine the optimal sizes of three generation systems (PV/wind/battery, PV/battery, and wind/battery systems) in all the states. The same procedure is carried out by GA, and the results of both algorithms are compared.
It is worth noting that PSO was chosen because it overcomes some limitations of heuristic and evolutionary programming techniques. Challenges commonly encountered with other optimization techniques include algorithm complexity, numerical difficulties when the penalty is large, and the inability to identify a global optimum. The PSO technique, which combines principles of social psychology in socio-cognitive human agents with evolutionary computation, has been shown to be flexible, well balanced, and easy to apply, and to adapt well to both global and local exploration. Furthermore, PSO does not use the gradient of the objective function and operates directly on continuous real numbers, which makes it comparatively more efficient than other techniques [11].
The rest of the paper is structured as follows: The SVR, HHO, PSO, and hybrid SVR algorithms and the proposed methodologies are introduced in Section 2. The system configuration, operation strategy, and mathematical models for the hybrid energy system are presented in Section 3. Section 4 defines the objective function of the optimal sizing problem subject to certain operational constraints. Section 5 presents the predicted load demands by single SVR and hybrid SVR algorithms, as well as optimal sizing results for three generation systems. Finally, Section 6 concludes this paper.

Methods
This study proposes a hybrid support vector regression (SVR) with Harris hawks optimization (HHO) for short-term load forecasting in remote areas located in the states of Kano, Abuja, Niger, and Lagos, Nigeria. Multi-state load forecasting is paramount, but given the nature of the datasets obtained from these locations, creating a reliable predictive model is difficult. For model development, the dataset was randomly divided into two parts, the calibration set and the verification set. The calibration set was used for training and model construction, while the verification set was used to examine the accuracy of each candidate model. Different optimization algorithms were employed because it is difficult to know in advance whether a specific algorithm is superior to the others in practice. Additionally, selecting the most accurate algorithm is not straightforward, as there are numerous statistical metrics for evaluating models as well as three different approaches involving several input combinations. All input data were collected, pre-processed, and normalized based on Equation (1). Normalization was performed before model training, which is usually done to increase the accuracy and speed of the model and thereby enhance the effectiveness of the predictive models. Computational models such as regression models and AI-based models are usually evaluated using different statistical indicators, which provide more information about each adopted approach and give a clear and realistic impression of the performance of each predictive model separately. Furthermore, identifying the statistical measures that effectively assess the performance of a given model is also important for selecting the best predictive model.
where $y$ is the normalized data, $x$ is the measured data, and $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the measured data, respectively. For effective modeling and forecasting, external or internal validation techniques are paramount. In AI-based models, validation is generally conducted to fit the model to the given dataset according to the employed indicators, in order to achieve a reliable result for the prediction of the unknown dataset. Due to problems such as overfitting, satisfactory calibration performance does not always agree with the testing performance. Various validation methods can be used, including cross-validation (also known as k-fold cross-validation), holdout, and leave-one-out. The holdout technique is generally regarded as the simplest form of k-fold, whereby the data are randomly split into two sets, the training/calibration data and the verification/testing data. One of the main advantages of this method is that, in every single round, the training and validation sets are independent of one another, which gives a sound foundation for model optimization. In this research, the data were divided into two parts as stated above: 70% for the training phase and 30% for the testing phase. The model combination is presented in Equation (2). It is important to note that other methods could be used for validating and partitioning the data. The proposed overall model development flowchart is presented in Figure 1.
Processes 2021, 9, 1166
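The pre-processing described in this section can be sketched as follows. This is a minimal illustration, assuming plain min-max scaling to [0, 1] (the exact form of Equation (1), including any offset constants, is an assumption here) and a random 70/30 holdout split. The function names and the load values are hypothetical.

```python
import random

def min_max_normalize(values):
    """Scale values to [0, 1] with y = (x - x_min) / (x_max - x_min).

    Assumed form of the normalization; the paper's Equation (1) may differ.
    """
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("a constant series cannot be min-max normalized")
    return [(x - x_min) / (x_max - x_min) for x in values]

def holdout_split(samples, train_fraction=0.7, seed=0):
    """Randomly split samples into calibration and verification sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Made-up hourly load values, for illustration only
hourly_load_kw = [1.2, 0.8, 0.5, 2.4, 3.1, 2.0, 1.6, 0.9, 2.8, 1.1]
normalized = min_max_normalize(hourly_load_kw)
calibration, verification = holdout_split(normalized)
print(len(calibration), len(verification))
```

Fixing the seed keeps the split reproducible, which matters when several candidate models are compared on the same verification set.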

Support Vector Regression
The idea of support vector machines (SVMs), which are a supervised learning technique, was first presented by Vapnik in 1995 [24], and they have recently been used effectively in various fields of engineering science such as hydrological modeling, where the first application was by [25]. SVMs are an effective learning technique based on the theory of bounded optimization that uses the structural risk minimization principle. SVMs are generally split into support vector classifiers (SVCs), which deal with classification problems, and SVR, which performs regression analysis. The function of SVR is expressed as

$$f(x) = w \cdot \varphi(x) + b \tag{3}$$

where $w$ represents the weight vector assigned to each feature, $\varphi$ is the transfer function, and $b$ is a bias. To find an appropriate SVR function $f(x)$, the regression problem can be stated as

$$\text{Minimize: } \frac{1}{2}\|w\|^{2} + C \sum_{i} \left(\xi_i + \xi_i^{*}\right) \tag{4}$$

$$\text{Subject to: } \begin{cases} y_i - \left(w \cdot \varphi(x_i) + b\right) \le \varepsilon + \xi_i \\ \left(w \cdot \varphi(x_i) + b\right) - y_i \le \varepsilon + \xi_i^{*} \\ \xi_i,\ \xi_i^{*} \ge 0 \end{cases} \tag{5}$$

where $y_i$ is the target value of sample $x_i$, $C$ is the penalty parameter, $\xi_i$ and $\xi_i^{*}$ are the slack variables, and $\varepsilon$ denotes the size of the insensitive tube. The solution of the non-linear regression function can be obtained by applying the Lagrangian functions as follows:

$$f(x) = \sum_{i} \left(\alpha_i - \alpha_i^{*}\right) K(x, x_i) + b \tag{6}$$

where $K(x, x_i)$ is the kernel function and $\alpha_i$ and $\alpha_i^{*}$ are the dual variables, with $\alpha_i, \alpha_i^{*} \ge 0$. There are many kernel functions, such as polynomial, sigmoid, linear, and radial basis functions (RBF). Among them, the most popular kernel in the literature is the RBF kernel [26]. Thus, the RBF kernel was adopted in this study. The RBF kernel used here is that defined by [26,27]:

$$K(x, x_i) = \exp\left(-\gamma \|x - x_i\|^{2}\right) \tag{7}$$
where γ is the kernel parameter. SVR performance is influenced by three parameters, C, γ, and ε. A detailed description of SVR and SVM is available in [28].
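To make the role of the kernel and of the three parameters concrete, the sketch below evaluates an already-trained SVR. It assumes the usual dual form f(x) = Σ (α_i − α_i*) K(x, x_i) + b with the Gaussian RBF kernel exp(−γ‖x − x_i‖²); the function names and argument layout are illustrative, not the authors' implementation, and no training (solving for the dual variables) is shown.

```python
import math

def rbf_kernel(x, x_i, gamma):
    """Gaussian RBF kernel exp(-gamma * ||x - x_i||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_i))
    return math.exp(-gamma * sq_dist)

def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i*) K(x, x_i) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i* for support vector i.
    """
    return bias + sum(c * rbf_kernel(x, sv, gamma)
                      for sv, c in zip(support_vectors, dual_coefs))

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Errors inside the epsilon tube cost nothing; outside they grow linearly."""
    return max(0.0, abs(y_true - y_pred) - epsilon)
```

A larger γ makes each support vector's influence more local, a larger ε widens the tube in which errors are ignored, and C (used only during training) sets how strongly errors outside the tube are penalized.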

Harris Hawks Optimization
HHO is a novel algorithm based on Harris hawks' hunting process [29]. In recent years, the HHO algorithm has been effectively applied for optimizing many intricate engineering problems [30]. Usually, hawks operate alone, except for Harris hawks, which work together to hunt and chase. The HHO algorithm mimics Harris hawks' cooperative behavior and hunting style in nature. Four movements are included in the hunting style of HHO: tracing, encircling, approaching, and attacking. This hunting style is based on three main steps: the exploration capability, the transition from exploration to exploitation, and, finally, the exploitation phase (Figure 2). The first stage is the exploration capability and is stated as

$$X(t+1) = \begin{cases} X_{rand}(t) - r_1 \left| X_{rand}(t) - 2 r_2 X(t) \right|, & q \ge 0.5 \\ \left( X_{rabbit}(t) - X_a(t) \right) - r_3 \left( LB + r_4 (UB - LB) \right), & q < 0.5 \end{cases} \tag{8}$$

$$X_a(t) = \frac{1}{N} \sum_{i=1}^{N} X_i(t) \tag{9}$$

where $X_a(t)$ is the average location of the Harris hawks, $N$ is the number of Harris hawks (search agent number), $X(t+1)$ is the location of the hawk in the following iteration $t+1$, $X(t)$ is the current position of the hawk and $X_i(t)$ is the position of the $i$th hawk at iteration $t$, $X_{rand}(t)$ is a randomly selected hawk, $X_{rabbit}(t)$ is the position of the rabbit (prey), $LB$ and $UB$ are the lower and upper bounds, respectively, and $q$, $r_1$, $r_2$, $r_3$, and $r_4$ are random values varying between 0 and 1.
The second stage is the transition from exploration to exploitation, when the hawk energy is reduced during hunting. The escape energy $E$ can be formulated as

$$E = 2 E_0 \left(1 - \frac{t}{T}\right) \tag{10}$$

where $E_0$ is the initial energy during each progression ($E_0 \in [-1, 1]$), $t$ is the current iteration, and $T$ is the maximum iteration size. In Equation (10), $E$ stands for the rabbit energy, and HHO determines the state of the rabbit (prey) from the variation trend of $E$. The third stage is the exploitation phase, which is mainly intended to improve local solutions based on the latest solutions available. In this stage, based on the prey escape and hawk hunting, the hawks suddenly attack the prey identified in the previous stage. The besiege type to catch the rabbit is chosen based on both E and X values; the hard one is taken when $|E| < 0.5$, and the soft one is taken when $|E| \ge 0.5$. The HHO algorithm has four strategies to mimic the stage of the attack: soft besiege, soft besiege with rapid progressive dives, hard besiege, and hard besiege with rapid progressive dives (Figure 2) [29,30].
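The exploration move and the escape energy can be sketched as below. The sketch assumes the standard HHO exploration rule (perch relative to a random hawk when q ≥ 0.5, otherwise relative to the rabbit and the mean hawk position) and the energy decay E = 2E₀(1 − t/T). Clamping the new position to the bounds is a common implementation detail added here as an assumption, and the exploitation (besiege) strategies are deliberately omitted.

```python
import random

def escape_energy(e0, t, max_iter):
    """Escape energy E = 2 * E0 * (1 - t / T); its magnitude shrinks as t grows."""
    return 2.0 * e0 * (1.0 - t / max_iter)

def exploration_step(hawks, index, rabbit, lower, upper, rng):
    """Return a new position for hawks[index] using the exploration rule."""
    dim = len(rabbit)
    q, r1, r2, r3, r4 = (rng.random() for _ in range(5))
    current = hawks[index]
    if q >= 0.5:
        # Perch based on a randomly selected hawk
        x_rand = rng.choice(hawks)
        new = [x_rand[d] - r1 * abs(x_rand[d] - 2.0 * r2 * current[d])
               for d in range(dim)]
    else:
        # Perch based on the rabbit and the average hawk position
        mean = [sum(h[d] for h in hawks) / len(hawks) for d in range(dim)]
        new = [(rabbit[d] - mean[d]) - r3 * (lower[d] + r4 * (upper[d] - lower[d]))
               for d in range(dim)]
    # Assumption: keep the hawk inside the search bounds
    return [min(max(v, lower[d]), upper[d]) for d, v in enumerate(new)]

rng = random.Random(1)
lower, upper = [-5.0, -5.0], [5.0, 5.0]
hawks = [[rng.uniform(lower[d], upper[d]) for d in range(2)] for _ in range(6)]
rabbit = hawks[0]
moved = [exploration_step(hawks, i, rabbit, lower, upper, rng) for i in range(6)]
```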

Particle Swarm Optimization
The PSO algorithm was introduced by Kennedy and Eberhart in 1995 [32]. It is a population-based search algorithm inspired by the social behavior and movement dynamics of animals. The initial intent of the particle swarm optimization concept was to graphically simulate animals' social behavior, such as bird flocking, to discover patterns that govern the ability of birds to fly synchronously, and to suddenly change direction with regrouping in an optimal formation. From this initial objective, the concept evolved into a simple and efficient optimization algorithm.
In PSO, the population is referred to as the 'swarm', and individuals are referred to as 'particles'. Using the bird-flock metaphor, the algorithm moves a population of individual members of the swarm through the search space to locate promising regions. Each member moves with a definite velocity within the search space and keeps track of the best position it has found. This information is communicated among all members within the swarm. The general principles of the PSO algorithm are explained below.
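Suppose that the search space is $n$-dimensional, and each particle has a position vector $X_i$ and a velocity vector $V_i$. The position vector and the velocity vector of the $i$th particle in the $n$-dimensional search space can be represented as $X_i = (x_{i1}, x_{i2}, \ldots, x_{in})$ and $V_i = (v_{i1}, v_{i2}, \ldots, v_{in})$, respectively. Each member has a memory of the best position in the search space where it has found itself ($Pbest_i$) and knows the best position found by all the members in the swarm ($Gbest$). At each step, the velocity and position of the $i$th particle are updated according to the following equations:

$$V_i^{k+1} = \omega V_i^{k} + c_1 r_1 \left( Pbest_i^{k} - X_i^{k} \right) + c_2 r_2 \left( Gbest^{k} - X_i^{k} \right) \tag{11}$$

$$X_i^{k+1} = X_i^{k} + V_i^{k+1} \tag{12}$$

where $V_i^{k+1}$ is the velocity of individual $i$ at iteration $k+1$, $V_i^{k}$ is the velocity of individual $i$ at iteration $k$, $\omega$ is the inertia weight parameter, $c_1$ and $c_2$ are the cognitive and social parameters, respectively, $r_1$ and $r_2$ are random numbers between 0 and 1, $X_i^{k}$ is the position of individual $i$ at iteration $k$, $Pbest_i^{k}$ is the best position of individual $i$ at iteration $k$, and $Gbest^{k}$ is the best position of the group until iteration $k$.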
The inertia weight parameter in Equation (11) is used to assess the impact of the preceding history of velocities on the succeeding ones. Therefore, ω shows a compromise between a wider and a local exploration ability of the swarm. In essence, a big inertia weight parameter encourages a wider exploration, while a small one encourages a local exploration. Experimental results suggest that it is preferable to initialize it to a large value, giving priority to the global exploration of the search space, gradually decreasing it in order to obtain a refined solution.
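A compact PSO loop following the velocity and position updates described above might look like this. It is a sketch under stated assumptions: the inertia weight decreases linearly from 0.9 to 0.4, c₁ = c₂ = 1.5, positions are clamped to the bounds, and the sphere function stands in for a real objective. None of these settings are taken from the paper.

```python
import random

def pso_minimize(objective, lower, upper, n_particles=20, n_iter=100,
                 c1=1.5, c2=1.5, w_start=0.9, w_end=0.4, seed=0):
    """Minimize objective over a box using a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(lower)
    pos = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for k in range(n_iter):
        # Large inertia first (global search), smaller later (local refinement)
        w = w_start - (w_start - w_end) * k / max(1, n_iter - 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Assumption: clamp the new position to the search bounds
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lower[d]), upper[d])
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

best, best_val = pso_minimize(lambda x: sum(v * v for v in x),
                              [-5.0, -5.0], [5.0, 5.0])
```

The same routine can serve both uses of PSO in this study in principle: tuning SVR parameters and searching over component counts, provided the objective and bounds are changed accordingly.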

Hybrid SVR Algorithms
To enhance the performance of SVR, the parameters of the SVR model must be carefully defined. The strength of the SVR model depends on the accurate selection of these three parameters, C, γ, and ε [33]. However, these parameters with a wide range make the search space very large, as well as making it hard to select accurate parameters. Hence, this problem can be categorized as an optimization problem and needs to be resolved by using optimization techniques. The combination of the SVR model with HHO and PSO, which are nature-inspired algorithms, resulted in the following hybrid models: SVR-HHO and SVR-PSO, respectively. The nature-inspired algorithms were used to select the SVR model parameters, C, γ, and ε. The flowchart of the hybrid SVR-HHO and SVR-PSO models developed in this study is illustrated in Figure 3.
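One way to expose the SVR parameters to HHO or PSO is to let each search agent live in the unit cube and decode its position into (C, γ, ε) on a logarithmic scale, so that wide ranges are searched evenly. The sketch below is illustrative only: the bounds are hypothetical, and `train_and_score` is a placeholder for whatever routine trains an SVR and returns a validation error.

```python
import math

# Hypothetical search ranges (not taken from the paper): (low, high) per parameter
DEFAULT_BOUNDS = {"C": (1e-1, 1e3), "gamma": (1e-4, 1e1), "epsilon": (1e-3, 1e0)}

def decode_svr_params(position, bounds=DEFAULT_BOUNDS):
    """Map a position in [0, 1]^3 to SVR parameters on a log10 scale."""
    params = {}
    for u, (name, (low, high)) in zip(position, bounds.items()):
        u = min(max(u, 0.0), 1.0)
        log_low, log_high = math.log10(low), math.log10(high)
        params[name] = 10.0 ** (log_low + u * (log_high - log_low))
    return params

def make_fitness(train_and_score, bounds=DEFAULT_BOUNDS):
    """Wrap a user-supplied scorer so an optimizer can call fitness(position)."""
    def fitness(position):
        p = decode_svr_params(position, bounds)
        return train_and_score(p["C"], p["gamma"], p["epsilon"])
    return fitness
```

With this wrapper, the metaheuristic only ever sees a vector of three numbers and a scalar error to minimize, which is what makes SVR-HHO and SVR-PSO interchangeable at the optimizer level.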
The data values were divided into two sets for the development of the prediction models (i.e., SVR-HHO and SVR-PSO): one set for model training, and another set for testing. A total of 70% of the data were used for training in the present research, while the remaining 30% were used for testing. To provide an unbiased estimation of model performance, model performance analysis was conducted using the testing dataset. At various states, the availability of data was different. Different periods were therefore used for the training and validation of the models at the various states. Tables 1 and 2 provide the data periods used for model training and testing at the various states.

Evaluation Criteria of the Models
Both proposed algorithms and the traditional SVR models were assessed using several performance evaluation criteria, including the correlation coefficient (R), coefficient of determination (R²), mean square error (MSE), root mean square error (RMSE), and mean absolute percentage error (MAPE). The R² indicates how closely the predicted values agree with the observed values; for the best network, the R² value should be high in both training and validation. Its value ranges between −∞ and 1, with higher values indicating better agreement. The R is a significant factor that can be used to assess the strength of the relationship between predicted and observed data points. The error measures (MSE, RMSE, and MAPE) are obtained by averaging the errors over the total number of observations. These indicators are widely used in science and engineering to assess predictive models because they provide significant information on how closely the simulated data points match the actual ones. Percentage errors (MAPE) have the advantage of being scale-independent, meaning they are frequently used to compare the forecast performance between different data series [34]. RMSE is a statistical parameter often used to compare the forecasting errors of several models. Lower RMSE values usually point to better predictions. The selection of the best predictive model is of great importance for achieving highly accurate predictions. Therefore, in this study, five statistical parameters were used to assess the performance of each predictive approach. These quantitative parameters are
where $N$, $o_i^{obs}$, $\bar{o}^{obs}$, and $o_i^{com}$ are the number of data points, the observed data, the average of the observed data, and the computed values, respectively.
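The five indicators can be computed as in the sketch below. It assumes the common textbook definitions: Pearson's correlation for R, R² = 1 − SSE/SST (which matches the stated range of −∞ to 1), and MAPE expressed as a fraction rather than a percentage. The paper's exact formulas may differ, in particular for R and for the MAPE scaling.

```python
import math

def evaluate(observed, computed):
    """Return R, R2, MSE, RMSE, and MAPE for paired observed/computed values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_com = sum(computed) / n
    sse = sum((o - c) ** 2 for o, c in zip(observed, computed))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    cov = sum((o - mean_obs) * (c - mean_com) for o, c in zip(observed, computed))
    var_com = sum((c - mean_com) ** 2 for c in computed)
    mse = sse / n
    return {
        "R": cov / math.sqrt(sst * var_com),   # Pearson correlation (assumed form)
        "R2": 1.0 - sse / sst,                 # can be negative for poor models
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        # Fractional MAPE; undefined if any observed value is zero
        "MAPE": sum(abs((o - c) / o) for o, c in zip(observed, computed)) / n,
    }

scores = evaluate([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```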

Sizing Formulation
The schematic description of the off-grid solar PV/wind/battery HRES is shown in Figure 4. The HRES consists of solar photovoltaic panels (PVs), wind turbines (WTs), and a battery bank to store excess energy and improve system reliability. In this system, when the power generated by the renewable energy sources ($P_{Gen}$) at time $t$ matches the load demand ($P_{Load}$), the total power generated is injected into the load through an inverter. When the power generated is higher than the load demand, excess energy will be stored in the battery bank. Conversely, when the load demand is higher than the power generated, the storage system will come on board to supply the difference. In a situation whereby the load demand exceeds the power generated by the renewable energy sources plus that supplied by the storage system, some fraction of the load must be shed, leading to a loss of load. A DC-coupled HRES configuration was considered in this study because it is easy to implement and more efficient and economic, and it eliminates the need for frequency and voltage control of individual sources connected to the bus [35].

Photovoltaic System
The output power of a PV panel ($p_{PV}$) in watts (W) at time $t$ can be obtained from the available solar radiation using the following equation:
where $Y_{PV}$ is the rated power of the PV panel (W), $I$ is the solar radiation (W/m²), $\alpha_P$ is the temperature coefficient (%/°C), and $T_C$ is the PV panel temperature (°C). If the number of PV panels is $N_{PV}$, the overall produced power is $P_{PV}(t) = N_{PV} \times p_{PV}(t)$.
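A PV output model built from the quantities defined above can be sketched as follows. The functional form is an assumption: rated power scaled by irradiance relative to 1000 W/m² and corrected linearly for panel temperature relative to 25 °C (standard test conditions). These reference values and the function names are not taken from the paper.

```python
def pv_panel_power(rated_w, irradiance_w_m2, temp_coeff_pct_per_c, cell_temp_c,
                   ref_irradiance=1000.0, ref_temp_c=25.0):
    """Output of one panel in W under the assumed linear temperature model."""
    # Temperature coefficient is given in %/degC, so convert to a fraction
    derate = 1.0 + (temp_coeff_pct_per_c / 100.0) * (cell_temp_c - ref_temp_c)
    return rated_w * (irradiance_w_m2 / ref_irradiance) * derate

def pv_array_power(n_panels, rated_w, irradiance_w_m2, temp_coeff_pct_per_c,
                   cell_temp_c):
    """Total output P_PV(t) = N_PV * p_PV(t)."""
    return n_panels * pv_panel_power(rated_w, irradiance_w_m2,
                                     temp_coeff_pct_per_c, cell_temp_c)
```

For example, a 260 W panel at 800 W/m² and 45 °C with a coefficient of −0.4 %/°C would deliver about 191 W under these assumptions.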

Wind Turbine
For a wind turbine, wind speed is the most important factor. When the wind speed equals or exceeds the cut-in value, the wind turbine generator will start generating power. If the wind speed exceeds the cut-out value, the wind turbine generator stops running to protect the generator. The produced power of a wind turbine ($p_{WT}$) at time $t$ is obtained as follows:
where $v(t)$ is the wind speed at time $t$, $P_r$ is the rated power of the wind turbine, and $V_{ci}$, $V_{co}$, and $V_r$ are the cut-in, cut-out, and rated wind speed of the turbine, respectively. If the number of wind turbines is $N_{WT}$, the overall produced power is $P_{WT}(t) = N_{WT} \times p_{WT}(t)$.
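The cut-in/cut-out behaviour described above corresponds to a piecewise power curve. The sketch below assumes a linear ramp between the cut-in and rated speeds; other formulations use a cubic ramp, so the shape of that middle segment is an assumption rather than the paper's stated model.

```python
def wind_turbine_power(v, rated_w, v_cut_in, v_rated, v_cut_out):
    """Power of one turbine in W for wind speed v (m/s)."""
    if v < v_cut_in or v > v_cut_out:
        return 0.0          # below cut-in or shut down above cut-out
    if v < v_rated:
        # Assumed linear ramp from 0 at cut-in to rated power at rated speed
        return rated_w * (v - v_cut_in) / (v_rated - v_cut_in)
    return rated_w          # rated output between rated and cut-out speed

def wind_farm_power(n_turbines, v, rated_w, v_cut_in, v_rated, v_cut_out):
    """Total output P_WT(t) = N_WT * p_WT(t)."""
    return n_turbines * wind_turbine_power(v, rated_w, v_cut_in, v_rated, v_cut_out)
```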

Battery
The purpose of using a battery is for energy storage, and it is utilized to cover the difference between supply (from PV and wind turbines) and load demand. The input power of the battery can be negative or positive due to the charging and discharging process. The state of charge (SOC) of a battery with the corresponding productivity and time consumption is obtained as follows: When $P_{PV}(t) + P_{WT}(t) = P_{Load}(t)/n_{Inv}$, the capacity of the battery is stable and will not change. When the total output power of PV panels and wind turbines is greater than the load demand, $P_{PV}(t) + P_{WT}(t) > P_{Load}(t)/n_{Inv}$, the battery bank is in a charging state. The charge quantity of the battery at time $t$ can be obtained as
where $SOC(t)$ and $SOC(t-1)$ are the charge quantities of the battery bank at time $t$ and $t-1$, $\sigma$ is the hourly self-discharge rate, $n_{Inv}$ is the inverter efficiency, $P_{Load}(t)$ is the load demand for a particular hour, and $n_{BC}$ is the charging efficiency of the battery bank. When the total output power of PV panels and wind turbines is less than the load demand, $P_{PV}(t) + P_{WT}(t) < P_{Load}(t)/n_{Inv}$, the battery bank is in a discharging state. The charge quantity of the battery at time $t$ can be obtained as
where $n_{BD}$ is the discharging efficiency of the battery bank.
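The charging and discharging rules can be sketched as a single hourly update. The form below is an assumption consistent with the symbols defined above: self-discharge is applied to the previous state, a surplus is stored through the charging efficiency, and a deficit is drawn divided by the discharging efficiency. Whether the paper multiplies or divides by the discharging efficiency is not recoverable here, and clamping to the SOC limits is an added implementation detail.

```python
def update_soc(soc_prev, p_pv, p_wt, p_load, sigma, eta_inv, eta_bc, eta_bd,
               soc_min, soc_max):
    """One-hour state-of-charge update (energy in Wh for a 1 h step)."""
    # Net DC balance: generation minus the load referred to the DC side
    surplus = (p_pv + p_wt) - p_load / eta_inv
    soc = soc_prev * (1.0 - sigma)
    if surplus >= 0.0:
        soc += surplus * eta_bc          # charging
    else:
        soc -= (-surplus) / eta_bd       # discharging (assumed division)
    # Assumption: the battery cannot leave its allowed SOC window
    return min(max(soc, soc_min), soc_max)
```

With zero self-discharge and a zero surplus the state is unchanged, which matches the "stable" case described in the text.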

Converters/Inverters
Two DC/DC converters, one AC/DC converter, and one DC/AC inverter were used for the power conversion in the proposed HRES. The efficiency of all these power conversion devices was set at 95%. The inverter's losses are a function of its efficiency, which is used to relate the DC input to the AC load output as follows:

$$P_{Inv-Load}(t) = \left( P_{Gen-Inv}(t) + P_{Bat-Inv}(t) \right) \times n_{Inv} \tag{22}$$

where $P_{Inv-Load}(t)$, $P_{Gen-Inv}(t)$, and $P_{Bat-Inv}(t)$ are the total power injected into the load (AC), the power transferred from the renewable energy sources (DC), and the power transferred from the battery (DC) at time $t$, respectively.

Reliability
Power reliability analysis plays an important role in the hybrid energy system design process because of the intermittent generation characteristics of PV and WT. When the power supply by the HRES is less than the demand, this places the hybrid energy system into a loss of power supply (LPS) scenario, expressed as
The available power from the HRES at a time, $P_{avail}(t)$, can be obtained by Equation (24).
There are many techniques for carrying out reliability analysis, the most prominent among them being the loss of power supply probability (LPSP) technique, defined by Equation (25).

$$LPSP = \frac{\sum_{t=1}^{T} LPS(t)}{\sum_{t=1}^{T} P_{Load}(t)} \tag{25}$$
where T is the number of hours in the study. An LPSP of 0 means that the load demand will always be satisfied, and an LPSP of 1 means that the load demand will never be satisfied.
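The LPSP can be computed from hourly series as in the sketch below. It assumes the common definition in which the unmet load in each hour, max(0, P_Load − P_avail), is summed and divided by the total load; how the available power is assembled from generation and storage is left to the caller.

```python
def lpsp(load, available):
    """Loss of power supply probability over paired hourly series."""
    if len(load) != len(available):
        raise ValueError("load and available must have the same length")
    # Unmet demand in each hour (assumed definition of LPS(t))
    lost = sum(max(0.0, l - a) for l, a in zip(load, available))
    return lost / sum(load)

example = lpsp([2.0, 2.0, 2.0, 2.0], [2.0, 3.0, 1.0, 0.0])
```

In the example, 3 of the 8 units of demand are not served, so the result is 0.375.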

Objective Function
The objective function of this optimal sizing problem is to minimize the total annual cost ($C_{TA}$) of the system. $C_{TA}$ consists of the annual capital cost ($C_{AC}$) and the annual maintenance cost ($C_{AM}$). To optimally size the off-grid hybrid energy system, the optimization problem defined by Equation (26) must be solved by using an optimization technique.

$$\text{Minimize } C_{TA} = C_{AC} + C_{AM} \tag{26}$$

The capital cost occurs at the beginning of a project, while the maintenance cost occurs during the project life.
In order to convert the initial capital cost to the annual capital cost, the capital recovery factor (CRF), defined by Equation (27), was used. The capital recovery factor is a ratio used to convert a present value into an annuity (a series of equal annual cash flows).

$$CRF = \frac{i (1 + i)^{n}}{(1 + i)^{n} - 1} \tag{27}$$
where (i) is the interest rate and (n) denotes the life span of the system in years.
The lifetime of the proposed hybrid renewable energy system project was considered to be 20 years. During the project's lifetime, some components of the PV/wind/battery system need to be replaced. The lifetime of a battery was assumed to be 5 years. By using the single payment present worth factor, we have

$$C_{Bat} = P_{Bat} \times \left( 1 + \frac{1}{(1 + i)^{5}} + \frac{1}{(1 + i)^{10}} + \frac{1}{(1 + i)^{15}} \right) \tag{28}$$

where $C_{Bat}$ is the present worth of the battery and $P_{Bat}$ is the battery price. Similarly, the lifetime of the converter/inverter was assumed to be 10 years. By using the single payment present worth factor, we have

$$C_{Conv/Inv} = P_{Conv/Inv} \times \left( 1 + \frac{1}{(1 + i)^{10}} \right) \tag{29}$$

where $C_{Conv/Inv}$ is the present worth of the converter/inverter and $P_{Conv/Inv}$ is the price of the converter/inverter. For this system, the total annual capital and maintenance costs were obtained by Equations (30) and (31), respectively.

$$C_{AC} = CRF \times \left[ N_{PV} C_{PV} + N_{WT} C_{WT} + N_{Bat} C_{Bat} + N_{Conv/Inv} C_{Conv/Inv} \right] \tag{30}$$

where $N_{PV}$ is the number of PV panels, $C_{PV}$ is the unit cost of a PV panel, $N_{WT}$ is the number of wind turbines, $C_{WT}$ is the unit cost of a wind turbine, $N_{Bat}$ is the number of batteries, $C_{Bat}$ is the present worth of a battery, $N_{Conv/Inv}$ is the number of converters/inverters, and $C_{Conv/Inv}$ is the present worth of converters/inverters.

$$C_{AM} = N_{PV} C_{PV-Mtn} + N_{WT} C_{WT-Mtn} \tag{31}$$

where $C_{PV-Mtn}$ and $C_{WT-Mtn}$ are the annual maintenance costs of a PV panel and wind turbine, respectively. The maintenance costs of batteries and converter/inverter systems were ignored.
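The cost model can be sketched end to end as follows. It assumes the standard capital recovery factor, replacement costs discounted with the single payment present worth factor at each replacement year (years 0, 5, 10, 15 for a 5-year battery in a 20-year project), and maintenance costs only for PV panels and wind turbines. All argument names are illustrative, and prices are whatever currency the caller supplies.

```python
def capital_recovery_factor(i, n):
    """CRF = i (1 + i)^n / ((1 + i)^n - 1) for interest rate i and n years."""
    growth = (1.0 + i) ** n
    return i * growth / (growth - 1.0)

def present_worth_with_replacements(price, i, component_life, project_life):
    """Present worth of buying a component now and at each replacement year."""
    years = range(0, project_life, component_life)
    return price * sum(1.0 / (1.0 + i) ** y for y in years)

def total_annual_cost(n_pv, n_wt, n_bat, n_conv, c_pv, c_wt, p_bat, p_conv,
                      m_pv, m_wt, i, project_life=20, bat_life=5, conv_life=10):
    """Annual capital cost plus annual maintenance cost of the system."""
    c_bat = present_worth_with_replacements(p_bat, i, bat_life, project_life)
    c_conv = present_worth_with_replacements(p_conv, i, conv_life, project_life)
    capital = n_pv * c_pv + n_wt * c_wt + n_bat * c_bat + n_conv * c_conv
    annual_capital = capital_recovery_factor(i, project_life) * capital
    annual_maintenance = n_pv * m_pv + n_wt * m_wt
    return annual_capital + annual_maintenance
```

This function is the natural objective for the sizing optimizer: the decision variables are the component counts, and everything else is fixed input data.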

Constraints
For the hybrid PV/wind/battery-based system, the following constraints should be satisfied:

$$0 \le N_{PV} \le N_{PV-Max}, \qquad 0 \le N_{WT} \le N_{WT-Max}, \qquad 0 \le N_{Bat} \le N_{Bat-Max}$$

where $N_{PV-Max}$, $N_{WT-Max}$, and $N_{Bat-Max}$ are the maximum number of PV panels, wind turbines, and batteries, respectively. At any time, the charge quantity of the battery bank should satisfy the constraint $SOC(t_{min}) \le SOC(t) \le SOC(t_{max})$.
The maximum charge quantity of the battery bank $SOC(t_{max})$ takes the value of the nominal capacity of the battery bank ($S_{Bat}$), and the minimum charge quantity of the battery bank $SOC(t_{min})$ is obtained by the maximum depth of discharge (DOD).
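A feasibility check for a candidate design can be sketched as below. It assumes integer component counts bounded between zero and their maxima, and the usual relation SOC_min = (1 − DOD) × S_Bat for the lower charge limit; both the zero lower bound and that relation are assumptions, since the text states the limits only in words.

```python
def soc_limits(nominal_capacity, depth_of_discharge):
    """Return (SOC_min, SOC_max) from the nominal capacity and maximum DOD."""
    return (1.0 - depth_of_discharge) * nominal_capacity, nominal_capacity

def is_feasible(n_pv, n_wt, n_bat, n_pv_max, n_wt_max, n_bat_max):
    """True when every component count lies within its allowed range."""
    return (0 <= n_pv <= n_pv_max
            and 0 <= n_wt <= n_wt_max
            and 0 <= n_bat <= n_bat_max)
```

An optimizer would typically reject or penalize any particle for which `is_feasible` returns False before evaluating its cost.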

Results and Discussion
The main objective of this study was to develop hybrid SVR algorithms (SVR-HHO and SVR-PSO) and compare their effectiveness in predicting the load demand variability of remote areas located in Kano, Abuja, Niger, and Lagos States of Nigeria (a multi-state approach). To this end, in each state, the load profile of a remote block of flats consisting of 10 apartments was obtained through a survey. The number of electrical appliances, their ratings, and their time of usage were considered for all the apartments. The hourly load data were computed by physically monitoring and recording the hourly variation in the unrestricted load over a 24-h window. The bar plot of the raw data for the four remote locations is presented in Figure 5. This section presents the results in both quantitative and visualized forms. Prior to the model simulation, the data were preprocessed using various methods, including normalization and reliability analysis. For computational analysis of time series data, understanding the effect of individual inputs is crucial in determining the robustness of the predictive models. Hence, stationarity and reliability analyses were conducted for all the study areas using the unit root test (i.e., the augmented Dickey-Fuller (ADF) test) and the Cronbach's alpha method, respectively. It should be noted that, for any time series, the preliminary analysis of each individual parameter or input is important because it can contribute to the performance of the models. As reported in [36,37], the internal consistency of a parameter can have a positive impact if the Cronbach's alpha value exceeds the threshold of 0.7. According to [38], the ADF test is paramount for obtaining reliable and valid outcomes that safeguard the stationarity of all the parameters. The experimental data used in the present work satisfied the aforementioned criteria.
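The reliability check mentioned above can be illustrated with the standard Cronbach's alpha formula, α = k/(k − 1) × (1 − Σ var(item) / var(total)), using sample variances. The data layout (one list per variable) and the example values are hypothetical; the ADF stationarity test is not sketched here because it requires a regression-based procedure.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for k variables, each a list with one value per observation."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two variables are required")
    # Total score per observation across all variables
    totals = [sum(row) for row in zip(*items)]
    item_var = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1.0 - item_var / variance(totals))

# Made-up example: two variables observed five times
alpha = cronbach_alpha([[1, 2, 3, 4, 5], [2, 1, 4, 3, 5]])
print(alpha > 0.7)
```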
Although selecting the input variable combination using linear sensitivity analysis is effective, many studies such as [39][40][41][42] criticized the conventional use of Pearson correlation as the input selection approach and recommended more complex non-linear techniques such as non-linear kernels, GA, and kernel-PCA. In this study, both linear and non-linear sensitivity analyses were tested and verified, and good agreement was observed between the two approaches. The determination of model combinations using linear and non-linear approaches has been reported in several studies [43][44][45][46]. The correlation matrix of the four states (Figure 6) shows that the strength of correlation with the target follows the order ambient temperature (°C) > wind speed (m/s) > solar radiation (W/m²) for Kano, Abuja, and Niger. The order for Lagos is wind speed (m/s) > ambient temperature (°C) > solar radiation (W/m²). A strong correlation for temperature was observed for Lagos, Kano, Niger, and Abuja, with values of 0.72, 0.71, 0.67, and 0.61, respectively. Similarly, a good to marginal correlation was attained by wind speed, with values of 0.53, 0.51, 0.39, and 0.83 for Kano, Abuja, Niger, and Lagos, respectively. Solar radiation, however, showed only a marginal to fair correlation, with values of 0.37, 0.42, 0.28, and 0.29 for Kano, Abuja, Niger, and Lagos, respectively.
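The linear sensitivity analysis described above amounts to ranking each candidate input by its Pearson correlation with the load. A minimal sketch follows; the function names are hypothetical and any data passed in would be the reader's own, since the raw series are not reproduced here.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank_inputs(
    inputs: Dict[str, Sequence[float]], target: Sequence[float]
) -> List[Tuple[str, float]]:
    """Sort inputs by the absolute value of their correlation with the target."""
    scores = {name: pearson_r(values, target) for name, values in inputs.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
```

Applied to the reported coefficients for Kano (0.71, 0.53, and 0.37), such a ranking would reproduce the order temperature > wind speed > solar radiation.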

Results of Machine Learning and Metaheuristic Algorithms
The simulation was conducted in MATLAB 9.3 (R2020a). The best architecture of the SVR model was selected by trial and error. According to suggestions in the literature [47][48][49][50], a qualified model is one that meets the requirements of most statistical evaluation criteria. The models were evaluated using the most widely used performance criteria, namely R², MSE, RMSE, R, and MAPE, in both calibration and verification. Based on the model combinations (Equation (2)), the quantitative results are presented in Table 1, which also displays the multi-state performance of the models for the different input variables in both the calibration and verification phases. The results show that the modeling approaches achieved different levels of accuracy across the evaluation criteria. Overall, the multi-state results demonstrate that SVR-M3 was the best simulation in terms of R², MSE, RMSE, R, and MAPE. Although it is difficult to rank the models by the achieved accuracies, SVR-M3 showed the best relative prediction accuracy, with Kano and Niger attaining a goodness of fit of more than 90% and Abuja more than 79%. The three model combinations are compared for the four states in the scatter plots of Figure 7, which show the level of agreement between the observed and predicted loads. The scatter plots indicate that the SVR-M3 model is more accurate than the SVR-M1 and SVR-M2 models.
The prediction of load demand can be analyzed further using the radar diagram in Figure 8. The models rank as SVR-M3 > SVR-M2 > SVR-M1 for Kano (R² > 0.9), Abuja, and Lagos (R² > 0.73), and as SVR-M3 > SVR-M1 > SVR-M2 for Niger (R² > 0.97). The promising ability of SVR-M3 could be attributed to the fact that all three variables (temperature, wind speed, and solar radiation) were used in the prediction, so this combination includes all the parameters that have an effective impact on load demand prediction. Similarly, the numerical comparison across the states with regard to MSE indicated that SVR-M3 (MSE = 0.0052) is superior to SVR-M1 (MSE = 0.0589) and SVR-M2 (MSE = 0.0384). The predictive error of the best model was reduced by 2% on average for Kano. For Abuja, the error of SVR-M3 was reduced by 6% on average relative to both SVR-M1 and SVR-M2, and, lastly, the predictive error of the best model decreased by 14% for Niger and 26% for Lagos.
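For readers who wish to reproduce this kind of comparison, the five indicators can be computed as in the sketch below. The textbook definitions are assumed (R² as 1 - SSE/SST, MAPE expressed in percent); the paper's own definitions are given in its methodology and may differ in scaling, and the function name is hypothetical.

```python
from math import sqrt
from typing import Dict, Sequence


def evaluate(observed: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Return R, R2, MSE, RMSE, and MAPE for one set of predictions.

    Assumes observed values are non-zero (needed for MAPE) and not all equal.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_o) ** 2 for o in observed)
    spp = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    mse = sse / n
    return {
        "R": cov / sqrt(sst * spp),   # correlation coefficient
        "R2": 1.0 - sse / sst,        # assumed definition: 1 - SSE/SST
        "MSE": mse,
        "RMSE": sqrt(mse),
        # assumed scaling: mean absolute percentage error in percent
        "MAPE": 100.0 / n * sum(abs((o - p) / o) for o, p in zip(observed, predicted)),
    }
```

Calling `evaluate` once on the calibration set and once on the verification set gives the two columns of figures needed to check that a model performs consistently in both phases.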
However, the overall accuracy for Abuja was unsatisfactory and showed high uncertainty with regard to errors, even though the non-linear relationship between the predictors and their corresponding targets was taken into account. This could be improved by introducing optimization algorithms (PSO and HHO). It should be noted that these promising estimates were obtained during the calibration phase, which is primarily employed to calibrate the models based on known input variables and targets.
However, the testing step is vital in assessing the performance of a model since it examines the model's accuracy on unseen target values, which the training set cannot provide. Therefore, a reliable model should have a stable and balanced performance in both the calibration and verification phases. The optimization algorithms generally showed promising ability compared with the single model. This is not surprising, as it has been reported in many studies [51][52][53][54][55]. As with the single model, the performance of the metaheuristic algorithms was evaluated using R², MSE, RMSE, R, and MAPE for consistency. Table 2 presents the results of SVR-PSO and SVR-HHO for both the training and verification phases. From Table 2, SVR-HHO has the highest R² values of 0.9951, 0.8963, 0.9951, and 0.9313, the lowest MSE values of 0.0002, 0.0070, 0.0002, and 0.0080, and the lowest MAPE values of 0.1311, 0.1452, 0.0599, and 0.1817, respectively, for Kano, Abuja, Niger, and Lagos. The results show that SVR-HHO outperformed SVR-PSO for the simulation of load demand in all four states. The better predictive ability of HHO is consistent with the findings of [29,31,56]. The SVR-PSO and SVR-HHO models can also be examined through the boxplots in Figure 9, in which the model whose distribution most closely mimics the pattern of the observed values is selected as the best performing model based on the box-and-whisker criteria. The degree of agreement and the trend patterns between the measured and computed values give the ranking SVR-HHO > SVR-PSO > SVR-M3 among all the models considered. Considering the other performance indices as well, the simulated outcomes confirm the superiority of both metaheuristic algorithms, despite the predictive skill shown by SVR-M3.
Figure 10 shows the scatter plots between the observed and predicted values of SVR-PSO and SVR-HHO for Kano, Abuja, Niger, and Lagos. Closer agreement between the observed and predicted values was attained by SVR-HHO than by SVR-PSO. Moreover, the R values of all the models were found to be greater than 0.9. Since R values higher than 0.70 are considered acceptable [57][58][59], the results of all the optimization algorithm models are acceptable (see Table 2).
An overall comparison between the best traditional SVR model and the two hybrid SVR algorithms (SVR-HHO and SVR-PSO) is provided using a two-dimensional Taylor diagram, as shown in Figure 11. The Taylor diagram summarizes several statistical indices, namely the correlation coefficient (R), RMSE, and standard deviation, between the observed and predicted values [60]. To the best of the authors' knowledge, this research is the first to use this diagram in forecasting load demand. The diagram highlights the goodness of fit of various models in comparison with one another, with each model appearing as a point on a polar plot. A detailed explanation and discussion of the Taylor diagram can be found in [61]. From Figure 11, it can be observed that SVR-HHO achieved better goodness of fit in all four states, with R = 0.9975, 0.9467, 0.9976, and 0.9650 for Kano, Abuja, Niger, and Lagos, respectively, in the verification phase. The results lead to the conclusion that, for both calibration and verification, SVR-HHO, SVR-PSO, and SVR-M3 are capable of capturing the complex non-linear patterns between the load demand variables. The simulated results can be better visualized using the time series graphs presented in Figure 12. From these figures, the optimal predicted load is attributed to the SVR-HHO algorithm in all four states, although SVR-PSO is also reliable.
Despite the availability of a large number of metaheuristic algorithms, none of them can assure consistently optimal performance across different kinds of problems [62,63]. However, recent research on the novel population-based, nature-inspired HHO algorithm showed better effectiveness in discovering optimal solutions for higher- and multi-dimensional problems [64][65][66][67]. According to [68,69], the HHO optimizer has been checked, through comparison with other nature-inspired techniques, on 29 benchmark problems and several real-world engineering problems. The statistical results and comparisons show that the HHO algorithm provides very promising and occasionally competitive results compared with well-established metaheuristic techniques such as GA, BBO, GWO, BAT, and FPA.
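The Taylor diagram used for Figure 11 can place three statistics on one polar plot because they are linked by the identity E'² = σ_o² + σ_p² - 2 σ_o σ_p R, where E' is the centered RMS difference (the RMSE after removing the mean bias). The sketch below computes the four quantities and lets a reader check the identity numerically; the function name is hypothetical and population (divide-by-n) standard deviations are assumed.

```python
from math import sqrt
from typing import Sequence, Tuple


def taylor_statistics(
    observed: Sequence[float], predicted: Sequence[float]
) -> Tuple[float, float, float, float]:
    """Return (sigma_obs, sigma_pred, R, centered_rmsd) for one model."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Population standard deviations of each series.
    sigma_o = sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    sigma_p = sqrt(sum((p - mean_p) ** 2 for p in predicted) / n)
    # Correlation coefficient between the two series.
    r = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / (
        n * sigma_o * sigma_p
    )
    # RMS difference after subtracting each series' own mean.
    centered_rmsd = sqrt(
        sum(((p - mean_p) - (o - mean_o)) ** 2 for o, p in zip(observed, predicted)) / n
    )
    return sigma_o, sigma_p, r, centered_rmsd
```

On the diagram, the radial coordinate of a model is its standard deviation, the angle is arccos(R), and the distance to the reference point on the horizontal axis is the centered RMS difference.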

Results and Cost Analysis for Optimal Sizing
To find the optimal sizing, the stand-alone PV/wind/battery HRES was modeled and run in the MATLAB environment. As inputs, the model used solar irradiance, temperature, wind speed, and the load demand forecasted by the novel SVR-HHO, and PSO and GA were used as optimizers. The component parameters are presented in Table 3. The parameter settings of PSO and GA were as follows: PSO: N = 50, c1 = 2, c2 = 2, ω = 0.9, iter_max = 100; GA: N = 50, m = 0.25, c = 0.7, iter_max = 100. PSO and GA search for the optimal values of the decision variables, namely the numbers of system components. The minimum and maximum values of the decision variables were set to 0 and 10,000, respectively. Initially, the charge quantity of each battery was assumed to be 30% of its nominal capacity.
The optimal results obtained by PSO and GA for Kano, Abuja, Niger, and Lagos are shown in Tables 4-7. Table 4 summarizes the results of optimal sizing for Kano, indicating the optimal sizes of the PV/wind/battery, PV/battery, and wind/battery systems. From an economic point of view, the wind/battery system is the best system for meeting the load demand requirements of Kano. Its total annual cost is USD 11,448.63, with an optimal configuration of 40 wind turbines and 42 batteries.
Table 5 shows the optimal values of the decision variables and the total annual costs of the systems for Abuja. The best system for supplying the load demand is the PV/battery system, with a cost of USD 11,294.83. The optimal sizes of the system components are 134 PV panels and 36 batteries.
Table 6 presents the optimal number of each component and the total annual costs for the hybrid system in Niger. The most cost-effective system is the PV/battery system, with the lowest cost of USD 6621.80. The optimal system consists of 71 PV panels and 26 batteries. Simulation results reveal that it is not economically viable to have a wind/battery system in Niger due to poor wind speeds.
Finally, the results of optimal sizing for Lagos are presented in Table 7. In Lagos, the PV/wind/battery hybrid system is the best system for meeting the load demand requirements. The total annual cost of the hybrid system is USD 16,552.4, and the optimal configuration is 56 PV panels, 42 wind turbines, and 57 batteries. From Tables 4-7, it can be observed that PSO produces more promising results than GA in all four states.
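The PSO settings quoted above (N = 50, c1 = c2 = 2, ω = 0.9, iter_max = 100, bounds 0 to 10,000) can be exercised with the generic global-best PSO sketched below. This is not the authors' MATLAB code: the cost function is a stand-in supplied by the caller, the velocity clamp `v_max` is an added assumption to keep particles stable with these coefficients, and the function and argument names are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso(
    cost: Callable[[Sequence[float]], float],
    lower: float,
    upper: float,
    dim: int,
    n_particles: int = 50,
    c1: float = 2.0,
    c2: float = 2.0,
    w: float = 0.9,
    max_iter: int = 100,
    seed: int = 0,
) -> Tuple[List[float], float, List[float]]:
    """Minimize `cost` over [lower, upper]^dim; return (best, best_cost, history)."""
    rng = random.Random(seed)
    v_max = 0.2 * (upper - lower)  # assumption: velocity clamp, not stated in the text
    pos = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]
    for _ in range(max_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (
                    w * vel[i][d]
                    + c1 * r1 * (pbest[i][d] - pos[i][d])
                    + c2 * r2 * (gbest[d] - pos[i][d])
                )
                vel[i][d] = max(-v_max, min(v_max, v))
                # Keep each decision variable inside its bounds.
                pos[i][d] = max(lower, min(upper, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
        history.append(gbest_cost)
    return gbest, gbest_cost, history
```

For an actual sizing run, `cost` would return the total annual cost of a candidate configuration plus a penalty when the load or battery constraints are violated, and the continuous positions would be rounded to whole numbers of PV panels, wind turbines, and batteries before evaluation.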

Conclusions
A methodology for multi-state load demand forecasting and optimal sizing of a standalone PV/wind/battery hybrid energy system was presented in this paper. A prediction approach based on SVR was used to predict the load demand variability in four different states in Nigeria. Given the limitations of traditional SVR, two hybrid SVR algorithms (SVR-HHO and SVR-PSO) were developed and their ability to predict the load demand was compared. Three input combinations, M1, M2, and M3, were defined for the traditional SVR model. SVR-M1, SVR-M2, and SVR-M3 were compared based on a sensitivity analysis of the input attributes. SVR-M3 was found to outperform SVR-M1 and SVR-M2 in terms of prediction accuracy. For the hybrid SVR algorithms, there was an increase in performance for both SVR-HHO and SVR-PSO over SVR-M3. SVR-HHO had the highest R² values and the lowest MSE and MAPE values in all four states. The results show that both SVR-HHO and SVR-PSO are capable of predicting multi-state load demands.
For optimal sizing, the PSO algorithm was used, based on the load demands predicted by SVR-HHO in all four states, to determine the best sizes and combinations of three generation systems (PV/wind/battery, PV/battery, and wind/battery systems), and the results were compared with those found by GA. The simulation results reveal that, economically, the wind/battery system is the best system for meeting the load requirements of Kano. For Abuja and Niger, the best system for supplying the load demand is the PV/battery system, while the results of optimal sizing for Lagos show that the PV/wind/battery hybrid system is the best system for meeting the load demand requirements.