1. Introduction
The production of electricity using traditional methods of processing fossil and synthetic fuels contributes to high levels of pollution of the Earth’s atmosphere and causes a noticeable increase in the incidence of civilizational diseases such as cancer [1]. Currently, to reduce air pollution and CO2 emissions, renewable energy sources are used to generate electricity. However, because these power sources depend on the weather and are therefore unstable, smart control systems are needed to manage power generation and ensure satisfactory quality, especially in off-grid systems. The ongoing challenge is to design and apply a smart grid that can operate an off-grid power system under weather fluctuations in real time. Many different types of forecasting models have been developed. For example, in [2], a fuzzy model was proposed using particle swarm optimization (PSO) to forecast solar and wind power generation. Another study applied a novel Bayesian ensembling model for wind power prediction [3]. In [4], an ANN was used to forecast wind speed. In [5], an optimized deep learning-based long short-term memory (LSTM) model was applied for power generation and consumption forecasting. The advantages of the model proposed in this study are outlined as follows: (1) The model uses the back steps of each target output, and (2) the dataset was published for free. Moreover, the authors of [6] used an unconventional strategy for estimating sustainable energy output using a Neural Network (NN) and Convolutional Neural Network (CNN) to build a framework that can precisely predict offshore wind and PV energy production in the short term. In this respect, a power quality parameter forecasting process plays a vital role in smart grids, as it helps to generate power within satisfactory limits from renewable energy sources. Among the numerous PQPs that define the quality of the power, the most important are the power frequency, total harmonic distortion of voltage (THDu) [7], total harmonic distortion of current (THDi), short-term flicker severity (Pst), and amplitude of the power voltage (U).
In a small-scale off-grid system, the power quality parameters (PQPs) should be forecasted to determine their future values according to weather conditions and home appliance operations. Once PQPs are predicted successfully, the next task is to reschedule the run time of the home appliances to an appropriate time, which should ensure a balance between generating power from renewable sources and power demand. The forecasted PQPs need to be constantly compared to standard values, not only to reschedule the load to meet the availability of the generated power but also to avoid damage to users’ devices. For example, Zjavka applied and compared three methods for PQP prediction in [8] and combined machine learning and regression models in [9]. Vantuch et al. used a random decision forest optimized by multi-objective optimization [10]. Stuchly et al. tested an ANN with a backpropagation learning algorithm [11]. Jahan et al. investigated a standard regression tree [12,13], linear regression, interaction linear regression, an ANN, quadratic linear regression, pure quadratic linear regression, bagging DT, and boosting DT [14]. In Indonesia [15], researchers applied Random Forest (RF) and a Poly-Exponential (PE) model to forecast power load and power quality. The results showed that the model requires fewer samples and yields a precise prediction. Quantile Regression (QR) models were applied with Principal Component Analysis (PCA) for forecasting of the power quality index level [16]. PCA was used for dimension reduction, and the numerical results confirmed that the model is suitable for PQ forecasting for both comprehensive indices and individual components. A forecasting system to predict voltage deviation was investigated in [17]. The proposed approach combines the following techniques: PCA to reduce the input data dimension, affinity propagation to cluster the input data, and a BP neural network to estimate the voltage deviation. The model achieved good forecasting results compared with others.
As a practical application, the authors of [18] used data from a small number of smart meters to predict the voltage total harmonic distortion (THDu) for low-voltage busbars of residential distribution feeders. The technique gives the system operators access to pertinent power quality indicators by using the existing monitoring infrastructure. In that work, several voltage total harmonic distortion forecasting techniques, including artificial neural networks, were evaluated. In [19], advanced Fuzzy Time Series (FTS) were applied for prediction of power quality events.
The experiment focused on interruptions and voltage sag forecasting. The study showed that the FTS algorithm fits power events, especially in the case of the non-ferrous metal industry. Research reported in [20] used a decision tree to predict the following power quality disturbances: earth faults, rapid voltage changes, and voltage dips. Another study [21] presented a real-time classification system for detection of power quality events, such as transient, sag, swell, interruption, or flicker events. The proposed identification and classification of power quality disruptions were based on machine learning and a hybrid deep learning approach. The authors compared the results of the XGBR, CatBoostR, LGBMR, and LSTM models with those of the proposed Boosting CNN SOS. Yet another article [22] presented a more in-depth analysis of classification compared to the previously discussed publication. Here, the authors proposed a method for classifying power quality disturbance signals based on Segmented and Modified S-Transform (SMST), Multiclass Support Vector Machine (MSVM), and Deep Convolutional Neural Network (DCNN) models. This method uses frequency segmentation with various adjustable parameters as a function of a Gaussian window. The developed method made it possible to achieve accurate and effective extraction of features of various disturbances. The results show that the proposed method outperforms several state-of-the-art algorithms in classifying power quality disturbances at different noise levels. The authors of [23] proposed two different approaches for supervising selected electrical disturbances in low-voltage networks, such as voltage notch, voltage sag, voltage swell, harmonic disturbances, and interruption. The two approaches correspond to the classification of the frequency-domain voltage signals using machine learning techniques. The first technique uses the Fourier transform (FT) to classify the corresponding disturbance classes through a Multilayer Neural Network with Multivalued Neurons (MLMVN).
The second method uses a Convolutional Neural Network (CNN) and Short-Time Fourier Transform (STFT), with each layer of 2D convolutions used for dimensionality reduction and feature extraction. It should be mentioned that both of the aforementioned methods are characterized by high effectiveness. In [24], a model designed to forecast and classify power quality disturbances was presented.
The latest encoder–decoder model was used as a forecasting system, with a hybrid convolutional neural network–long short-term memory (LSTM) model as the classifier model. Quantile Regression Averaging (QRA) was applied for short-term nodal voltage forecasting in [25]. The model was compared with three others, and the results confirmed that QRA achieves better performance than the other models. A forecasting method based on a long short-term memory network for the forecast of power voltage and current was proposed in [26]. The system does not require a data preprocessing stage, such as feature extraction or selection, and it was found to be robust and achieved good results for voltage and current forecasting. In [27], the long short-term memory deep learning method was investigated for voltage harmonics prediction in a wind turbine. The model was constructed in two steps: feature extraction using window segmentation and LSTM for forecasting. The authors concluded that the designed model is effective for forecasting voltage harmonics in power systems. In [28], a multilayer perceptron neural network (MLPNN) was applied for the forecasting of the harmonic distortion of current (THDi) in a system where photovoltaic cells were used. In that solution, six different MLPNN models with varying numbers of hidden layers and input parameters were tested. In general, all designed models achieved good forecasting accuracy. In [29], a long short-term memory model was presented for very short-term power frequency forecasting. The model was trained using four input variables: the previous frequency and power load, the day of the week, and the hour of the day. The study demonstrated the effectiveness of the designed methodology. A hidden Markov model with weather conditions was used to forecast disturbance events in power quality (PQ) in [30]. The system achieved better forecast accuracy compared to other traditional forecasting techniques. A similar solution was used by the authors of [31], where a hidden Markov model used numerical weather prediction (NWP) to forecast the PV power production. In another solution intended for microgrids, an approach based on an artificial neural network was proposed to estimate the power voltage and total harmonic distortion (THD) [32].
The system was tested for four different cases, and the results showed that the model estimated the voltage and the THD successfully. In [33], machine learning methods were used to predict the following parameters: voltage dips, ground faults, rapid voltage changes, and interruptions. The best forecasting accuracy was achieved by the random forest model, which outperformed the other models. In [34], a short-term forecasting system based on machine learning techniques was proposed. The model was designed to predict voltage, frequency, and harmonic distortions. Four models were compared: XGBoost Regressor, two dense neural network models, and LSTM.
In [35], a VMD-XGBTCN method for power voltage prediction was proposed. This method was constructed using variational mode decomposition (VMD) for voltage signal decomposition. Then, feature selection was applied using Extreme Gradient Boosting (XGBoost), and a Temporal Convolutional Network (TCN) was used for voltage prediction. The designed system exhibited only a slight voltage forecasting error and offered better forecasting performance than the compared methods. Voltage instability prediction in a power system based on a Recurrent Neural Network (RNN) trained by Particle Swarm Optimization (PSO) was proposed in [36]. The test results supported the validity of the model. Researchers applied the Grey Wolf Optimizer with the Least Squares Support Vector Machine (GWO-LSSVM) to predict the Total Harmonic Distortion (THD) [37]. The performance of the studied model was compared with that of two other forecasting systems: the standard Least Squares Support Vector Machine (LSSVM) and Particle Swarm Optimization combined with the Least Squares Support Vector Machine (PSO-LSSVM). The forecasting accuracy of the designed model exceeds that of the other compared models. The authors of [38] presented neuro-fuzzy modeling, which was applied for prediction of the current total harmonic distortion (THDi) appearing in the medium voltage range. The results of the designed system can be used for the filtering of the THDi in the power supply.
The authors of [39] applied a nonlinear autoregressive network for the forecasting of the total harmonic distortion of voltage and current. The system was tested for three cases of nonlinear loads in three-phase networks. The results were evaluated and compared with the results of two other neural networks.
In [40], two algorithms, namely correlation kernel regression and the autoregressive moving average model, were used to forecast the frequency of signals, and the system was tested on three power grids. The experimental outcome demonstrated the efficiency of the suggested models in voltage signal forecasting. An approach combining an Improved System Frequency Response (ISFR) model with a Long Short-Term Memory (LSTM) network for power frequency prediction was suggested in [41]. ISFR was used to generate the features; then, these features were fed to an LSTM to fit a relationship between the input features and the frequency response. The system was tested on the IEEE 39-bus system; the experimental results demonstrated that the designed model achieves better forecasting results than traditional systems. A neural network with a maximum information coefficient was applied in [42] for power frequency forecasting. The maximum information coefficient was used to extract features from the factors relevant to the power frequency; then, the neural network was used to predict the frequency. The model was validated using a historical power-grid dataset, and the results confirmed the prediction accuracy of the designed model.
Another study reported in [
43] used an LSTM recurrent neural network for prediction of nonlinear power voltage. The research concluded that the system can fit the power voltage well and achieves improved effectiveness and forecasting accuracy. A Generalized Regression Neural Network (GRNN) optimized by the Mind Evolution Algorithm (MEA) was tested in [
44] to predict
caused by LED lamps. The AdaBoost algorithm was used to combine several MEA-GRNN singles to improve the model’s prediction accuracy. The effectiveness of the model was compared with that of a BP neural network and a GRNN, and the prediction accuracy of the designed model reached 95.48%. In China, a study tested a hybrid model for short-term power load forecasting [
45]. The system was built using adaptive mode decomposition and an improved least squares support vector machine. The simulation results confirmed the system’s effectiveness when compared with other existing models.
It is worth mentioning that KNN regression is a non-parametric technique in machine learning. The algorithm calculates the distance between a query point and the training data points, selects the k data points with the shortest distance, and returns the average of their outputs as the final prediction. The bagging decision tree (BGDT) divides the original dataset into subsets that are used to train multiple trees in parallel, whereas the boosting decision tree (BODT) consists of multiple trees that are learned sequentially. KNN, BGDT, and BODT can be applied for both classification and regression and, in this study, were applied for the forecasting of PQPs.
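The KNN regression procedure described above can be illustrated with a minimal sketch. It is written in Python rather than the MATLAB environment used in this study, it assumes the Euclidean distance (the distance metric is not stated here), and the function name `knn_predict` and the toy data are hypothetical.

```python
import math

def knn_predict(train_x, train_y, query, k):
    """Predict a value for `query` as the mean target of its k nearest training points."""
    # Euclidean distance from the query to every training sample (assumed metric)
    distances = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    # Sort by distance only and keep the k closest samples
    distances.sort(key=lambda pair: pair[0])
    nearest = distances[:k]
    # The prediction is the average of the neighbours' outputs
    return sum(y for _, y in nearest) / len(nearest)

# Hypothetical toy data: two input features and a voltage-like target
train_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
train_y = [230.0, 231.0, 229.0, 240.0]

prediction = knn_predict(train_x, train_y, (0.2, 0.1), k=3)
```

With k = 3, the distant fourth sample is ignored and the prediction is the mean of the three nearby targets. A larger k smooths the forecast, while a smaller k follows local variations more closely.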
The main novelty of this article is the design of a forecasting model for a small-scale off-grid environment and the comparison of its performance with that of an existing study, using different types of input parameters to find a better approach for forecasting power parameters. The proposed model uses the first and second back steps of each output as additional input variables. This idea improves the forecasting results compared with other existing studies on the same dataset. The following input variables are used in this experiment: air temperature, wind speed, air pressure, solar irradiance, home appliances (AC heating, lights, fridge, and TV), and two back steps of each output parameter.
The forecasted parameters are the voltage amplitude U and four further PQPs. By analyzing these data, we can optimize the quality of the generated power by changing the schedule of the home appliances in the input variables of the forecasting model, resulting in changes in PQP forecasts, which determine the quality of the generated power.
Forecasting power quality parameters, especially in an off-grid system, is an important issue that can help to optimize power quality. Furthermore, power quality forecasting systems are a main part of an intelligent control system, which is needed to operate a power microgrid—nowadays considered a key point of energy communities.
The main novelties of this article are presented in the following points:
The PQP forecasting accuracies of four tested models are compared.
The computation times for PQP forecasting execution using the proposed methods are compared.
Using two back steps of each PQP with input variables improved the forecasting results.
Using home appliances with input variables permitted power quality optimization, which was not previously reported in [8] or other existing studies. Moreover, the forecasting results were compared with those obtained in [8], with the same dataset used in both studies.
The main goal of this study is to test and compare four PQP forecasting methods: Bagging Decision Tree (BGDT), Boosting Decision Tree (BODT), and the K-Nearest Neighbors algorithm (with k = 5 and k = 10).
This article is structured as follows:
Section 1 introduces the issue and lists previous related studies. This section further describes the innovation, the authors’ motivation, and the main objective of the article.
Section 2 describes the hardware of the off-grid system used to measure the dataset.
Section 3 presents the theoretical framework underlying the prediction method applied in this study.
Section 4 introduces the proposed methodology.
Section 5 reports the experimental results, while Section 6 provides a discussion. Finally, Section 7 summarizes the conclusions of the work.
4. Proposed Model
Based on the results of our previous study [14], which tested seven models for PQP forecasting, we concluded that the ensemble decision tree achieved better results than the other compared models. Therefore, in this work, the ensemble tree was chosen as the comparison forecasting model for the five PQPs, including the voltage amplitude U. The selected input variables are global solar irradiance, air temperature, wind speed, air pressure, UV, home appliances, and one and two back steps of each target output. The outputs are the five PQPs, as can be seen in the experimental scheme in Figure 7 and the training and testing procedure in Figure 8.
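The Root Mean Square Error (RMSE) [55] was used to analyze and compare the forecast results, as in (3):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \qquad (3)$$

where $y_i$ is the measured value, $\hat{y}_i$ is the output of the model, and $n$ is the number of data samples.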
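As an illustration of the RMSE definition above, the following minimal Python sketch computes the error for a pair of measured and predicted series. The function name `rmse` and the sample values are hypothetical and are not taken from the dataset.

```python
import math

def rmse(measured, predicted):
    """Root Mean Square Error between measured values and model outputs."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of the squared differences over all n samples
    mean_squared_error = sum(
        (m - p) ** 2 for m, p in zip(measured, predicted)
    ) / len(measured)
    return math.sqrt(mean_squared_error)

# Hypothetical example: one of three predictions is off by 2
example = rmse([230.0, 231.0, 229.0], [230.0, 233.0, 229.0])
```

Because the differences are squared before averaging, a single large deviation raises the RMSE more than several small ones, so the metric penalizes occasional poor forecasts.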
Forecasting models were designed and results were compared using MATLAB in the following steps:
Reading, uploading the dataset, and selecting input variables;
Adding the first and second back steps for each target output into the input variables;
Dividing the dataset into training and testing datasets;
Training and testing forecasting models (BGDT, BODT, KNN (k = 5), and KNN (k = 10)), with each model set up for prediction of five PQPs;
Calculating the forecasting errors, correlation coefficient, and execution time;
Plotting the results of the designed models.
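The steps above can be sketched as follows. This is a simplified Python illustration, not the MATLAB implementation used in this study. The chronological 80/20 split, the helper names, and the toy data are assumptions, and a simple persistence predictor (the forecast equals the previous measured value) stands in for the BGDT, BODT, and KNN models.

```python
import math
import time

def add_back_steps(inputs, target):
    """Append target[t-1] and target[t-2] to the input row at time t.

    The first two time steps are dropped because they lack two back steps.
    """
    rows, outputs = [], []
    for t in range(2, len(target)):
        rows.append(list(inputs[t]) + [target[t - 1], target[t - 2]])
        outputs.append(target[t])
    return rows, outputs

def chronological_split(rows, outputs, train_fraction=0.8):
    """Split without shuffling so that the test set follows the training set in time."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], outputs[:cut], rows[cut:], outputs[cut:]

def persistence_model(row):
    """Stand-in predictor (assumption): return the first back step of the target."""
    return row[-2]

# Hypothetical toy data: (air temperature, wind speed) inputs and a voltage-like target
inputs = [(20.0 + 0.1 * t, 3.0) for t in range(12)]
target = [230.0 + (t % 3) for t in range(12)]

rows, outputs = add_back_steps(inputs, target)
train_x, train_y, test_x, test_y = chronological_split(rows, outputs)

# Measure execution time and forecasting error on the test set
start = time.perf_counter()
predictions = [persistence_model(row) for row in test_x]
elapsed = time.perf_counter() - start
error = math.sqrt(
    sum((y - p) ** 2 for y, p in zip(test_y, predictions)) / len(test_y)
)
```

In a full experiment, the stand-in predictor would be replaced by each trained model, and the loop would be repeated for each of the five PQPs.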
The experiments were conducted on a laptop with the following hardware specifications: Intel(R) Pentium(R) 5405U CPU @ 2.30 GHz and 4.00 GB of installed RAM (3.88 GB usable). The software environment consisted of the Windows 11 Home operating system, version 21H2, and MATLAB R2018a as the programming language platform. Although the hardware can be considered relatively outdated by current standards, the experimental procedures did not require substantial computational resources. Hence, the available configuration was sufficient to reliably perform all simulations and data analyses.
4.1. Dataset
The dataset used in this experiment is provided by the off-grid system installed at the ENET Centre, as can be seen in Figure 1, and is available for free download online [56]. The dataset consists of weather conditions (such as air temperature, wind speed, air pressure, ultraviolet radiation, etc.); the measured PQPs, including U; and consumed loads for four types of home appliances (AC/heating, lights, fridge, and TV).
4.2. Experimental Description
The experiments tested the following models to forecast the PQPs: BGDT, BODT, and KNN (with k = 5 and k = 10). The designed models were created using the following input variables: air temperature, wind speed, air pressure, ultraviolet radiation, solar irradiance, power consumed by home appliances (AC/heating, lights, fridge, and TV), and two back steps of each power quality parameter.
For example, the voltage forecasting model at time t used eleven input variables: the five weather variables and the four appliance loads listed above, together with the first and second back steps of the voltage, U(t-1) and U(t-2). The measured voltage U(t) at the same time step was used as the target output, and the same procedure was applied for the rest of the forecasting models. The KNN algorithm was applied for different values of k. The best results were obtained when setting k to 5 and 10; therefore, models with k = 5 and k = 10 were selected for comparison in this study.
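The comparison of different values of k can be illustrated with a short sketch. The selection procedure is not detailed here, so the use of a held-out validation set scored by RMSE is an assumption, as are the Euclidean distance, the helper names, and the synthetic one-dimensional data. The sketch is in Python rather than MATLAB.

```python
import math

def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points closest to `query`."""
    nearest = sorted(
        zip(train_x, train_y), key=lambda sample: math.dist(sample[0], query)
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)

def rmse_for_k(train_x, train_y, val_x, val_y, k):
    """RMSE of KNN regression on a validation set for a given k."""
    predictions = [knn_predict(train_x, train_y, q, k) for q in val_x]
    return math.sqrt(
        sum((y - p) ** 2 for y, p in zip(val_y, predictions)) / len(val_y)
    )

# Hypothetical synthetic data following target = 2 * x
train_x = [(float(i),) for i in range(20)]
train_y = [2.0 * i for i in range(20)]
val_x = [(4.5,), (10.5,), (15.5,)]
val_y = [9.0, 21.0, 31.0]

# Evaluate several candidate values of k and keep the one with the lowest RMSE
scores = {k: rmse_for_k(train_x, train_y, val_x, val_y, k) for k in (1, 2, 5, 10)}
best_k = min(scores, key=scores.get)
```

The best k depends on the data: on this synthetic example a small k wins, whereas the experiments in this study obtained the best results with k = 5 and k = 10.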
7. Conclusions
In this study, four forecasting models were designed and tested to predict five PQPs, including the voltage amplitude U. The models were tested using a dataset provided by the Centrum ENET covering ten days, from 29 June to 8 July 2019. The weather conditions, power consumption of home appliances, and two back steps from each parameter were used as the input variables. The RMSE criterion was used to compare the results of the tested models. The experiments evaluated the forecasting results and execution time of the compared models.
The results of the study can be summarized as follows:
The best U forecasting results were obtained by KNN (), with a forecasting error of approximately 0.35.
BGDT obtained the lowest forecasting error for both and —about 0.056 and 0.61, respectively.
The best forecasting results for were accomplished by BGDT, KNN (k = 5), and KNN (k = 10), all with a forecasting error close to 0.011.
BGDT and BODT obtained the best results for forecasting.
The shortest computation time required to forecast five PQPs was 14 s, obtained by KNN (k = 5) and KNN (k = 10), followed by BGDT; the greatest computation time, about 17.12 s, was obtained by BODT.
In this study, we used two back steps of each PQP output with input variables. This method improved the forecasting results compared with those obtained in other studies conducted using the same dataset. This study dealt with the forecasting of PQPs based on weather and home appliance data. Using the status of home appliances with input variables enables rescheduling of the power load to ensure balance between demanded power and generated power from renewable power sources.
One of this study’s principal contributions is incorporating back steps (first and second steps back) of the output parameters into the model input variables. This approach has proven to be an effective method in terms of enhancing forecasting accuracy, as it allows the model to exploit the historical dynamics of the observed indicators and thus to better capture temporal dependencies in the data. Compared to models relying solely on current meteorological conditions or instantaneous consumption, this methodology significantly improves the robustness and reliability of the forecasting results.
A second significant contribution is the integration of information on the operational status of household appliances into the input variables. This factor has not commonly been considered in comparable studies, despite its direct impact on power quality in off-grid systems. Applying appliance status improves the accuracy of power quality forecasts and creates opportunities for power quality optimization through intelligent appliance scheduling. The predicted values of power quality parameters can serve as inputs for decision-making mechanisms that dynamically shift appliance operation to time intervals with more favorable generation and distribution conditions.
This approach contributes to a better balance between renewable energy generation and instantaneous demand, thereby increasing the overall stability and reliability of off-grid microgrids. At the same time, it reduces the risk of overloading or stressing specific system components and helps prevent negative impacts on end-user devices. Therefore, the findings of this work demonstrate that the inclusion of back steps of output parameters, together with appliance status, represents a significant advancement in the development of forecasting and optimization methods. This can serve as the foundation for intelligent control systems in community energy solutions and energy self-sufficient households.
The limitations of this study should be acknowledged. The dataset was obtained only from a single microgrid household platform in the Czech Republic. It covered ten consecutive summer days, which restricts the generalizability of the results to other geographical regions, seasons, or longer operational horizons. The analysis was also limited to five power quality parameters, while other essential indicators, such as flicker, voltage dips, swells, or transient disturbances, were not considered. Furthermore, only four forecasting methods (BGDT, BODT, and KNN with k = 5 and k = 10) were tested, without extensive hyperparameter optimization, and modern deep learning approaches widely applied in time-series forecasting (e.g., LSTM, GRU, CNN-LSTM, or transformers) were not included. The experiments were conducted on modest computational hardware that was sufficient for the current dataset but not necessarily representative of larger-scale applications. Finally, although forecasting accuracy and execution time were evaluated, the practical integration of the models into real-time microgrid management systems, such as automated appliance scheduling or economic optimization applications, was not investigated.
Future research should extend the dataset to different seasons and locations, include a broader set of PQPs, explore advanced machine learning methods with thorough optimization, and focus on real-time applicability within innovative microgrid control systems.