1. Introduction
The accelerating threat of climate change has intensified the global imperative to decarbonize energy systems by replacing fossil fuels with renewable sources [1]. Solar, wind, and bioenergy are recognized as critical elements of this transition, promising virtually unlimited clean energy [2,3]. Algeria, for example, has committed to an ambitious long-term decarbonization roadmap and stands out for its enormous solar potential [4]. Over 85% of Algeria’s territory lies in the sun-drenched Sahara and Sahel [5], receiving more than 3000 h of sunshine per year [6]. These conditions translate to exceptionally high solar irradiation, on the order of 1850–2100 kWh/m² annually across much of the country, well above global averages. In this context, grid-connected photovoltaic (PV) systems are strategically critical: they can exploit abundant insolation and support Algeria’s energy transition goals by generating carbon-free electricity at utility scale, as recently demonstrated on a multi-MW plant operating in the Saharan climate [7].
Most photovoltaic systems operate under maximum power point tracking (MPPT) control, which enhances overall energy conversion efficiency by continuously adjusting the operating point so that the array delivers its maximum available power [8,9]. Nevertheless, the inherently intermittent nature of solar energy introduces operational challenges when MPPT is applied in grid-connected conditions, including reverse power flow and reduced system inertia, both of which can adversely affect grid stability [10,11]. In accordance with grid code requirements, photovoltaic systems are therefore expected to adopt flexible power point (FPP) tracking (FPPT) strategies during fault ride-through events or frequency deviations [12]. Under such conditions, FPPT intentionally limits the active power output of the PV system, thereby creating headroom for the provision of ancillary services [12,13]. The FPPT approach holds the output at a prescribed set point by temporarily shifting the operating point away from the maximum power point in response to changing environmental conditions, grid conditions, or load demand [9,14]. The need for such curtailed-power operation is reinforced by the broader trend toward hybrid PV systems coupled with storage and passive thermal management, where the available DC power must be continuously matched to a system-level reference rather than blindly maximized [15,16]. The primary objective of FPPT is to regulate the PV output to a predefined power reference that satisfies grid operational constraints [14]. For this reason, FPPT is also referred to in the literature as constant power generation (CPG) control [8,9]. To date, a wide range of CPG strategies has been reported, which may be broadly classified into linear search methods, nonlinear search techniques, model predictive approaches, and artificial intelligence-based algorithms [12,17].
Despite the notable advantages of PV systems, PV power is fundamentally variable and nonlinear; this nonlinearity is rooted in the underlying device physics and parameter dependence of the cell I–V characteristic, as quantified by recent parameter-extraction studies [18,19]. Solar generation inherently follows day–night and weather cycles, making it intermittent and stochastic [10,20]. Clouds, shading, and temperature changes cause rapid fluctuations in output, and even the clear-sky baseline varies with sun angle and atmospheric conditions [21,22]. This variability complicates the matching of supply and demand: sudden drops or surges in PV output can challenge grid balance and scheduling [17,23]. These scheduling and operational planning challenges are most pronounced in grids with a high penetration of solar energy [8]. Consequently, accurate prediction of PV production has emerged as a critical requirement [10,24,25]. Modern studies emphasize that real-time prediction of PV output is vital for reliable grid operations: by anticipating fluctuations, system operators can arrange dispatch, reserves, and storage to maintain stability [4,21]. In other words, high-fidelity short-term PV power forecasts are needed to integrate solar plants smoothly and to uphold power quality as renewables penetrate the grid [4,26].
Machine Learning (ML) and Deep Learning (DL) architectures have recently become the benchmark for these complex time-series forecasting tasks, both for predicting PV output and for the closely related task of data-driven fault diagnosis on the same monitoring streams [27,28,29,30]. These include convolutional networks, decision-tree ensembles, hybrid deep learning architectures, and neuro-fuzzy systems that combine adaptive learning with interpretable reasoning [21,31]. For example, a hybrid TCN–ECANet–GRU model, which integrates a temporal convolutional network with an attention mechanism, yielded a coefficient of determination ($R^2$) of 0.9972 (99.72%) for short-term PV output prediction [31], while optimized hybrid deep learning frameworks have reported comparable performance across multiple forecasting horizons [27]. These results demonstrate that machine learning approaches are capable of representing the highly nonlinear behavior of photovoltaic systems when adequate training data are available [4,10,32].
In practical applications, such models typically integrate multiple meteorological and electrical variables (including solar irradiance, ambient temperature, module temperature, load power, and grid power) and rely on advanced error metrics to rigorously assess predictive performance [17,30]. However, these high-performance ML solutions have notable limitations. First, they are often highly dependent on the quality and representativeness of the training data; models trained for a specific site or climatic regime may not transfer reliably to other locations, latitudes, or seasonal conditions without additional retraining [32,33]. Second, most ML techniques operate as black-box models, providing limited insight into how inputs affect outputs [34]. This lack of transparency complicates model validation and restricts the integration of expert or physical knowledge into the forecasting process. Finally, even well-trained models can be brittle under extreme weather or unanticipated operating conditions. Abrupt weather events, atmospheric disturbances, or equipment anomalies that are insufficiently represented in the training dataset can significantly degrade prediction accuracy [21]. Such challenges, namely limited generalization capability, data sensitivity, and vulnerability under atypical conditions, are consistently reported across the related literature [8,35].
For example, recent studies have highlighted that photovoltaic power output is inherently uncertain and subject to continuous fluctuations, a characteristic that significantly complicates energy yield forecasting [10,33]. Numerous works have consequently reported that complex machine learning models tend to experience a degradation in predictive accuracy when exposed to input uncertainty or operating conditions that deviate from those represented in the training data [21,28]. In sum, while deep learning architectures can achieve $R^2$ scores above 0.99 under controlled conditions, there remains a need for methods that incorporate physical insight, adapt to sparse or noisy data, and explicitly handle uncertainty [35].
A complementary body of work has benchmarked tree-based ensembles, recurrent architectures, and hybrid deep models for short- and medium-term PV power prediction across a wide range of plant scales. A recent comparative study by Kraska and Hanzel [36] evaluated an XGBoost model against an LSTM network on four prosumer-scale Polish installations (25–50 kWp) and reported a clear advantage for the gradient-boosted learner (RMSE = 4.09 kW, MAE = 1.91 kW, $R^2$ = 0.85, versus RMSE = 5.53 kW, MAE = 3.08 kW, $R^2$ = 0.73 for LSTM), attributing the gap to the difficulty of training deep recurrent networks on the limited, single-year datasets typical of newly commissioned PV systems. Similar conclusions are reported in dedicated XGBoost–LSTM benchmarks for PV power forecasting [37,38] and in broader comparative analyses of LSTM, Random Forest, and XGBoost across solar and wind datasets [39], all of which find that well-tuned tree-based learners frequently match or exceed deep recurrent baselines under limited data. Comprehensive reviews of solar PV forecasting [40] further emphasize that the relative ranking of methods depends strongly on dataset size, forecast horizon, and feature engineering, while comparative studies on heterogeneous PV fleets confirm that carefully tuned classical machine learning often rivals deep architectures on tabular weather-driven data [22,41]. Conversely, hybrid deep models that combine temporal convolutions, attention mechanisms, and recurrent units, including physics-informed XGBoost–LSTM pipelines [42] and CNN–LSTM–RF ensembles [43], have achieved very high accuracy on utility-scale time series [27,28,31,44,45], but they typically require multi-year datasets and substantial computational resources, which limits their deployment in real-time prosumer and edge-computing contexts [34,36,46]. Equally consistent across these studies is the observation that forecast quality is fundamentally bounded by the fidelity of meteorological inputs, and that transition-season cloud variability, snow cover, and curtailment events remain dominant residual error sources [21,36].
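The error metrics quoted throughout this comparison (RMSE, MAE, MAPE, and the coefficient of determination) can be computed as in the minimal sketch below. It is an illustration only, not the evaluation code of any cited study: the function names, the sample values, and the choice to skip near-zero targets in MAPE (so that night-time zeros do not cause division by zero) are assumptions.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error, in the unit of the target (e.g. kW)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error, in the unit of the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mape(y_true, y_pred, eps=1e-9):
    """Mean absolute percentage error over samples with a non-zero target.

    Assumption: near-zero targets (e.g. night-time PV power) are skipped.
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if abs(t) > eps]
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)

# Hypothetical measured and estimated PV power values in kW.
measured = [0.0, 10.0, 20.0, 30.0]
estimated = [1.0, 9.0, 21.0, 29.0]
print(rmse(measured, estimated), mae(measured, estimated),
      r2(measured, estimated), mape(measured, estimated))
```

Because RMSE and MAE carry the unit of the target, they are only comparable across studies when the plant scale is similar, whereas $R^2$ is dimensionless.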
Table 1 consolidates these observations along the dimensions of method, dataset, system scale, forecast horizon, input–output design, evaluation metrics, and reported limitations, and positions the present work against this broader landscape.
Within this broader landscape, a coherent set of classical and deep-learning architectures has emerged as the de facto reference family for tabular, weather-driven PV power prediction, and these same architectures are used as the benchmark suite in the present study. Support Vector Machines (SVM) recast regression as a structural-risk-minimization problem and have repeatedly been ranked among the most accurate single-learner baselines for short-horizon PV forecasting when only a few months of training data are available [47]. Decision Trees (DT) and their bagged extension, the Random Forest (RF), offer fully interpretable axis-aligned partitioning of the input space and have proven particularly well-suited to weather-driven solar problems where feature interactions are predominantly local; recent work has further documented the strong out-of-sample behavior of RF on irradiance regression tasks, including in mountainous and high-variability sites [41,48]. k-Nearest Neighbors (KNN) provides a non-parametric instance-based baseline that captures the local geometry of the input distribution and serves as a useful sanity check on more complex learners, especially when the underlying input–output map is smooth at the scale of typical neighborhoods [47]. Finally, Deep Neural Networks (DNN) with rectified-linear units, dropout regularization, and adaptive (Adam) optimization have become the dominant deep-learning baseline for PV power prediction in recent comprehensive reviews [49], and were also benchmarked on the present dataset. Including all five models, which span the instance-based (KNN), kernel-based (SVM), tree-based (DT, RF), and deep-learning (DNN) paradigms, provides a fair, methodologically diverse comparison against which the proposed neuro-fuzzy estimator can be evaluated, and avoids the well-known risk of drawing conclusions from a single, narrowly chosen baseline.
To address these challenges, an Adaptive Neuro-Fuzzy Inference System (ANFIS) is proposed for predicting the power of a PV plant that operates under an FPPT-based strategy. ANFIS is a hybrid modeling approach that combines the learning capability of neural networks with the human-like reasoning of fuzzy logic. Structurally, ANFIS implements a Sugeno-type fuzzy inference system in a multilayer neural network framework. During training, the network adapts both the fuzzy membership functions and the rule parameters using input–output data. In effect, ANFIS automatically extracts if–then rules from data, guided by expert intuition embedded in fuzzy sets, while leveraging gradient-based and least-squares optimization for parameter tuning.
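The forward pass of a Sugeno-type system of this kind can be sketched in a few lines. The example below is a minimal illustration, not the model trained in this study: it assumes first-order (linear) consequents and one Gaussian membership function per input per rule, and all parameter values are hypothetical. Under these assumptions, a two-rule system with five inputs has 2 × 5 × 2 = 20 premise parameters and 2 × 6 = 12 consequent parameters, i.e. 32 in total.

```python
import math

def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x in a fuzzy set (value in (0, 1])."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

def sugeno_forward(x, premise, consequent):
    """First-order Sugeno inference.

    x          : list of n crisp inputs
    premise    : premise[r][i] = (center, sigma) for rule r and input i
    consequent : consequent[r] = [p_1, ..., p_n, bias] for rule r
    """
    # Layers 1-2: membership degrees and rule firing strengths (product T-norm).
    strengths = []
    for rule in premise:
        w = 1.0
        for xi, (center, sigma) in zip(x, rule):
            w *= gaussian_mf(xi, center, sigma)
        strengths.append(w)
    # Layer 3: normalized firing strengths.
    total = sum(strengths)
    normalized = [w / total for w in strengths]
    # Layer 4: linear rule outputs f_r = p_r . x + bias_r.
    outputs = [sum(p * xi for p, xi in zip(c[:-1], x)) + c[-1] for c in consequent]
    # Layer 5: weighted sum of the rule outputs.
    return sum(wn * f for wn, f in zip(normalized, outputs))

def count_parameters(premise, consequent):
    """Two parameters per Gaussian set plus all consequent coefficients."""
    return sum(2 * len(rule) for rule in premise) + sum(len(c) for c in consequent)

# Hypothetical two-rule system on five normalized inputs.
premise = [[(0.0, 1.0)] * 5, [(1.0, 1.0)] * 5]
consequent = [[0.0] * 5 + [10.0], [0.0] * 5 + [50.0]]
estimate = sugeno_forward([0.5] * 5, premise, consequent)
n_params = count_parameters(premise, consequent)
print(estimate, n_params)
```

In the example, the input lies midway between the two rule centers, so both rules fire equally and the output is the average of the two rule outputs. In training, the premise parameters would typically be updated by gradient descent and the consequent coefficients by least squares, as described above.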
The ANFIS approach is well-suited to modeling ill-defined, nonlinear, and stochastic processes like PV generation. First, its fuzzy logic component inherently handles uncertainty and imprecision by assigning degrees of membership rather than crisp labels, enabling robust reasoning under ambiguous and noisy environmental inputs [34]. Second, the neural network aspect endows ANFIS with strong nonlinear modeling power. By blending these paradigms, ANFIS retains the flexibility of ML while remaining interpretable: the learned fuzzy rules can be inspected and understood, rather than buried in millions of network weights [50]. Recent studies have demonstrated that ANFIS-based models achieve a high level of accuracy and precision while maintaining relatively low computational complexity [51]. Importantly, their rule-based structure preserves a degree of interpretability that is often absent in purely data-driven approaches, which makes ANFIS particularly well-suited for real-time prediction and control applications in photovoltaic systems. In practical terms, ANFIS can capture the nonlinear I–V characteristics and dynamic effects of PV arrays, while transparently reflecting how changes in irradiance or temperature affect output [26]. This combination of interpretability and uncertainty management motivates the choice of ANFIS over purely black-box methods [50].
In the studied plant, the five inputs and the target are measured synchronously at the same 5-min sampling instant t. The model therefore performs same-instant estimation of the FPP at time t from the inputs measured at time t, rather than horizon-ahead prediction of the FPP at a future instant. Three operational benefits motivate this formulation: (i) real-time inverter set-point validation against an independent data-driven estimate; (ii) sensor- and FPPT-controller-fault detection through residual analysis; and (iii) a validated baseline that can later be extended to multi-step-ahead prediction by augmenting the input vector with lagged variables or numerical-weather-prediction outputs.
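Benefits (ii) and (iii) can be illustrated with a short sketch. The code below is a hypothetical example rather than the procedure used in this study: the function names, the fixed residual threshold, and the sample values are assumptions chosen for illustration.

```python
def residual_flags(measured_kw, estimated_kw, threshold_kw):
    """Indices of samples whose absolute residual exceeds the threshold.

    A large gap between the measured power and the independent model
    estimate may point to a sensor or controller anomaly.
    """
    return [i for i, (m, e) in enumerate(zip(measured_kw, estimated_kw))
            if abs(m - e) > threshold_kw]

def add_lags(rows, n_lags):
    """Append the n_lags previous input rows to each row.

    The first n_lags rows are dropped because their history is incomplete.
    """
    augmented = []
    for t in range(n_lags, len(rows)):
        features = list(rows[t])
        for k in range(1, n_lags + 1):
            features.extend(rows[t - k])
        augmented.append(features)
    return augmented

# Hypothetical 5-min samples: measured vs. estimated power in kW.
flags = residual_flags([10.0, 20.0, 30.0], [10.2, 25.0, 29.9], threshold_kw=1.0)
# Hypothetical single-feature input rows augmented with one lag.
lagged = add_lags([[1.0], [2.0], [3.0]], n_lags=1)
print(flags, lagged)
```

With lagged inputs, the same estimator structure could be retrained against a target shifted forward in time, which is one way to obtain the multi-step-ahead extension mentioned in (iii).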
The remainder of this article is organized into six sections. Section 2 details the experimental setup and monitoring infrastructure of the 117.76 kWp PV system. Section 3 describes the data acquisition process, FPPT principles, and dataset statistics. The theoretical framework and mathematical formulations for both the ANFIS model and the benchmark algorithms are developed in Section 4. Section 5 presents the experimental design, covering data preprocessing, model configuration, and the three complementary validation strategies (S1 random, S2 chronological, and S3 external hold-out) utilized for robust evaluation. Section 6 provides a comparative analysis of the model performances and discusses their operational implications. Finally, Section 7 summarizes the key conclusions and outlines avenues for future work.
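The three validation strategies named above can be sketched as day-level splits. The example below is a hypothetical illustration, not the protocol of Section 5: the 20% test fraction, the placement of the external hold-out at the end of the record, the random seed, and the 85-day record length are all assumptions. Splitting by whole days rather than by individual samples keeps neighboring 5-min samples of the same day on the same side of the split.

```python
import random

def split_by_day(days, test_fraction=0.2, hold_out_days=14, seed=0):
    """Build S1 (random), S2 (chronological), and S3 (external) day splits.

    days: chronologically sorted list of distinct day labels.
    """
    # S3: external hold-out, excluded from every training and tuning step.
    external = days[-hold_out_days:]
    pool = days[:-hold_out_days]
    n_test = max(1, round(test_fraction * len(pool)))

    # S1: test days drawn at random from the pool.
    shuffled = pool[:]
    random.Random(seed).shuffle(shuffled)
    s1 = {"train": sorted(shuffled[n_test:]), "test": sorted(shuffled[:n_test])}

    # S2: the most recent pool days form the test set.
    s2 = {"train": pool[:-n_test], "test": pool[-n_test:]}
    return s1, s2, external

days = list(range(1, 86))  # hypothetical 85-day record
s1, s2, external = split_by_day(days)
print(len(s1["train"]), len(s1["test"]), s2["test"][0], external[0])
```

Under S2 and S3 the model is always evaluated on days later than any training day, which is the situation an estimator faces once deployed.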
Objectives and Contributions
Although ANFIS has been applied to various photovoltaic prediction tasks, a critical examination of the existing literature reveals several unaddressed gaps that collectively motivate the present study.
Table 1 summarizes representative recent ANFIS-based PV studies and highlights the key dimensions along which they differ from the present work.
Table 1.
Comparison of representative ANFIS-based photovoltaic studies with the present work.
| Study | Method(s) | Dataset/Location | System Scale | Forecast Horizon | Inputs | Output | Metrics | Limitations |
|---|---|---|---|---|---|---|---|---|
| Salameh et al. [52] | ANFIS, ANN | Sharjah, UAE (hot–humid) | 2.88 kW | Hourly | Env. only | PV power | RMSE, MAE, $R^2$ | Small scale; env. inputs only |
| Ispir et al. [53] | ANFIS, ANN, MLR | Türkiye (continental) | n/a | Daily/monthly | Meteo. | Solar radiation | RMSE, MAPE, $R^2$ | Resource study (no PV plant) |
| Annapoorani et al. [54] | ANFIS, ANN | India (tropical) | Small DC | Hourly | Env. only | Irradiance | RMSE, MAE, $R^2$ | DC test bench; no AC/grid context |
| Mohammed et al. [55] | ANFIS + PSO/GA | Simulation | Lab-scale | MPPT (real-time) | V–I data | MPP power | Tracking eff., conv. time | Simulation only; MPP-only |
| Chicaiza et al. [56] | Fuzzy NN (digital twin) | Spain (Mediterranean) | 2.16 kW | Short-term | Env. only | PV power | RMSE, $R^2$ | Small scale; no grid/load input |
| Elboughdiri et al. [26] | ANFIS–GEP | Simulation | n/a | Hourly | Load + weather | Demand load | RMSE, MAPE | Demand-side only; no FPPT |
| Markovics & Mayer [22] | 24 ML methods + NWP | Hungary (temperate) | Multi-plant fleet | Day-ahead | NWP outputs | PV power | RMSE, nRMSE, MAE | Bounded by NWP errors |
| Cisse et al. [27] | CNN–BiLSTM (optimized) | Smart-grid time series | Utility-scale | 1–24 h | Meteo + hist. PV | PV power | RMSE, MAE, $R^2$ | Heavy model; multi-year data |
| Nguyen Trong et al. [28] | Hybrid DL + VMD | PV plant series | Utility-scale | 1–24 h | Meteo + hist. PV | PV power | RMSE, MAE, MAPE | Heavy preprocessing |
| Bouziane et al. [45] | CNN–RNN | Algeria | Medium-scale | Short-term | Env. + time | PV power | RMSE, $R^2$ | Limited generalization |
| Xiang et al. [31] | TCN–ECANet–GRU | Public PV series | Utility-scale | Intra-day | Meteo + hist. PV | PV power | $R^2$, RMSE | Very deep; very large data |
| Oprea & Bâra [23] | Stacked ensemble | Romania | Utility-scale | Day-ahead | Meteo + hist. PV | PV power | RMSE, MAE, nMAE | High overhead |
| Rodriguez-Leguizamon et al. [37] | XGBoost vs. LSTM | Colombia | Utility-scale | Short-term | Meteo + hist. PV | PV power | RMSE, MAE, $R^2$ | Single-site validation |
| Cortez et al. [38] | ARIMA/LSTM/XGBoost | Brazil | Utility-scale | Intra-hour | Meteo + hist. PV | PV power | RMSE, MAE, MAPE | Very narrow horizon |
| Bin Yousuf et al. [42] | Physics-inf. XGB–LSTM | Public PV datasets | Utility-scale | Short-term | Meteo + physical | PV + uncertainty | RMSE, MAE, CRPS | High complexity |
| Kraska & Hanzel [36] | XGBoost vs. LSTM | Poland, 4 sites (continental) | 25.55–49.5 kWp | Day-ahead (24 h) | NWP + lag + cyclical | PV power (kW) | RMSE (kW)/MAE (kW)/$R^2$: XGB 4.09/1.91/0.85; LSTM 5.53/3.08/0.73 | Snow cover, transitions, curtailments |
| Present work | ANFIS vs. SVM, DT, RF, KNN, DNN | NW Algeria (hot semi-arid) | 117.76 kWp | Same-instant (5-min sampling) | Irradiance, ambient temp., module temp., grid power, load power | FPP (kW) | RMSE, MAE, MAPE, $R^2$ | Single-site; prediction only |
Three principal gaps emerge from this analysis. First, all prior ANFIS-based PV studies target either maximum power point (MPP) output, total PV power, or solar radiation; none addresses the prediction of the FPP that arises under FPPT-governed operation. As modern grid codes increasingly require PV plants to curtail output and provide ancillary services [12,13], the ability to accurately forecast the FPP, rather than merely the MPP, becomes operationally essential yet remains unexplored in the neuro-fuzzy literature. Second, existing ANFIS models for PV prediction rely almost exclusively on environmental inputs (irradiance, temperature, humidity), neglecting the electrical system-level variables (grid power, load power) that govern the real-time energy balance of a grid-connected facility. Incorporating these variables enables the model to reflect the instantaneous interaction between PV generation, facility demand, and grid exchange, which is critical for building-integrated energy management but has not been attempted in prior ANFIS studies. Third, the overwhelming majority of ANFIS-based PV studies are conducted on small-scale systems (typically < 5 kW) under controlled or simulation conditions, and in climatic zones that do not represent the hot, semi-arid environments characteristic of North Africa. Validation on a real, fully instrumented utility-scale plant operating in such conditions is, therefore, lacking.
To bridge these gaps, this work implements and evaluates an ANFIS-based prediction model on a utility-scale, grid-connected 117.76 kWp rooftop PV plant located in northwestern Algeria, as illustrated in Figure 1. The system operates under a zero-export, self-consumption strategy with dynamic FPPT control. Using real sensor data collected from the plant’s supervisory monitoring infrastructure, the ANFIS model is trained to predict the system’s FPP as a function of five simultaneously measured input variables: solar irradiance (G), ambient temperature, module temperature, grid power, and load power. The specific contributions of this study are as follows:
Novel prediction target under FPPT control. This is the first study to apply ANFIS or any neuro-fuzzy architecture to the prediction of the FPP in a grid-connected PV system operating under a dynamic FPPT strategy. Unlike conventional MPP-oriented prediction, this formulation directly supports grid-compliant active power regulation and ancillary service provision.
Joint environmental–electrical input framework. The proposed model uniquely combines environmental variables (irradiance, ambient and module temperatures) with electrical system-level variables (grid power and load power) as simultaneous inputs, embedding the facility’s real-time energy balance into the fuzzy inference process. This dual-domain input design enables the ANFIS to capture not only weather-driven PV variability but also demand-side dynamics, yielding a physically grounded and operationally relevant predictive model.
Real-world validation at utility scale in an under-represented climate. The model is trained and validated on 24,479 field measurements acquired from a fully instrumented 117.76 kWp rooftop PV installation serving an industrial facility in the hot, semi-arid climate of northwestern Algeria. This system scale and geographic context are substantially under-represented in the existing ANFIS-based PV prediction literature, which is dominated by small-scale systems (<5 kW) in temperate or tropical regions.
Comprehensive and fair multi-model benchmarking. The ANFIS model is rigorously compared against five diverse machine learning models, namely Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), k-Nearest Neighbors (KNN), and Deep Neural Network (DNN), all trained and evaluated on identical data splits with systematically tuned hyperparameters. This breadth of comparison, spanning instance-based, tree-based, kernel-based, ensemble, and deep learning approaches, provides a robust and unbiased assessment of ANFIS’s relative merits for PV power prediction.
The proposed approach is further distinguished from prior ANFIS-based PV studies along several methodological dimensions. Whereas Salameh et al. [52] and Annapoorani et al. [54] employ ANFIS with purely environmental inputs on small residential-scale systems (≤3 kW) to predict total PV output, the present work operates at a fundamentally different system scale (117.76 kWp) and targets a different physical quantity (FPP under FPPT control). Similarly, while Chicaiza et al. [56] use a fuzzy neural network as part of a digital twin framework for a 2.16 kW PV facility, their model does not incorporate grid or load power and does not address flexible power regulation. Mohammed et al. [55] apply ANFIS in the context of partial shading optimization rather than time-series power forecasting. By contrast, the present study simultaneously addresses a practically relevant but unexplored prediction target (FPP), introduces a novel dual-domain (environmental + electrical) input architecture, and validates the approach on a real utility-scale installation under harsh climatic conditions, thereby extending the scope and applicability of ANFIS-based photovoltaic modeling beyond what has been previously demonstrated.
Figure 1.
Studied PV System architecture: 117.76 kWp rooftop PV plant (zero-export self-consumption configuration).
3. Acquisition and Setup of Operational Data
The power control loop maintains continuous synchronization with the utility grid while governing the operation of the power conversion stage, with particular emphasis on the DC–AC inverter. This synchronization plays a critical role in ensuring high power quality, stabilizing voltage and frequency levels, and reducing the likelihood of unintended grid disconnections.
Within this framework, the photovoltaic system implements a dynamic FPP tracking (FPPT) approach that continuously determines the FPP, even in the presence of changing environmental and operating conditions such as variations in solar irradiance, module temperature, and load demand. Through adaptive adjustment of the operating voltage, the FPPT enables the system to accurately track the desired power level while sustaining efficient operation, including during partial shading or rapid transients.
Figure 2 illustrates the PV power, grid power, and load demand profiles over one day.
The identification of the FPP is primarily determined by the available maximum power, $P_{mpp}$, and the instantaneous load demand. Accordingly, the system dynamically regulates the output current and voltage and, by extension, the generated power in response to these parameters, as described by the governing equations presented below.
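The current at the maximum power point, $I_{mpp}$, varies proportionally with the incident irradiance and is further affected by temperature-induced deviations. This behavior can be represented as follows:
$$I_{mpp} = I_{mpp,\mathrm{STC}} \, \frac{G}{G_{\mathrm{STC}}} \left[ 1 + \alpha_I \left( T_m - T_{\mathrm{STC}} \right) \right]$$
where $I_{mpp,\mathrm{STC}}$ denotes the reference current under standard test conditions (STC), $G_{\mathrm{STC}}$ is the irradiance at STC, $\alpha_I$ is the temperature coefficient of current, and $T_m$ and $T_{\mathrm{STC}}$ represent the module temperature and the STC reference temperature, respectively.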
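The reference current $I_{mpp,\mathrm{STC}}$ itself originates from the fundamental photovoltaic cell model, which describes the equilibrium between the photo-generated current and the diode reverse saturation current. This relationship, accounting for temperature effects, is expressed as follows:
$$I_{mpp,\mathrm{STC}} = I_{ph} - I_0 \left[ \exp\!\left( \frac{q \, V_{mpp,\mathrm{STC}}}{n \, k \, T_{\mathrm{STC}}} \right) - 1 \right]$$
where $I_{ph}$ is the light-generated current, $I_0$ is the diode saturation current, $q$ denotes the elementary electric charge, $k$ is the Boltzmann constant, and $n$ is the diode ideality factor, which characterizes the non-ideal behavior of the semiconductor junction. In this expression, $V_{mpp,\mathrm{STC}}$ is the reference voltage at the maximum power point and $T_{\mathrm{STC}}$ is expressed in kelvin.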
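In a similar manner, the voltage at the maximum power point, $V_{mpp}$, exhibits a strong dependence on temperature and can be approximated using the voltage temperature coefficient $\beta_V$ as follows:
$$V_{mpp} = V_{mpp,\mathrm{STC}} \left[ 1 + \beta_V \left( T_m - T_{\mathrm{STC}} \right) \right]$$
The reference voltage $V_{mpp,\mathrm{STC}}$ can be derived from the current–voltage characteristics of the photovoltaic cell and is given by the following:
$$V_{mpp,\mathrm{STC}} = \frac{n \, k \, T_{\mathrm{STC}}}{q} \, \ln\!\left( \frac{I_{ph} - I_{mpp,\mathrm{STC}}}{I_0} + 1 \right)$$
This formulation stems from the exponential relationship governing semiconductor junctions, where the logarithmic term reflects the voltage developed as a function of generated current and operating temperature.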
Finally, the electrical power at the maximum power point, $P_{mpp}$, is influenced by temperature-induced effects and can be approximated as follows:
$$P_{mpp} = P_{\mathrm{STC}} \, \frac{G}{G_{\mathrm{STC}}} \left[ 1 + \gamma_P \left( T_m - T_{\mathrm{STC}} \right) \right]$$
where $P_{\mathrm{STC}}$ represents the rated power under standard test conditions and $\gamma_P$ denotes the power temperature coefficient. Collectively, these expressions derived from the intrinsic operating principles of photovoltaic cells enable maximum power point tracking (MPPT) algorithms to dynamically identify and follow the optimal operating point under rapidly changing environmental conditions, thereby ensuring efficient and reliable PV system performance.
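The temperature and irradiance corrections described above can be evaluated numerically as in the following sketch. It assumes the common linear temperature-coefficient form with relative coefficients (per °C) and the usual STC values of 1000 W/m² and 25 °C; the module ratings and coefficient values in the example are hypothetical and are not taken from the studied plant's datasheet.

```python
def mpp_translate(g, t_mod, i_stc, v_stc, alpha_i, beta_v,
                  g_stc=1000.0, t_stc=25.0):
    """Translate STC current and voltage at the MPP to operating conditions.

    g      : irradiance in W/m^2
    t_mod  : module temperature in deg C
    alpha_i, beta_v : relative temperature coefficients in 1/deg C (assumed form)
    """
    dt = t_mod - t_stc
    i_mpp = i_stc * (g / g_stc) * (1.0 + alpha_i * dt)  # scales with irradiance
    v_mpp = v_stc * (1.0 + beta_v * dt)                 # falls as the module heats
    return i_mpp, v_mpp, i_mpp * v_mpp

def mpp_power(g, t_mod, p_stc, gamma_p, g_stc=1000.0, t_stc=25.0):
    """Available MPP power from the rated power and a power coefficient."""
    return p_stc * (g / g_stc) * (1.0 + gamma_p * (t_mod - t_stc))

# Hypothetical module: 10 A and 40 V at the MPP under STC.
i_mpp, v_mpp, p_module = mpp_translate(500.0, 45.0, 10.0, 40.0, 0.0005, -0.003)
# 117.76 kWp array with a hypothetical power coefficient of -0.4 %/deg C.
p_array_kw = mpp_power(800.0, 60.0, 117.76, -0.004)
print(i_mpp, v_mpp, p_module, p_array_kw)
```

To first order in the temperature difference, the product of the translated current and voltage matches the power form with a power coefficient equal to the sum of the current and voltage coefficients.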
The control strategy of the studied system employs an FPPT approach to regulate the photovoltaic system’s output power in accordance with the load demand. When the required power exceeds the power available at the maximum power point (MPP) of the PV array, the resulting power deficit is automatically compensated by drawing the additional required energy from the utility grid, as illustrated in Figure 2. This operating principle ensures continuous supply and power balance, and the corresponding relationship is described by the following equation:
$$P_{load} = P_{FPP} + P_{grid}$$
where $P_{load}$ is the load demand, $P_{FPP}$ is the PV power delivered at the FPP, and $P_{grid}$ is the power drawn from the grid.
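A simplified reading of this operating principle is sketched below. It is an idealized assumption, not the plant controller's actual logic: under zero-export operation the PV output is taken to follow the load demand up to the available MPP power, and the grid supplies any remaining deficit. Conversion losses and controller dynamics are ignored, and the function name is hypothetical.

```python
def flexible_power_point(p_mpp_kw, p_load_kw):
    """Idealized zero-export set point.

    Returns (p_fpp_kw, p_grid_kw): the PV output and the grid import
    that together cover the load demand.
    """
    p_fpp = min(p_mpp_kw, p_load_kw)  # curtail below the MPP when the load is lower
    p_grid = p_load_kw - p_fpp        # grid import, never negative (no export)
    return p_fpp, p_grid

# Load below the available PV power: PV is curtailed, no grid import.
curtailed = flexible_power_point(80.0, 50.0)
# Load above the available PV power: PV runs at the MPP, grid covers the deficit.
deficit = flexible_power_point(30.0, 50.0)
print(curtailed, deficit)
```

In both cases the PV output and the grid import sum to the load demand, which is the power balance stated above.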
Data Analysis and Interpretation
The dataset examined in this study was collected directly from the system described in the preceding section. The system is equipped with an integrated monitoring infrastructure that simultaneously records environmental and electrical variables under real operating conditions. All variables are logged at 5-min intervals through a centralized data acquisition unit, ensuring temporal alignment and consistency across the dataset. The monitoring campaign spans a representative operating period characterized by pronounced variability in irradiance and temperature, thereby capturing a wide range of system states driven by stochastic environmental variables. Such coverage is essential for investigating performance trends and training data-driven models that must remain robust under non-stationary conditions. Continuous logging, combined with redundant communication links, minimizes data loss and preserves measurement fidelity.
The selection of input features was informed by photovoltaic operating principles and exploratory statistical analysis. Solar irradiance, module temperature, and ambient temperature were retained due to their direct influence on carrier generation, heat transfer, and semiconductor junction dynamics within the PV modules. These variables govern the current–voltage characteristics of the array and, by extension, its available power. Grid power and load power were retained because, together with the available maximum power, they determine the FPP under zero-export operation.
Table 3 summarizes the statistical characteristics of the electrical and environmental variables recorded over 24,479 valid samples, providing a quantitative overview of the operating conditions of the photovoltaic system and its interaction with the grid and load.
The PV power output exhibits pronounced variability, with a mean value of 17.38 kW and a standard deviation exceeding 23.7 kW. This wide dispersion reflects the intermittent nature of solar generation, further evidenced by a median of only 2.52 kW and a first quartile equal to zero, indicating extended periods of negligible or no production, particularly during low-irradiance conditions. Nevertheless, peak generation reaches 106.3 kW, approaching the nominal capacity of the installation and confirming proper operation under favorable conditions.
Grid power shows a substantially higher mean of 33.49 kW and a maximum value of 380.75 kW, highlighting the grid’s dominant role in balancing demand when PV generation is insufficient. The load power profile follows a similar pattern, with an average demand of 50.68 kW and a maximum exceeding 420 kW. The large standard deviation of load power (63.86 kW) indicates significant fluctuations in consumption, necessitating continuous power exchange with the grid to maintain supply-demand equilibrium.
From an environmental perspective, ambient temperature remains relatively stable, with a mean of 24.6 °C and moderate dispersion, while module temperature shows a much wider range, varying from 9.5 °C to 69.1 °C. This spread underscores the strong thermal stress experienced by the PV modules under high irradiance conditions. Solar irradiance itself ranges from 0 to 1206 W/m², with a mean of 284 W/m² and a median of 95 W/m², confirming that a substantial fraction of the dataset corresponds to low or zero irradiance periods.
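Summary statistics of the kind reported above can be reproduced with the standard library, as in the sketch below. The sample values are hypothetical and are not taken from the dataset; they only mimic its shape, with many zero-production samples and a few high values, so that the median falls far below the mean and the first quartile is zero.

```python
import statistics

def summarize(values):
    """Mean, sample standard deviation, range, and quartiles of a series."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),
        "min": min(values),
        "q1": q1,
        "median": statistics.median(values),
        "q3": q3,
        "max": max(values),
    }

# Hypothetical PV power samples in kW (night-time zeros dominate).
pv_kw = [0.0, 0.0, 0.0, 0.0, 2.0, 3.0, 40.0, 90.0]
summary = summarize(pv_kw)
print(summary)
```

Quartile values depend on the interpolation convention; `statistics.quantiles` uses its default "exclusive" method here, which may differ slightly from other tools.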
Figure 3 presents the Pearson correlation matrix for the recorded variables. The strongest correlations appear between load and grid power and between irradiance and module temperature, which is consistent with the strong coupling between demand, power exchange with the grid, and the thermal response of the PV field. PV power is also highly correlated with irradiance and module temperature, confirming that the pre-processing steps preserved the expected physical relationships between generation and environmental conditions. Ambient temperature shows only moderate correlations with the electrical variables, suggesting a secondary but still relevant influence on system behavior. Altogether, this heatmap provides an intuitive overview of the main dependencies in the dataset and supports the joint use of these variables in the subsequent modeling and analysis.
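For reference, a Pearson correlation matrix of the kind shown in Figure 3 can be computed as sketched below. The helper names `pearson` and `correlation_matrix`, the column names, and all numerical values are hypothetical and serve only to illustrate the calculation; they are not taken from the recorded dataset.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Map every pair of column names to its Pearson coefficient."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical readings: irradiance in W/m², module temperature in °C, PV power in kW.
irradiance = [0.0, 200.0, 400.0, 600.0, 800.0, 1000.0]
data = {
    "irradiance": irradiance,
    "module_temp": [20.0 + 0.03 * g for g in irradiance],  # assumed linear heating
    "pv_kw": [0.0, 18.0, 40.0, 58.0, 79.0, 95.0],
}
matrix = correlation_matrix(data)

for row in data:
    print(row, [round(matrix[row][col], 3) for col in data])
```

The matrix is symmetric with a unit diagonal, so a heatmap only needs one triangle to convey all pairwise dependencies.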
7. Conclusions
This study develops and validates an Adaptive Neuro-Fuzzy Inference System (ANFIS) for predicting the FPP in a 117.76 kWp grid-connected rooftop photovoltaic plant operating under a zero-export strategy in northwestern Algeria. Using only 32 learnable parameters organized into two fuzzy rules, the model achieves RMSE values between 325 W and 654 W across all three validation strategies considered: random day-based (S1), strictly chronological (S2), and an external 14-day hold-out (S3). To the best of the authors’ knowledge, this is the first application of neuro-fuzzy methods to FPPT-governed PV systems, where curtailment and ancillary-service provision are increasingly required by modern grid codes.
When benchmarked against five established machine learning paradigms (Support Vector Machine, Decision Tree, Random Forest, k-Nearest Neighbors, and Deep Neural Network), ANFIS is the only model that maintains sub-700 W RMSE on every split, whereas every benchmark degrades by a factor of 1.5–2.0 when the evaluation protocol shifts from random S1 to chronological S2 or external S3. Most importantly, ANFIS achieves RMSE values of 363 W and 408 W on the external 14-day hold-out, data the model never accessed during training, tuning, or hyperparameter selection; both values are well below the training RMSE of ≈1179 W. This result argues strongly against data leakage as a source of the reported accuracy.
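The three validation protocols differ only in how calendar days are assigned to training and test sets. The sketch below illustrates one possible day-level implementation. The function `split_days`, the 20% test fraction, the random seed, the placement of the hold-out at the end of the record, and the example calendar of 85 consecutive days are all assumptions made for illustration; the study's actual split boundaries are not reproduced here.

```python
import random
from datetime import date, timedelta


def split_days(days, holdout_days=14, test_fraction=0.2, seed=0):
    """Assign whole calendar days to S1, S2 and S3 so no day spans two sets."""
    days = sorted(set(days))
    external = days[-holdout_days:]  # S3: assumed to be the final days of the record
    pool = days[:-holdout_days]      # days available for S1 and S2
    n_test = max(1, round(len(pool) * test_fraction))

    # S1: random day-based split (all samples of a day stay together).
    shuffled = pool[:]
    random.Random(seed).shuffle(shuffled)
    s1_test = sorted(shuffled[:n_test])
    s1_train = sorted(shuffled[n_test:])

    # S2: strictly chronological split (train on the past, test on the future).
    s2_train = pool[:-n_test]
    s2_test = pool[-n_test:]

    return {"S1": (s1_train, s1_test), "S2": (s2_train, s2_test), "S3": external}


# Hypothetical calendar of 85 consecutive days.
days = [date(2025, 5, 1) + timedelta(days=i) for i in range(85)]
splits = split_days(days)

print("S1 train/test days:", len(splits["S1"][0]), len(splits["S1"][1]))
print("S2 train/test days:", len(splits["S2"][0]), len(splits["S2"][1]))
print("S3 hold-out days:  ", len(splits["S3"]))
```

Splitting by day rather than by individual sample keeps strongly autocorrelated readings from the same day out of both the training and test sets, which is why a random sample-level split would be a weaker test of generalization.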
The trained model reveals a fundamentally near-linear FPP–input relationship, as confirmed by an Ordinary Least Squares baseline with only six parameters. ANFIS therefore operates as a compact piecewise-linear regressor whose 32 parameters can be directly inspected and verified against photovoltaic physics, making it well suited for integration into supervisory control and grid-compliance frameworks where transparency is a prerequisite for operational deployment.
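To make the "piecewise-linear regressor" interpretation concrete, the sketch below implements the forward pass of a generic two-rule, first-order Sugeno system with five inputs. It assumes Gaussian membership functions (one center and one width per input and rule) and a product t-norm. Under that assumption the parameter count is 2 × 5 × 2 = 20 premise parameters plus 2 × (5 + 1) = 12 consequent parameters, i.e., 32 in total, while a linear baseline with five coefficients and an intercept has six. The membership type, the function names, and all numerical values are illustrative assumptions rather than the trained model.

```python
import math


def count_parameters(n_inputs, n_rules):
    """Premise (center + width per input per rule) plus linear consequent parameters."""
    premise = n_rules * n_inputs * 2
    consequent = n_rules * (n_inputs + 1)
    return premise + consequent


def anfis_forward(x, centers, widths, consequents):
    """First-order Sugeno output: firing-strength-weighted blend of linear models.

    centers, widths: one list per rule, one value per input.
    consequents:     one list per rule, n_inputs coefficients followed by a bias.
    """
    strengths = []
    for rule_centers, rule_widths in zip(centers, widths):
        w = 1.0
        for xi, c, s in zip(x, rule_centers, rule_widths):
            w *= math.exp(-0.5 * ((xi - c) / s) ** 2)  # Gaussian membership, product t-norm
        strengths.append(w)
    total = sum(strengths)
    if total == 0.0:
        raise ValueError("all firing strengths underflowed; input is far from every rule")
    linear_outputs = [
        sum(p * xi for p, xi in zip(params[:-1], x)) + params[-1]
        for params in consequents
    ]
    return sum(w * f for w, f in zip(strengths, linear_outputs)) / total


# Hypothetical normalized inputs and rule parameters.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
centers = [[0.0] * 5, [5.0] * 5]
widths = [[3.0] * 5, [3.0] * 5]
shared = [1.0, 1.0, 1.0, 1.0, 1.0, 0.0]  # same linear model in both rules
y = anfis_forward(x, centers, widths, [shared, shared])

print("ANFIS parameters:", count_parameters(5, 2))
print("output with identical consequents:", y)
```

When both rules carry the same consequent, the output collapses to a single linear model regardless of the membership functions, which illustrates why a near-linear target can be matched closely by an ordinary least-squares fit.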
Four limitations bound the scope of these findings. First, temporal coverage is restricted to a single three-month summer period (May–July 2025; 85 retained calendar days), and robustness under winter and inter-annual conditions remains untested. Second, geographic scope is limited to a single site, so transferability to other climates, array configurations, and FPPT modes requires dedicated experimental validation. Third, the model has not yet been demonstrated in closed-loop control, where predictions drive real-time curtailment decisions. Fourth, formal cross-model statistical hypothesis tests (e.g., Friedman–Nemenyi) and multi-seed aggregation would further strengthen the comparative claims, although the convergence of S1, S2, S3 and five-fold cross-validation results around the same numerical envelope already provides multi-criterion evidence of stability.
Future work will prioritize extending data collection to a full annual cycle, validating the model across multiple sites and climatic zones, integrating ANFIS-driven set-point tracking with the facility’s SCADA system, embedding physics-informed constraints into the training loss, and exploring the learned rule structure as a basis for residual-driven fault detection and predictive maintenance. The convergence of high predictive accuracy, operational interpretability, and computational efficiency demonstrated here positions neuro-fuzzy inference systems as a strong candidate technology for next-generation smart energy management in grid-connected industrial and institutional buildings operating under modern grid-code requirements.