A State-of-Art-Review on Machine-Learning Based Methods for PV

Abstract: In the current era, Artificial Intelligence (AI) is becoming increasingly pervasive, with applications in several fields effectively changing our daily life. In this scenario, machine learning (ML), a subset of AI techniques, provides machines with the ability to programmatically learn from data to model a system, while adapting to new situations as they learn from the data they ingest (on-line training). During the last several years, many papers have been published concerning ML applications in the field of solar systems. This paper presents the state-of-the-art ML models applied in the solar energy forecasting field, i.e., for solar irradiance and power production forecasting (both point and interval or probabilistic forecasting), electricity price forecasting and energy demand forecasting. Other applications of ML in the photovoltaic (PV) field taken into account are the modelling of PV modules, PV design parameter extraction, maximum power point (MPP) tracking, PV system efficiency optimization, PV/Thermal (PV/T) and Concentrating PV (CPV) system design parameter optimization and efficiency improvement, anomaly detection and energy management of PV storage systems. While many review papers already exist in this regard, they usually focus on one specific topic only, whereas this paper gathers the most relevant applications of ML for solar systems in many different fields. The paper gives an overview of the most recent and promising applications of machine learning in the field of photovoltaic systems.


Introduction
ML is a subset of AI which is concerned with creating systems that learn or improve performance based on the data they use. The term machine learning was first used in 1959 by the American scientist Arthur Lee Samuel, with the following definition: "field of study that gives computers the ability to learn without being explicitly programmed".
Today, ML is ubiquitous. When we interact with banks, shop online or use social media, ML algorithms are used to make our experience efficient, easy and safe, while also learning our lifestyle-related preferences. For example, Internet search engines exploit them in many ways: both the results we obtain and the completion suggestions derive from algorithms that model the patterns of use of search keys. Amazon Go, the first store with no cashiers, opened by Amazon in Seattle, is also based on ML and other advanced technologies. Self-driving cars, which we will soon see on the roads, use continuously improved ML models: MIT has developed a system that will allow these cars to orient themselves only with sensors and GPS, avoiding the use of maps, which may simply be out of date or insufficiently detailed. ML is fundamental for data protection and fraud prevention, thanks to unsupervised algorithms that compare access patterns and detect any anomalies, and it can also improve personal security, making checks at airports and transport hubs more reliable and faster. Applications in the health sector will also be increasingly relevant, to obtain more accurate diagnoses, analyze [...]
• This is the first paper, as far as the authors know, that gathers the most recent and, in the authors' opinion, most promising applications of ML across many different fields of PV rather than in a single one;
• For each of the fields under consideration, a critical analysis is reported, highlighting the architecture/solution that has proven, in the literature, to be the most suitable for that specific task;
• The pros and cons of each solution are detailed, together with suggestions for further investigation.
The remainder of this paper is structured as follows: Section 2 provides a reasoned introduction to ML methods or, more generally, data-driven methods; Section 3 gathers the most recent review papers on the topics treated in this paper; Section 4 is devoted to the field of PV power forecasting; Section 5 reports recent papers concerning anomaly detection (fault diagnostics) in PV; Section 6 covers ML-based methods for MPPT in PV; Section 7 gives an overview of the other applications of ML in the PV field; and finally Section 8 ends the paper with concluding remarks and an analysis of possible future trends.

Machine Learning, Deep Learning and Related Methods
Nowadays, the term Artificial Intelligence is quite common and people, often even without knowing it, benefit from AI every single day: from Alexa (a ubiquitous application of a field of machine learning (ML) known as Natural Language Processing); to the recommendation system of Netflix, suggesting content for users to watch next using similar users' preferences; to the automated driving systems that equip many recent car models. To clarify the terms that are reported in many research papers, this section briefly defines the most common ones. AI indicates a branch of computer science that studies ways to build intelligent programs in a way that mimics human reasoning; the benchmark for AI is human intelligence regarding reasoning, speech, learning, vision and problem-solving.
The basic idea of ensemble methods [18][19][20] is quite simple: integrate a group of base models, also known as weak learners, to build up a more robust model. This robustness is intended as the capability of providing better accuracy and/or of generalizing better, i.e., of providing good performance for a "scenario" different from the training one. However, how does one train different weak learners and aggregate their outputs to build up a stronger learner? In this regard many solutions are possible, but the commonly used techniques are:
1. Bagging
2. Boosting
3. Stacking
Bagging stands for Bootstrap Aggregation: multiple models are trained in parallel, but each base model is trained on a different training set derived from the original training data using the Bootstrap method (data is randomly sampled from the original dataset with replacement), and the final prediction is derived by a voting aggregation of the predictions of all base models. In bagging methods, the weak learners are usually of the same type. Since the random sampling with replacement creates independent and identically distributed samples, bagging does not change the models' biases but reduces their variance, producing a model capable of providing consistent results in production. Random Forest is a typical bagging model.
In boosting, multiple weak learners are trained sequentially, not in parallel as in bagging. Each subsequent model is trained by giving more importance to the data points that were misclassified (or that gave a greater error, for example in terms of MSE) by the previous weak learner. In this way, the weak learners can focus on specific data points and can collectively reduce the bias of the prediction.
In stacking, the base weak learners are trained in parallel as in bagging, but stacking does not carry out simple voting to aggregate the output of each weak learner to calculate the final prediction. Stacking employs another meta-learner to provide the final prediction, and this meta-learner is trained on the outputs of weak-learners to learn a mapping from the weak learners' output to the final prediction. Usually, this meta-learner is quite simple, such as a LASSO or Ridge regression.
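As an illustration of the bagging scheme described above, the following is a minimal sketch in Python. It assumes NumPy is available; the toy data, the choice of a cubic polynomial as weak learner and all function names are hypothetical and not taken from the reviewed papers. For a regression task, the voting step becomes a simple average of the base models' predictions.

```python
import numpy as np

def fit_poly(x, y, degree):
    """Fit one weak learner: a least-squares polynomial of the given degree."""
    return np.polyfit(x, y, degree)

def bagging_fit(x, y, n_models=50, degree=3, seed=0):
    """Train n_models learners, each on a bootstrap resample of (x, y)."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(x) indices with replacement.
        idx = rng.integers(0, len(x), size=len(x))
        models.append(fit_poly(x[idx], y[idx], degree))
    return models

def bagging_predict(models, x_new):
    """Aggregate by averaging (the regression analogue of voting)."""
    preds = np.array([np.polyval(m, x_new) for m in models])
    return preds.mean(axis=0)

# Toy data (an assumption for illustration): a noisy sine wave.
rng = np.random.default_rng(42)
x = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, size=x.size)

models = bagging_fit(x, y)
x_new = np.linspace(0.05, 0.95, 5)
y_bag = bagging_predict(models, x_new)
```

Boosting would instead train the learners one after the other, re-weighting the samples with large errors, while stacking would replace the plain average in `bagging_predict` with a trained meta-learner.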
The terms bias and variance have been mentioned a few times above and require further clarification. A common "mantra" in ML is the bias vs. variance trade-off: for an ML-based model, reducing the bias typically comes at the expense of a higher variance, and vice versa. The two quantities measure the effectiveness of the model: bias is the error or difference between the real data and a model's predicted value, while variance is the error that occurs due to sensitivity to small changes in the training set.
Typically, the two terms are well summarized by the image shown in Figure 1. The model's error is the difference between predicted and observed/actual values. Suppose one has a very accurate model: this means that the error is very low, indicating a low bias and low variance (as seen on the top-left circle in Figure 1).
If the variance increases, the data are spread out more which results in lower accuracy (as seen on the top-right circle in Figure 1). In this case, the average model's error could be the same as in the first case but sometimes the error is greater and more spread out around the same mean value. If the bias increases, the error calculated increases (as seen on the bottom-left circle in Figure 1). High variance and high bias indicate that data are spread out with a high error (as seen on the bottom-right circle in Figure 1). This is a bias-variance tradeoff. In essence, bias is a measure of error between what the model captures and what the available data is showing, while variance is the error from sensitivity to small changes in the available data. A model having high variance captures random noise in the data.
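The trade-off described above is usually formalized through the standard decomposition of the expected squared error. The notation below is an assumption introduced here for illustration and does not appear in the original text: the observations are modelled as $y = f(x) + \varepsilon$, with zero-mean noise $\varepsilon$ of variance $\sigma^2$, and $\hat{f}$ is the model fitted on a randomly drawn training set, over which the expectation is taken.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Under these assumptions, the four circles of Figure 1 correspond to the four combinations of a small or large first term (bias) and a small or large second term (variance).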
For the field of interest of this paper, the most used ensemble methods are:
• Random Forest (RF) (bagging ensemble method);
• XGBoost or LightGBM (boosting ensemble methods).
Very few papers have tested stacking solutions.

Literature Review of Review Paper for Each of the Fields of Interest in PV
Table 1 lists review papers concerning ML-based methods to forecast power production from PV; note that only recent papers, i.e., from 2018 to 2021, have been taken into consideration. The notes for each paper listed in Table 1 summarize what the reader can expect from it.

Year
Reference Notes

[21]
A review of ML and statistical models based on historical data. Concludes that ANNs and Support-Vector Machines (SVMs) are the best-performing models, especially due to their capability to rapidly adapt to varying environmental conditions. Genetic Algorithms (GAs) emerge as the most frequently used method for optimizing the hyperparameters of forecasting models.

[22]
A very interesting review, from the taxonomy point of view, of AI-based methods in solar power forecasting. Methods analyzed include ANNs, SVMs, Extreme Learning Machines (ELMs), Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM), RF, stacked Auto-Encoders, Generative Adversarial Networks (GANs), Fuzzy Logic (FL), Particle Swarm Optimization (PSO) and others. For each method, the pros and cons and the optimal field of application are indicated. This paper outlines challenges and future research directions, mainly: probabilistic prediction of solar energy, model explainability and prediction of the movement and thickness of clouds.

[23]
A review focused only on DL methods for renewable energy forecasting, both deterministic and probabilistic (deep belief network, stacked auto-encoder, deep recurrent neural network, etc.). Forecasting horizon from 15 min ahead to 120 min ahead. Some notes on data preprocessing techniques.

2020 [24]
A comprehensive review of papers from 2008 to 2019 on ML, DL and hybrid models to forecast power production from PV. Interesting concluding remarks. Mainly focused on methods for point forecasting.

[25]
A comparison of state-of-the-art models to forecast PV power production, focused on a horizon of 36 h in advance. Many models were tested, from simple linear regression (also Ridge, Lasso and Elastic Net) to decision trees (DT) and ensemble models, both bagging (RF) and boosting (eXtreme Gradient Boosting, XGBoost). A robust 10-fold cross-validation procedure was used to test each model's performance, and a grid search to find each model's optimal hyperparameters. All models were tested on a single dataset (plant located in Asia). Weather forecasts and observations were used as model input. XGBoost performed best.

[26]
A review focused only on three DL methods (LSTM, RNN and Gated Recurrent Unit (GRU)) and a hybrid Convolutional Neural Network + LSTM (CNN+LSTM) to forecast solar irradiance and PV power production. Overall, LSTM performs best, but if enough data is available CNN+LSTM is the preferred model. This paper highlights the use of RMSE as the most useful metric, allowing easy comparison of results.

[27]
A review of various reinforcement learning (RL) methods, both classical (multi-agent RL, etc.) and deep (Deep Q-network, etc.), in sustainable energy and electric systems. It is a more generic review, not focused on PV, but with a paragraph on MPPT and worth reading for a general overview of RL.

Latest Research in PV Power Forecasting
This section describes the latest ML-based methods that have been employed in literature to forecast power production from PV, published from the year 2018 till the year 2021.
The vast majority of methods employed within this field are NN architectures of several types, but while older papers reported the use of shallow architectures such as multilayer perceptron (MLP) or Radial Basis Function (RBF) networks, more recent research has turned its interest to more advanced DL methods, such as LSTM, CNNs or a combination of both. Concerning the metrics used to assess models' performance, the most frequent are Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). It is practically impossible to compare results from models applied to different scenarios, as the scenario greatly affects the model's performance results; here the scenario is to be understood as the features of the plant under investigation (dimension, architecture in terms of the number of strings, cell type, etc.), the environmental conditions, the length of the training and test datasets, and feature pre-processing and/or feature engineering, if applied. This is true for metrics such as RMSE or MAE, but also for percentage metrics such as Mean Absolute Percentage Error (MAPE) or Root Mean Squared Percentage Error (RMSPE), which, being percentage errors, are otherwise better suited to comparing models' performance across different plants. For a detailed discussion on models' metrics see [28,29].
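For reference, the four metrics named above can be written as follows. This is a minimal sketch assuming NumPy; the function names and the toy values are hypothetical. One labelled assumption is the handling of zero actual values (e.g., night-time PV production), for which percentage errors are undefined: here such samples are simply excluded from MAPE and RMSPE, which is one possible convention among several.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error, in the unit of the target (e.g., kW)."""
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root Mean Square Error, in the unit of the target."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def _drop_zeros(y_true, y_pred, eps):
    """Assumption: samples with (near-)zero actual value are excluded."""
    mask = np.abs(y_true) > eps
    return y_true[mask], y_pred[mask]

def mape(y_true, y_pred, eps=1e-9):
    """Mean Absolute Percentage Error, in percent."""
    yt, yp = _drop_zeros(y_true, y_pred, eps)
    return float(100.0 * np.mean(np.abs((yt - yp) / yt)))

def rmspe(y_true, y_pred, eps=1e-9):
    """Root Mean Squared Percentage Error, in percent."""
    yt, yp = _drop_zeros(y_true, y_pred, eps)
    return float(100.0 * np.sqrt(np.mean(((yt - yp) / yt) ** 2)))

# Toy values: the first sample mimics a night-time hour with zero production.
y_true = np.array([0.0, 2.0, 4.0, 4.0])
y_pred = np.array([0.0, 1.0, 5.0, 4.0])

print(mae(y_true, y_pred), rmse(y_true, y_pred))
print(mape(y_true, y_pred), rmspe(y_true, y_pred))
```

MAE and RMSE scale with the plant size, which is one reason why they cannot be compared across plants, whereas the percentage metrics are scale-free but sensitive to small actual values.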
While many researchers are more interested in providing a model capable of providing "point-forecast" results, i.e., the expected mean/average value for the forecasting horizon, some papers have been published concerning the "probabilistic/interval forecast", where, in addition to the point-forecast, the prediction interval associated with this point is provided [30]. For the reasons outlined before, PV power forecasting can be classified into basically two different types: point forecasting and probabilistic (interval) forecasting. The latter type, in the authors' opinion, could be more useful, as in many situations it may be more critical to know not the future power production of the PV plant connected to the grid but rather, with a probability of 95% or 99%, that the expected level of production will not fall below a critical level.
Tables 2 and 3 summarize the latest research in forecasting PV production by grouping the research into point and interval forecasts. As appears evident from Tables 2 and 3, the vast majority of papers deal with point forecasts and consider a short-term forecasting horizon. In this regard, the usual classification of models based on the forecasting horizon is the following:
• Very short-term, from a few seconds to some minutes;
• Short-term, up to 48 or 72 h;
• Medium-term, in the range from a few days to one week;
• Long-term, usually several months or one year.
Another, mostly equivalent, classification criterion relies on counting how many time steps ahead are considered in the forecasting horizon. Many papers are focused on only one step ahead (this usually could be a single hour or day), but multi-step-ahead models are often the more interesting ones. A multi-step-ahead model can produce its results either iteratively, by feeding each one-step prediction back as an input for the next step, or with architectures able to provide, with a single inference computation, an array of values, each related to a specific timestamp ($P_{t+1}, P_{t+2}, \ldots, P_{t+h}$, where $P$ is the forecasted power production, $h$ is the forecasting horizon and $t$ is the current timestamp).
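The two multi-step strategies just described can be sketched as follows. This is a minimal illustration assuming NumPy; a linear autoregressive model fitted by least squares stands in for whatever forecasting model is actually used, and the synthetic daily-periodic series, the number of lags and all function names are hypothetical.

```python
import numpy as np

def make_windows(series, n_lags, horizon):
    """Build lagged inputs X and the next `horizon` values Y."""
    X, Y = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        X.append(series[t - n_lags:t])
        Y.append(series[t:t + horizon])
    return np.array(X), np.array(Y)

def fit_linear(X, Y):
    """Least-squares linear map with intercept; Y may have several columns."""
    A = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return W

def predict_linear(W, window):
    return np.append(window, 1.0) @ W

def forecast_recursive(W_one, last_window, horizon):
    """Iterative strategy: feed each one-step prediction back as input."""
    window = np.array(last_window, dtype=float)
    preds = []
    for _ in range(horizon):
        nxt = predict_linear(W_one, window)[0]
        preds.append(nxt)
        window = np.append(window[1:], nxt)
    return np.array(preds)

def forecast_direct(W_multi, last_window):
    """Direct strategy: one inference returns P_{t+1}, ..., P_{t+h}."""
    return predict_linear(W_multi, np.asarray(last_window, dtype=float))

# Synthetic hourly series with a 24 h period (an assumption, not real PV data).
t = np.arange(240)
series = 1.0 + np.sin(2 * np.pi * t / 24)
n_lags, h = 2, 6
train, truth = series[:-h], series[-h:]

W_one = fit_linear(*make_windows(train, n_lags, 1))
W_multi = fit_linear(*make_windows(train, n_lags, h))

p_rec = forecast_recursive(W_one, train[-n_lags:], h)
p_dir = forecast_direct(W_multi, train[-n_lags:])
```

On this noise-free toy series both strategies reproduce the held-out values; on real data the iterative strategy tends to accumulate errors along the horizon, while the direct one requires a model with `h` outputs.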
The remainder of this section is devoted to highlighting the novelty and/or the most interesting findings of each of the listed research papers.
In [31], a multi-step-ahead prediction model covering 1 to 16 steps ahead (with data sampled every 15 min, thus resulting in a forecasting horizon from 15 min to 4 h) is obtained by combining a deep extreme learning machine (DELM) with enhanced colliding bodies optimization (ECBO) and Variational Mode Decomposition (VMD). The proposed model employs irradiance predictions from numerical weather prediction (NWP) and, as a first step, uses a grey correlation analysis coupled with Pearson correlation to find in the training data a day similar to the prediction day. In the second step, the VMD and ECBO methods are employed to decompose the original power data, which are then fed to the DELM to provide the final forecasts. The proposed DELM can be trained very quickly compared to a generic DL model. The decomposition method employed in this work is a novelty, as most previous works rely on the wavelet packet transform (WPT) or empirical mode decomposition (EMD). The model has been tested on a PV plant in China using a two-year dataset (2018-2019) with data sampled every 15 min, differentiating according to day conditions: sunny, cloudy and rainy. The authors claim very accurate results in the range of 4-8 steps ahead (1-2 h), but also state that a CNN+LSTM model can obtain even better results if enough data is provided, especially for longer forecasting horizons.
In the last several years, many different ML frameworks have been developed; this makes it easy to develop ML models and eventually deploy them in production. Some of these solutions provide what is known as Auto-ML (AML), i.e., an approach that can automatically select, train and optimize an ML model or an ensemble of ML models. This is what is proposed in [32], where AML is employed to derive an ensemble in which the features used by each building model are derived using an improved GA optimization method capable of selecting optimal features for each region. In this work, historical data coming from the PV plant (panel temperature and power generation) and weather data (temperature, irradiance, cloud cover, precipitation and humidity) are used in conjunction with the results of a physical model that provides power production as a function of the tilted solar irradiance perpendicular to the solar PV panel, the temperature of the solar PV panel and the ambient temperature. The dataset used spans 2 years, 2016-2017, with data sampled every 30 min, and is used for a multi-regional model, i.e., applied to data from plants in different regions. The ensemble selected by AML is made up of Elastic Net CV regression, Gradient Boosting Regression and RF Regression. Historical data of PV power plants located in Hokkaido (Northern Japan) from 1 January 2016 to 31 December 2017 is used for training, while only one month is used for testing (December 2017). This is one of the very few papers assessing the viability of AML in forecasting PV production. Interestingly enough, the models selected to build up the ensemble were previously rarely used in this field.
While most of the research is focused on the very short- to short-term forecasting horizon, long-term PV production forecasting is investigated in [33] using a grey box prediction model. In detail, an adaptive discrete grey model with a time-varying parameter, denoted as ATDGM(1,1), with a single variable and one order is used. This type of model does not require exogenous variables and belongs to the group of models generally applicable to time series prediction problems. More on grey methods can be found in [34].
For the first time, to the best of our knowledge, the concept drift issue is discussed in the field of energy forecasting in [35]. This work is about solar and wind energy forecasting and not PV power production, but it has been included in this paper as it employs some techniques that can be easily adopted in the field of interest and because it takes into consideration public datasets. An evolving Multivariate Fuzzy Time Series (e-MVFTS) is here adopted to forecast a time series, and its potential has been evaluated on solar energy data made available by the United States National Renewable Energy Laboratory (NREL) and on wind energy data extracted from the Global Energy Forecasting Competition 2012 (GEFCom2012) and published on the Kaggle platform repository. To allow for the complete reproducibility of the results, all code and data were made publicly available. The proposed method, combining a forecasting model based on Fuzzy Time Series with an evolving clustering method based on Typicality and Eccentricity Data Analytics (TEDA), can adapt to the concept drift that occurs in the time series, i.e., it can automatically deal with changes in the data distribution.
In [36] a hybrid model based on wavelet packet decomposition (WPD) and long short-term memory (LSTM) networks is proposed, which employs historical power and historical meteorological data as input variables, including global horizontal irradiance, diffuse horizontal irradiance, ambient temperature, wind speed and humidity. No forecasted irradiance is used in the model. WPD is applied to the original PV power series, obtaining four "sub-series"; each new derived series, augmented with the meteorological data, constitutes the input of an LSTM, and the results of the LSTMs are linearly weighted to provide the final forecast. Each LSTM provides a multi-step prediction. An LSTM network is also used in [18], where its inputs include historical PV power data, historical weather predictions and synthetic weather forecasts derived using the k-means clustering method to provide multi-step-ahead forecasts. The derived synthetic irradiance forecast results in an improvement in the model's accuracy that varies from 33%, compared to when an hourly categorical type of sky forecast is used, to 44.6%, compared to when a daily type of sky forecast is used. This work claims that the proposed LSTM DNN can perform better than the recurrent neural network (RNN), the generalized regression neural network (GRNN) and the ELM models. Again, a model with an LSTM DNN in conjunction with an RNN is applied to forecast PV production in [37]. This work introduces a time correlation modification (TCM) integrated with a partial daily pattern prediction (PDPP) framework. The main idea is that the ensemble resulting from LSTM-RNN+TCM can benefit both from the results of the time correlation model, which is closer to the actual data in trend, and from the results of the LSTM-RNN model, which is more capable of tracking the fluctuations of PV power output. Finally, the PDPP model is used to predict the pattern of the forecasting day so as to select an optimal set of weight coefficients to calculate the results using the output from both the LSTM-RNN model and the TCM model.
As its authors claim, the methodology of Transfer Learning (TL) first appears in the field of PV production forecasting in [38]. Transfer learning is a known technique employed in DNNs that consists of using a complex but successful pre-trained DNN model to "transfer" what it has learned from its specific domain to a similar but different domain. Transfer learning has been extensively adopted in the field of image classification/recognition for convolutional neural networks (CNNs).
The advantages of TL, which derive from relying on an existing successful pre-trained model, are:
• Its hyper-parameters and network structure, i.e., the number and types of layers, have already been tested and found to be successful;
• The earlier layers of a CNN essentially learn the basic features of the image sets, such as edges, shapes, textures, etc. Only the last one or two layers of a CNN perform the most complex task of summarizing the vectorized image data into the classification. The weights of the first layers are frozen, while only the last layers are trained for the specific task in the target domain; this turns out to be a faster training method.
In the field of PV power forecasting, this idea relies on transferring the knowledge of an LSTM pre-trained on historical irradiance time series to PV power series (irradiance being highly correlated with PV power), in order to cope with data scarcity in the target domain. The authors obtained interesting results that demonstrate how TL can be very beneficial for a new plant for which not enough historical data has been acquired.
To provide short-term predictions of PV power output, the authors in [39] propose the use of an ensemble method, LightGBM, combined both with a Bayesian optimization algorithm to find optimal time steps for temporal pattern aggregation and with a clustering-based training framework based on a tree-structured self-organized map (TS-SOM), proving its effectiveness in a production environment consisting of an edge computing platform (Raspberry Pi 3B) with limited storage. The proposed model, starting from historical meteorological data, applies three functional steps: a temporal pattern aggregation optimized using a Bayesian approach, a weather clustering performed by TS-SOM, and the final model training using LightGBM. When compared with common DL alternatives such as GRNN and LSTM, the authors showed that the proposed method performs better, with a dramatic decrease of both training time and inference time.
A hybrid model made up of a set of different ML-based methods is described in [40] to forecast PV power production in the short-term horizon. In the first step, an RF model is used to rank the weather-related input features (such as temperature, daily rainfall, horizontal radiation, diffuse horizontal radiation, etc.); then an improved grey ideal value approximation (IGIVA) model, receiving the results from RF as weight values, searches for similar days of different weather types to improve the training data. Then, the original power series is decomposed by a complementary ensemble empirical mode decomposition (CEEMD) algorithm, while a backpropagation NN (BPNN) trained using a dynamic factor PSO method (DIFPSO) is used to provide the short-term PV power generation forecast.
Again, the short-term horizon is investigated in [41] using an ensemble model made up of two LSTMs with an Attention Mechanism (AM) working on the temperature and power time series, respectively, whose results are flattened and merged by a fully connected layer. The AM in DL is based on the concept of directing a model's main focus by paying greater attention to certain factors when processing the data. In broader terms, attention is one component of a network's architecture and is in charge of managing and quantifying the interdependence between the input and output elements (General Attention) or within the input elements (Self-Attention). The authors showed that AM can effectively improve LSTM performance.
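To make the idea of "paying greater attention to certain factors" concrete, the following minimal sketch (assuming NumPy) computes attention weights over a sequence of hidden states using generic scaled dot-product scoring. It is not the architecture of [41]: the random hidden states, the choice of the last state as query and the function names are assumptions made for illustration only.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def temporal_attention(hidden_states, query):
    """Weight each time step by its similarity to the query.

    hidden_states: array of shape (T, d), e.g., the outputs of a recurrent layer.
    query: array of shape (d,).
    Returns the context vector (d,) and the attention weights (T,).
    """
    d = hidden_states.shape[1]
    scores = hidden_states @ query / np.sqrt(d)  # one score per time step
    weights = softmax(scores)                    # non-negative, sum to 1
    context = weights @ hidden_states            # weighted average of the states
    return context, weights

# Stand-in for 24 hourly hidden states of size 8 (random, for illustration).
rng = np.random.default_rng(0)
H = rng.normal(size=(24, 8))
context, weights = temporal_attention(H, query=H[-1])
```

The `weights` vector quantifies how much each past time step contributes to the summary `context`, which a downstream layer would then use to produce the forecast.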
The public dataset of the GEFCom2014 competition is used to forecast PV generation for one day ahead with data sampled hourly in [42]. Here, an ensemble method with cluster analysis is proposed. A k-means algorithm is used to cluster solar generation, and the result of each cluster is used in an ensemble, by ridge regression, of RF models.
Every ML-based method, being data-driven, needs an adequate amount of data; this means that, before being able to provide a forecast, it is necessary to acquire data for a non-negligible amount of time, ideally at least one year to take into account annual seasonality. In this regard, methods such as generative adversarial networks (GANs) could be useful to derive enough data for training an ML-based method. In [43] a recurrent generative adversarial network (R-GAN) is used to generate realistic energy consumption data by learning from real data. Although not strictly pertinent, this work has been included because, in the authors' opinion, such an approach could be effectively used in PV production forecasting, for example to generate weather or power data for rainy or cloudy conditions, which are usually the conditions resulting in lower-accuracy predictions.
While papers listed so far are related to what is known as "point-forecast", a far fewer number of papers have been published during the last several years concerning probabilistic forecasting. In this regard, some international forecasting contests, for example, M3 and M4 forecasting competitions, have contributed to encouraging the production of such types of forecasting results. These contests have highlighted some concepts, such as prediction interval (PI) and probability coverage, and some metrics more suitable for this type of forecast, such as pinball loss. For more information concerning this contest see [44][45][46]. In [47], the authors have provided a point-forecast with a confidence interval (CI) which quantifies the uncertainties associated with the forecasts delivered by mean of a bandwidth of possible changes and the certainty associated with each forecast. In this research, the authors employ a bootstrapping method to compute the CI. It is here interesting to highlight that confidence interval (CI) and prediction interval (PI) are completely different concepts, with the first being far narrower than the second (see [48,49]). In this paper the short-term forecasting horizon 1-6 h is explored; the main novelty resides in the considered PV plant size, a large multi-megawatt PV system (a 75 MW plat with 84 inverters), for which a new approach consisting of macro-level models results into a marginal improvement in accuracy compared to the usual inverter-level model approach. The proposed model uses an FFNN, an LSTM-RNN and a gated recurrent unit-RNN (GRU-RNN). The same CI criteria are used to provide a probabilistic description of the accuracy provided by a Gaussian Process Regression with Matérn 5/2 as kernel function in [50]. As commonly employed in the forecasting PV output field, the proposed model uses meteorological data (irradiance, temperature, and zenith and azimuth solar positions) and historical PV output as inputs. 
A k-means algorithm is used to cluster data into four groups based on solar output and time. The proposed model is validated using data from five PV plants and both a five-fold CV procedure and a hold-out one (using 30 random days as a test set). A first work more oriented towards the probabilistic forecasting of PV production, which summarizes the models' accuracy in terms of the PI, is [51]. Here, in addition to the usual point forecasting metrics such as RMSE and MAE, the prediction interval coverage probability (PICP) and the prediction interval normalized average width (PINAW) are introduced; the first metric estimates the reliability of the prediction, based on the probability that the real PV power lies within the PIs, while the latter measures the width of the PIs. This paper considers an hourly day-ahead forecasting horizon and sampling, and introduces a CNN combined with a quantile regression (QR) method, with a two-stage training strategy to cope with the non-differentiable loss function of QR. The results obtained with the described model are very interesting, also in comparison to those obtained by a quantile extreme learning model (QELM), a quantile echo state network (QESN), direct quantile regression (DQR) and RBFNNs.
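The interval metrics mentioned above can be made concrete with a short sketch. It assumes the commonly used definitions: the prediction interval coverage probability (PICP) is the fraction of observations inside the interval, PINAW is the mean interval width normalized by the range of the observations, and the pinball loss is the usual quantile loss. The function names and the sample values are illustrative only and are not taken from the cited papers.

```python
def picp(y_true, lower, upper):
    """Fraction of observations that fall inside [lower, upper]."""
    inside = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return inside / len(y_true)


def pinaw(y_true, lower, upper):
    """Mean interval width divided by the range of the observations."""
    target_range = max(y_true) - min(y_true)
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)
    return mean_width / target_range


def pinball_loss(y_true, y_quantile, q):
    """Average pinball (quantile) loss for a quantile level q in (0, 1)."""
    total = 0.0
    for y, yq in zip(y_true, y_quantile):
        diff = y - yq
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# Hypothetical hourly PV power (kW) with 10% and 90% quantile forecasts.
power = [0.0, 1.2, 3.5, 4.8, 2.1]
q10 = [0.0, 0.8, 2.9, 4.9, 1.5]
q90 = [0.2, 1.6, 4.0, 5.5, 2.6]

coverage = picp(power, q10, q90)
width = pinaw(power, q10, q90)
loss_q10 = pinball_loss(power, q10, 0.1)
```

A narrow interval (low PINAW) is only useful if the coverage stays close to the nominal level, which is why the two metrics are normally reported together.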
Another research paper considering probabilistic forecasting is [52]; here, the authors use a hybrid model made up of a wavelet transform (WT) applied to historical PV power data and an RBFNN trained using a PSO algorithm. The proposed hybrid model provides the point forecast, while an indirect method, the bootstrap, is employed to construct the PI. The PIs obtained using the bootstrap are compared, using reliability diagrams, to direct and indirect QR; from this comparison, the bootstrap emerges as a paramount factor in determining the better performing model.
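As an illustration of how an indirect, bootstrap-based PI can be built around a point forecast, the following minimal sketch resamples past forecast residuals and reads the interval bounds from the empirical distribution of the simulated outcomes. This is a simplified residual bootstrap chosen for illustration; it is an assumption, not the procedure of [52], and the function name, residual values and parameters are hypothetical.

```python
import random


def bootstrap_prediction_interval(point_forecast, residuals, level=0.9,
                                  n_boot=2000, seed=0):
    """Residual bootstrap: add resampled past errors to the point forecast."""
    rng = random.Random(seed)
    simulated = sorted(point_forecast + rng.choice(residuals)
                       for _ in range(n_boot))
    alpha = (1.0 - level) / 2.0
    # Empirical quantiles of the simulated outcomes give the interval bounds.
    lower = simulated[int(alpha * (n_boot - 1))]
    upper = simulated[int((1.0 - alpha) * (n_boot - 1))]
    return lower, upper


# Hypothetical past forecast errors (kW) and a new point forecast of 3.0 kW.
past_residuals = [-0.6, -0.3, -0.1, 0.0, 0.1, 0.2, 0.4, 0.7]
low, high = bootstrap_prediction_interval(3.0, past_residuals)
```

More elaborate variants retrain the forecasting model on resampled training sets so that model uncertainty is captured as well as noise.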
An Analog Ensemble (AnEn) model is used in [53]; the authors, starting from the AnEn developed in [54], have further improved the metric adopted therein to allow the management of data both from NWP and from satellite images (used to derive GHI time series data), where the probability density function (PDF) of the analogue ensemble is built using a weighted kernel density estimation (KDE) method. Results are compared with a quantile regression forest (QRF) model and a Bayesian Regression (BR) model with an Automatic Relevance Determination (ARD) prior. Forecasting results are described in terms of PINAW and Continuous Ranked Probability Score (CRPS) and show that the proposed model performs better than QRF and BR for forecasting horizons shorter than two hours, while above this threshold QRF seems to perform better. The dataset of the 2014 Global Energy Forecasting Competition (GEFCom2014) is used in [55] to test a novel method able to provide a probabilistic forecast. The proposed method, named nearest neighbours quantile filter (NNQF), solves the problem of training quantile regressions with gradient-based optimization by deriving a modified training set. This modified training set can be used to train a generic regression model that directly outputs the conditional empirical q-quantile defined by the neighbours used in the training. The results achieved show that the proposed method obtains accuracies similar to those of the winners of the GEFCom14 competition, with a difference in pinball loss values below 1%.
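The idea of deriving a modified training set can be sketched as follows: each training target is replaced by the empirical q-quantile of the targets of its k nearest neighbours in feature space, so that any standard regressor trained on the new targets approximates that quantile. This is a minimal sketch under stated assumptions (Euclidean distance, each sample counted among its own neighbours, a simple order-statistic quantile); the implementation details of [55] may differ, and all names and data are illustrative.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def empirical_quantile(values, q):
    """Smallest value with at least a fraction q of the sample at or below it."""
    ordered = sorted(values)
    index = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[index]


def nnqf_targets(features, targets, q, k):
    """Replace each target by the q-quantile of its k nearest neighbours' targets."""
    filtered = []
    for x in features:
        order = sorted(range(len(features)),
                       key=lambda j: squared_distance(x, features[j]))
        neighbour_targets = [targets[j] for j in order[:k]]  # includes the sample itself
        filtered.append(empirical_quantile(neighbour_targets, q))
    return filtered


# Hypothetical one-dimensional features (e.g., forecast irradiance) and PV power targets.
features = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
targets = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]

median_targets = nnqf_targets(features, targets, q=0.5, k=3)
upper_targets = nnqf_targets(features, targets, q=0.9, k=3)
```

A separate regressor would then be trained on each filtered target vector, one per quantile level, with an ordinary squared-error loss.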

The Latest Research on Anomaly Detection (a.k.a. Fault Detection) and Diagnosis in PV
This section reports the latest research papers, i.e., those published during the years 2018-2019, concerning anomaly detection (AD), also referred to in some papers as fault detection (FD), in PV.
This research field counts fewer papers than PV power forecasting, but it is a very interesting field in terms of the suitability of ML-based methods to automatically detect and classify anomalies or, better, to provide predictive maintenance. PV plants are subject to many different faults during their life; these faults can lead simply to a power loss or even pose a fire hazard. To give an idea of the likelihood of power loss coming from faults, this can vary from 3.6% during the first year of life to 18.9% after three years of life, as stated in [57], which analyzed some domestic PV systems in the UK. Typical PV faults can be detected automatically by ML-based methods using essentially three methodologies:
1. Analysis of string/panel current and/or voltage, or of current/voltage measured at the inverter, with the use of exogenous variables such as environmental ones,
2. Image analysis, performed mainly on infrared (IR) images acquired by Unmanned Aerial Vehicles (UAVs),
3. Clustering-based techniques that can detect anomalies using unlabelled data.
For the methodology at point 1, the most frequently used methods include ANN, FL, Decision Tree (DT) and RF. For point 2 above, DL is the most suitable, and various types of CNN have been employed in this regard.
The third methodology essentially comprises k-Nearest Neighbour (kNN), one-class SVM (1-SVM) or more recent algorithms such as Isolation Forest (IS) or Local Outlier Factor (LOF). This field of research often deals with datasets of unlabelled data and/or datasets in which faults are, fortunately, very few, resulting in a highly unbalanced dataset (few faults and a majority of fault-free data). For this reason, the plain accuracy metric is not well suited to represent the model's performance. Nonetheless, many papers report only traditional accuracy, while better metrics could be Balanced Accuracy, F1 score [58], Cohen's Kappa [59] or the Matthews Correlation Coefficient (MCC) [60]. Moreover, for the reason outlined above, very often the dataset used to train and test the model is simulated ad hoc and not derived from a real plant. This overcomes the problems related to an unbalanced dataset, since any desired number of faults can be created/simulated, and resolves the labelling issue, i.e., accurately describing what type of fault occurred, where and at which timestamp; at the same time, however, such data might not be representative of a real operating plant. The optimal approach is probably to employ both simulated and real data with ad hoc created faults. The remainder of this section will present:
• A discussion of anomalies/faults analyzed in literature with ML-based methods
• Suggestions on which approach from the most current literature review (from 2018 till 2021) seems to produce better results
• Common challenges and insight on possible future trends
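To make the earlier point about accuracy on unbalanced fault datasets concrete, the sketch below computes plain accuracy next to Balanced Accuracy, F1 score, MCC and Cohen's Kappa from a binary confusion matrix, using their standard definitions. The confusion-matrix counts are hypothetical (20 faults in 1000 samples, of which only 5 are detected), and the function name is illustrative.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard metrics from a binary confusion matrix (fault = positive class).

    Degenerate matrices (e.g., no positive predictions) raise ZeroDivisionError.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    balanced_accuracy = (recall + specificity) / 2.0
    f1 = 2.0 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Cohen's Kappa: observed agreement corrected for chance agreement.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - expected) / (1.0 - expected)
    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "f1": f1,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical detector: 5 of 20 faults found, 5 false alarms among 980 healthy samples.
scores = binary_metrics(tp=5, fp=5, fn=15, tn=975)
```

With these counts the accuracy is 0.98 even though three quarters of the faults are missed, whereas the other four metrics all stay well below 0.7.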

Detectable Faults by ML-Based Methods
Faults in PV can be of different types; for in-depth analysis of faults that can adversely affect PV plants see [61,62].
In the literature, the vast majority of works deal with four types of faults: short circuit (SC), open circuit (OC), partial shading (PS) and abnormal ageing. For these types of faults, the most employed solution is based on an MLP ANN that takes as inputs the current or voltage of the string/array/panel, so the most frequent variables taken into account are the voltage at MPP (V_MPP), the current at MPP (I_MPP), the OC voltage (V_OC) and the SC current (I_SC), almost always supported by environmental variables such as ambient and module temperature and solar irradiance at the panel level. These models necessarily require a labelled dataset and are mainly based on the difference between the system performance predicted by the model and the real measured one. Many ML-based models that employ SNNs apply input pre-processing such as the Discrete Wavelet Transform (DWT); this is a typical form of feature engineering that has proven to be beneficial for improving the FD accuracy of the model. For the faults described so far, the models usually employed consist of SNNs of various typologies, but also of DT ensembles such as RF, or of 1-SVM. Faults detectable using image analysis, such as module delamination/cracks, hotspots or soiling (dust and birds' droppings), are a field dominated by DL and especially by CNNs trained on thermal infrared (IR) images acquired by UAV. For detecting faulty cells or modules, electroluminescence (EL) images are also considered, while at the array level only IR images are used; generally, EL images embed more fault information and are the preferred type of image. The types of CNN used in this field vary from known pre-trained CNN architectures, such as LeNet and VGG-16, to custom architectures. This is a field where Transfer Learning [38] can be very beneficial and where data augmentation techniques are also very common (image rotation, flip, etc.).
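As an example of the DWT-based feature engineering mentioned above, the following minimal sketch applies a multi-level Haar wavelet decomposition to a sampled signal and uses the energy of the detail coefficients at each level as features. The Haar wavelet, the choice of energy features and the sample signal are assumptions made for illustration; the cited works may use other wavelets and features.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def detail_energies(signal, levels):
    """Energy of the detail coefficients at each decomposition level."""
    energies = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        energies.append(sum(d * d for d in detail))
    return energies


# Hypothetical string current (A): steady operation versus a short transient dip.
steady = [5.0] * 8
with_dip = [5.0, 5.0, 5.0, 5.0, 5.0, 1.0, 5.0, 5.0]

steady_features = detail_energies(steady, levels=2)
dip_features = detail_energies(with_dip, levels=2)
```

The steady signal yields zero detail energy at every level, while the transient shows up as non-zero energy, which is the kind of contrast a downstream classifier can exploit.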
Although CNNs are particularly suited to dealing with 2D data, i.e., images (usually IR or EL), some interesting results have been obtained by treating a 1D signal, such as a current-voltage (I-V) curve, as a 2D feature using, for example, a scalogram, and by combining a CNN with an LSTM.
Table 4 reports some recent review papers, all published in the range 2018-2021, dealing with ML-based models to detect faults/anomalies in PV.

| Year | Reference | Notes |
|---|---|---|
| 2018 | [66] | A review of applicable methods, ML-based but also statistical-based, to FDD in PV. Highlights that most methods employ I-V curve data but also irradiance and module temperature. |
| 2018 | [62] | An in-depth analysis of all major faults that can affect PV systems is accompanied by a complete list of methodologies that can be employed to detect and diagnose faults. Only a small section is devoted to ML-based methods. |
| | [61] | After describing all major faults that can occur in PV, it focuses on FDD methods especially suited for faults occurring in a PV array: statistical, I-V analysis, power loss analysis, voltage and current measurement and AI-based. This paper concludes by highlighting the pros and cons of each method with some recommendations and insight into possible future trends. |
| | [67] | Analyzes all major faults that can affect PV with a review of methods in the literature for PV fault monitoring and detection. Emphasizes how statistical methods do not require previous data but cannot identify failure types. On the other hand, numerical methods can detect failure types, but require knowledge of previous data. Knowledge model-based methods using residual current, voltage or power can provide fault detection and identification but require historical data and also meteorological data. |
The remainder of this paragraph is devoted to the latest research papers dealing with ML methods for anomaly/fault detection in PV. Paper [68] focuses on the detection of hotspots using a hybrid SVM-based model trained on infrared thermography (IRT) images; it classifies panels into three categories: healthy, non-faulty hotspot and faulty hotspot. The novelty of this paper resides in the pre-processing phase of the IRT images, which are acquired by a handheld FLIR camera horizontally aligned to the PV panels of a PV system made up of 22 modules. The proposed image feature extraction pipeline results in 41 features: 3 RGB, 12 contrast, 12 correlation, 3 energy, 1 Histogram of Oriented Gradient and 10 Local Binary Pattern. The proposed feature extraction improves the accuracy of the following classification algorithms: KNN, n-Bayes, Quadratic Discriminant Analysis (QDA) and bagging ensemble (BE). The SVM performed the best, also in terms of computing time (a k-fold CV methodology was applied to derive all metrics). An LSTM NN is used in [69], combined with DWT as a feature extraction phase, to detect High Impedance Faults (HIF) and four other faults in an IEEE 13-bus system with a solar PV network simulated in MATLAB/Simulink. Results from the proposed LSTM classifier are compared with other ML-based methods: SVM, Naïve Bayes and J48 Decision Tree. The models' performance, measured using several metrics (F-Measure, Recall, Precision, CM, Kappa Statistic), clearly shows the LSTM model as the best performing.
Line-to-Line (LL) faults are automatically detected in [70] using an SVM model whose hyper-parameters are selected using a GA. This model employs features extracted from DC I-V data resulting from a simulation model (developed with MATLAB/Simulink) of a PV plant. The GA is also used to extract optimal features for detecting LL faults even in the case of low mismatch and high impedance. A total of ten features are extracted from the simulated data, and all features are related to I-V curves under normal and fault events based on three points: short-circuit current, MPP and open-circuit voltage. Results show that the Gaussian kernel is optimal for the SVM model, together with two or three features from the whole set of ten. An emulated GCPV system (emulated not in software but by a dedicated hardware simulator) is used in [71] to test two novel methods, RK-RFKmeans and RK-RFED. Faults emulated at the grid side are open-circuit (F1) and standalone mode protection (F3), while those on the PV side are poor connection and/or erroneous reading (F2), open-circuit/short-circuit/sudden disconnection (F5) and partial shading from 10-20% (F4). This paper introduces two new RF classifiers based on RK-RF that extract nonlinear features using a reduced kernel PCA (RK-PCA) technique to decrease the computational complexity of K-PCA for large data sets. The data reduction is based on two schemes: the Euclidean distance metric and K-means clustering. Comparison with ML-based methods such as SVM, DT, ND, DA, KNN and RNN shows that the two proposed methods perform very well.
A novel approach based on a 2D CNN is proposed in [72]; this CNN is trained with 2D scalograms derived from PV system data. The 2D CNN is proposed in two configurations: one derived from a pre-trained AlexNet CNN in which the last three layers are fine-tuned to provide a six-way classifier, and another where the outputs of a pre-trained AlexNet layer (fc7) are used with a classical classifier (RF and SVM). The faults considered detectable with the proposed approach are PS, LL, OC, arc-fault and faults (LL and OC) under PS. Good results are obtained by the fine-tuned AlexNet but also by the pre-trained AlexNet + SVM. This paper also outlines how data from the MPPT (Imax and Vmax) are significant for obtaining good accuracy (performance halves without these data). A hierarchical model for anomaly detection and a multimodal classifier to recognize five common faults in PV are proposed in [73]. The anomaly detection is realized in two steps: an Auto Gaussian Mixture Model (Auto-GMM) acts as an unsupervised ML model to detect anomalies, and its output is further filtered using an auto-thresholding methodology applied to a local anomaly index (LAI) that is derived for each probable anomaly. For the classification, the authors propose a multimodal feature extraction procedure based on the Fourier spectrum derived from PV string currents. Three classifiers are compared in classifying five common PV faults: SVM, bagging and XGBoost. With the extracted multimodal features, the XGBoost model proved to perform the best.
In Table 5 are reported some recent review papers dealing with ML-based models for fault/anomaly detection and diagnosis in PV.
Software simulation + laboratory PV system; RF using only voltage and string currents from PV array, optimized with grid search (out-of-bag accuracy)

The Latest Research on MPPT in PV
Regardless of the application, PV systems are expected to be operated in such a manner that maximum power can be extracted from the installed system.
The energy output of a PV system is sensitive to variations in weather conditions; in particular, it is dependent on solar radiation and temperature. Variations in cloud cover, fog and heat affect the PV system's conversion efficiency. Dust and other particles floating in the air or covering the panel can drastically decrease the efficiency of the power conversion process as well [76].
Under these conditions, the power-voltage curve of the PV array exhibits multiple local maximum power points (MPPs). However, only one of these MPPs corresponds to the global MPP (GMPP), where the PV array produces the maximum total power [77] (Figure 2). Any change in the output voltage because of the change of load or other reasons will cause the PV panel to produce less power than the maximum. Therefore, the controller of the power converter that is connected at the output of the PV array must execute an effective global MPP tracking (GMPPT) process to continuously operate the PV array at the GMPP during continuously changing weather conditions.
Appl. Sci. 2021, 11, x FOR PEER REVIEW 20 of 41

Consequently, many research efforts are focused on finding ways to drive PV panels to their maximum output power at all weather conditions, thus ensuring their profitability [78].
In Table 6 a list of papers that provide a review on PV MPPT techniques is shown.

| Year | Reference | Notes |
|---|---|---|
| 2021 | [79] | The paper provides a comparative and comprehensive review of some relevant PSO-based methods taking into account the effects of important key issues such as particles initialization criteria, search space, convergence speed, initial parameters, performance with and without partial shading and efficiency. |
| | [80] | The paper intends to review the previous articles and provide a proper division, performance method. This explains the performance, application, advantages and disadvantages of algorithms to be a good reference for selecting the appropriate algorithm. Algorithms in the presented paper are divided into four categories: methods based on measurement, calculation, intelligent schemes and hybrid schemes. |
| | [81] | The paper represents a review of two modern techniques used in solar photovoltaic systems which enhance the extraction of maximum output power in an efficient manner. The Artificial Intelligence-Based MPPT Techniques for PV Applications and a Forecasting System of Solar PV Power Generation using Wavelet Decomposition and Bias-compensated RF are reviewed and compared in the paper. |
| | [82] | The paper presents an organized and concise review of MPPT techniques implemented for the PV systems in literature along with recent publications on various hardware design methodologies. Their classification is done into four categories, i.e., classical, intelligent, optimal and hybrid depending on the tracking algorithm utilized to track MPP under PSCs. |
| | [83] | The review of MPPT techniques proposed in the paper has been grouped into two groups. The first group includes all the benchmark facilities. The second group includes the intelligent techniques that explain the fuzzy-based MPPT, ANN-based MPPT evolutionary techniques, hybrid methods and MPPT techniques used in energy harvesting. |
| 2020 | [84] | In the presented paper, a compendious study of different Swarm Intelligence (SI)-based MPPT algorithms for PV systems feasible under partially shaded conditions are presented. The methods are compared in terms of their swarm intelligence and advantages. |
| 2020 | [86] | The presented study gives an extensive review of 23 MPPT techniques present in literature along with recent publications on various hardware design methodologies. MPPT classification is done into three categories, i.e., Classical, Intelligent and Optimisation depending on the tracking algorithm utilised. During uniform insolation, classical methods are highly preferred as there is only one peak in the P-V curve. The paper furnishes the hardware information of the particular technique by different authors performed in various platforms with their tracking speeds and efficiencies. In addition, the parameters of these techniques, their flowcharts and a clear explanation of MPPT algorithm implementation are explained in brief. The fundamental objective is to give ongoing innovation advancements in MPPT techniques. |
| 2020 | [87] | The main MPPT techniques for PV systems are reviewed and summarized and divided into three groups according to their control theoretic and optimization principles: Traditional MPPT methods, MPPT methods based on intelligent control and MPPT methods under PSCs. In particular, the advantages and disadvantages of the MPPT techniques for PV systems under PSCs are compared and analyzed. |
| 2020 | [88] | This paper reviews (extensively) the most used MPPT algorithms. They are classified into three groups: (1) |
| 2019 | [89] | This study provides an extensive review of the current status of MPPT methods for PV systems which are classified into eight categories (methods based on mathematical calculations, constant parameters-based methods, measurement and comparison-based methods, trial and error based methods, numerical methods, intelligent prediction based methods and methods based on iterative in nature). The categorization is based on the tracking characteristics of the discussed methods. The novelty of this study is that it focuses on the key characteristics and 11 selection parameters of the methods to make a comprehensive analysis, which is not considered together in any review works so far. Again, the pros and cons, classification and immense comparison among them described in this study can be used as a reference to address the gaps for further research in this field. A comparative review in tabular form is also presented at the end of the discussion of each category to evaluate the performance of these methods, which will help in selecting the appropriate technique for any specific application. |
| 2018 | [90] | The paper focuses mainly on a review of advancements of MPPT techniques of PV systems subjected to partial shading conditions (PSC) to help the users to make the right choice when designing their system. The choice of MPPT depends on several parameters such as the application, hardware availability, cost, convergence speed, precision, and system reliability. |
MPPT methods can be classified into indirect and direct methods [91]. The indirect methods, such as the open-circuit and short-circuit methods, require prior knowledge of the PV array characteristics or are based on mathematical relationships that do not hold under all meteorological conditions. Therefore, they cannot precisely track the MPP of the PV array at any irradiance and cell temperature. For this kind of method, temperature and irradiance must be used as sensed parameters, but their measurement requires expensive devices that have to be placed throughout the PV array to obtain the values of such variables for each panel or group of panels, thus making the measurement very expensive, especially for large PV plants. On the other hand, direct methods work under any meteorological condition. The most used direct methods are [6]: P&O, IncCond and ML-based MPPT methods. These methods control the reference signal of a DC-DC converter that matches the PV module voltage with that of the DC bus or works as a battery charger [7]. In the P&O method, the controller adjusts the voltage by a small amount and observes the power change; if the power increases, it keeps adjusting the operating voltage in that direction until the output power no longer increases. The IncCond method is based on the fact that the slope of the power-voltage curve of the PV array is zero at the MPP, positive on the left and negative on the right of the MPP. The controller evaluates the effect of a voltage adjustment by measuring the incremental changes in the PV array output. However, the effectiveness of the P&O and IncCond methods is limited by steady-state oscillation and diverging tracking direction, and they can even fail to identify the global optimal power point under some special conditions, such as an abrupt irradiance change due to shading.
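The P&O logic described above, including the steady-state oscillation it suffers from, can be illustrated with a minimal sketch. The power-voltage curve used here is a hypothetical single-peak parabola with its maximum at 30 V, not a PV model, and the step size and iteration count are arbitrary illustrative choices.

```python
def pv_power(voltage):
    """Hypothetical single-peak P-V curve (W) with its maximum at 30 V."""
    return max(0.0, 200.0 - 0.5 * (voltage - 30.0) ** 2)


def perturb_and_observe(power_fn, v_start, step=0.5, iterations=100):
    """Fixed-step P&O: keep perturbing in the same direction while power rises."""
    voltage = v_start
    power = power_fn(voltage)
    direction = 1.0
    history = [voltage]
    for _ in range(iterations):
        voltage += direction * step
        new_power = power_fn(voltage)
        if new_power < power:
            direction = -direction  # power dropped: reverse the perturbation
        power = new_power
        history.append(voltage)
    return history


trace = perturb_and_observe(pv_power, v_start=20.0)
final_voltage = trace[-1]
steady_state_span = max(trace[-8:]) - min(trace[-8:])
```

Once the peak is reached, the operating point keeps moving back and forth around it by one step on each side, which is the steady-state oscillation mentioned above; on a multi-peak curve the same logic can settle on a local maximum.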
Therefore, more intelligent MPPT techniques based on machine learning methods have been proposed to obtain better transient and steady-state performance. Intelligent techniques (i.e., FL- and ANN-based MPPT methods) are more efficient and respond faster, but they are more complex than the conventional techniques, which are generally simple, cheap and less efficient [91]. ANN-based methods have shown their advantages under rapidly varying irradiance [92], especially regarding response efficiency. Moreover, the performance of ML approaches depends heavily on the accuracy of the trained model, which is determined by the quality of the training data, and frequent calibration is needed as the system evolves.
In Table 7, several papers that use ML approaches to improve MPPT performance are analyzed, ordered by year of publication. In particular, the table is useful to underline the ML methods that have been used most frequently and the results that the different approaches achieve. Unfortunately, results are not always presented in such a way that they can be compared with those of other similar papers.
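When a paper reports the power reached by its tracker together with the power available at the MPP, results can be put on a common footing with a power ratio and a few standard error statistics. The sketch below assumes the textbook definitions of the mean error, mean square error, root mean square error, mean absolute error and the (population) standard deviation of the reached values; the exact forms used in the individual papers may differ, and the function names and sample values are illustrative.

```python
import math


def tracking_ratio(p_reached, p_gmpp):
    """Ratio between the reached power and the power at the global MPP."""
    return p_reached / p_gmpp


def error_statistics(reached, reference):
    """Textbook error statistics between reached and reference power values."""
    n = len(reached)
    errors = [r - t for r, t in zip(reached, reference)]
    mean_reached = sum(reached) / n
    mse = sum(e * e for e in errors) / n
    return {
        "ME": sum(errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        # Spread of the reached values around their own average.
        "sigma": math.sqrt(sum((r - mean_reached) ** 2 for r in reached) / n),
    }


# Hypothetical results of four tests (W): reached power versus power at the MPP.
reached_power = [95.0, 98.0, 100.0, 97.0]
mpp_power = [100.0, 100.0, 100.0, 100.0]

ratio_first_test = tracking_ratio(reached_power[0], mpp_power[0])
stats = error_statistics(reached_power, mpp_power)
```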
In particular, some papers present results by comparing the power reached by the proposed solution with the power at the global MPP [77,93-95]. In these cases, to compare the results obtained in the different papers, the ratio between the reached power, P_reached, and the power that should be obtained, P_GMPP, has been calculated as:

$$\frac{P_{reached}}{P_{GMPP}} \qquad (1)$$

In some papers, other statistical errors have been used to compare the reached power with the power at the MPP: the Mean Error (ME in Equation (2)) [96], the Mean Square Error (MSE in Equation (3)) [96], the Standard Deviation error (σ in Equation (4)) [96], the Root Mean Square Error (RMSE in Equation (5)) [76,97], the Mean Absolute Error (MAE in Equation (6)) [97], the overall power tracking efficiency (η in Equation (7)) [98] and a quality indicator that provides information about the ability of the ANN to predict the MPP (QI1 in Equation (8)) [99], where N is the number of tests and µ is the average of the reached values.

ANN, segmentation-based approach and hill-climbing: The paper deals with the feasibility study and implementation of a novel easy and cost-effective hybrid two-stage GMPPT algorithm. The first stage synergically combines two different methods to predict the optimal operating condition: an ANN-based algorithm and a segmentation-based approach. A traditional hill-climbing method is used in the second stage to finely track MPP. Various ANN structures have been implemented and tested.
The authors propose a simple MPPT algorithm that is based on the neural network (NN) model of the photovoltaic module. The expression for the output current of the NN model is used to develop an analytical, gradient MPPT algorithm which can provide high prediction accuracy of the maximal power. Finally, to avoid the usage of the pyranometer, a simple irradiance estimator, which is also based on the identified NN model, has been proposed. The presented algorithm has smaller computational complexity compared to the other NN-based MPPT algorithms, in which the MPP position is predicted by one multilayer NN or by two single-layer NNs.
Relative error between the predicted and true maximal power:

Two reinforcement learning-based MPPT (RL MPPT) methods are proposed by the use of the Q-learning algorithm. One constructs the Q-table and the other adopts the Q-network. These two proposed methods do not require the information of an actual PV module in advance and can track the MPP through offline training in two phases: the learning phase and the tracking phase. A Markov decision process (MDP) model is suitable for describing the interaction between the circuit connected to the PV module and the controller. An MDP model consists of four elements, which are state, action, transition and reward. With the MDP model described, an RL-QT MPPT method is proposed by constructing the Q-table to perform MPPT control. However, the state representation needs to be discretized for the tabular method, which may cause a loss of MPPT control accuracy. Therefore, a Q-network-based MPPT method is proposed. In the RL-QN MPPT method, the Q-table is approximated by a neural network, so that the discretization of the states is not needed.

The paper presents a novel hybrid technique for tracking the maximum power point of the photovoltaic panel. This approach includes two loops. The first one is the ANN loop, which is used to quickly predict the desired voltage; this minimizes the calculation and allows a rapid system response. The second loop consists of a combination of the sliding mode and backstepping control approaches; its main aim is to track the reference voltage generated by the ANN loop, and its second purpose is to obtain a rapid, robust and accurate system under various and difficult changes of meteorological conditions. The proposed technique is compared with the conventional algorithms and with the hybrid controllers, ANN combined with the integral sliding mode controller and ANN combined with the backstepping controller, to prove its effectiveness and tracking performance.
A customized MPPT design was proposed to determine the optimal step sizes according to three different weather types. The weather-type labelling was automatically provided by a supervised learning classification system. Two classical machine learning technologies, SVM and ELM, were employed and compared. The classification probability from SVM or ELM is deployed as the confidence level and is designed as a fuzzy-weighted classification system to further improve the MPPT design.

As can be noted in Table 7, almost all the papers test their algorithms by simulation. Only in [78] do the authors propose both simulation and experimental results. This may be because the computational load of ML algorithms is hard to reconcile with the characteristics of the hardware that can be used in PV fields.
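The tabular Q-learning approach summarized among the Table 7 entries (states, actions and a reward defining an MDP, with a Q-table learned first and then used for tracking) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the design of the cited work: the state is a discretized operating voltage, the actions lower, hold or raise the voltage by one level, the reward is the resulting change in power, and the P-V curve is a hypothetical single-peak parabola.

```python
import random

# Discretized operating voltages (V): the states of the toy MDP.
VOLTAGES = [10.0 + 2.0 * i for i in range(16)]
ACTIONS = (-1, 0, 1)  # lower, hold or raise the voltage by one level


def pv_power(voltage):
    """Hypothetical single-peak P-V curve (W) with its maximum at 30 V."""
    return max(0.0, 200.0 - 0.5 * (voltage - 30.0) ** 2)


def step(state, action):
    """Deterministic transition; the reward is the change in output power."""
    next_state = min(max(state + action, 0), len(VOLTAGES) - 1)
    reward = pv_power(VOLTAGES[next_state]) - pv_power(VOLTAGES[state])
    return next_state, reward


def greedy_action(q_table, state):
    """Index of the action with the highest Q-value in the given state."""
    return max(range(len(ACTIONS)), key=lambda i: q_table[state][i])


def train_q_table(episodes=2000, steps=30, alpha=0.5, gamma=0.9,
                  epsilon=0.3, seed=0):
    """Learning phase: epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    q_table = [[0.0] * len(ACTIONS) for _ in VOLTAGES]
    for _ in range(episodes):
        state = rng.randrange(len(VOLTAGES))
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = greedy_action(q_table, state)
            next_state, reward = step(state, ACTIONS[action])
            target = reward + gamma * max(q_table[next_state])
            q_table[state][action] += alpha * (target - q_table[state][action])
            state = next_state
    return q_table


def track(q_table, start_state, steps=40):
    """Tracking phase: follow the greedy policy from a given starting state."""
    state = start_state
    for _ in range(steps):
        state, _ = step(state, ACTIONS[greedy_action(q_table, state)])
    return VOLTAGES[state]


q_table = train_q_table()
voltage_from_low = track(q_table, start_state=0)
voltage_from_high = track(q_table, start_state=len(VOLTAGES) - 1)
```

The coarse 2 V grid shows the limitation noted above for the tabular method: the tracker can only settle on one of the discretized voltages, which is what motivates replacing the table with a Q-network.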

Other Applications in the PV Field
In a few cases, ML algorithms have been used to improve the performance of concentrating PV (CPV) systems.
In particular, in [115] the authors studied a Random-Forest (RF) model for the temperature analysis of two different triple-junction solar cells mounted on an experimental CPV system. The cell temperature is a basic parameter for determining the energy production of a CPV/T system. An ANN model and an LRM were also studied, to compare them with the RF model in terms of absolute error and fit capability. The RF model, used to evaluate the performance of a CPV system from the electrical and thermal points of view, presents the lowest values of RMSE, MAE and MAPE: the RMSE is 1.95 °C, the MAE is 1.17 °C and the MAPE is 3.67%. These values are two to three times lower than those of the LRM and ANN models. It should also be noted that the ANN model shows better statistical results than the LRM, which suggests that a non-linear method is a better solution than a linear one for cell temperature evaluation. The good forecasting capability of the RF technique is confirmed by the goodness of fit (R2): the estimated values are 0.95, 0.79 and 0.76 for the RF, ANN and LRM models, respectively. Overall, the RF model is the best method both in terms of absolute error and fit capability.
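For reference, the four figures of merit used in this comparison can be computed as in the following sketch. It uses the standard textbook definitions, which are assumed here to match those of [115]; the sample values are made up and only serve to exercise the function.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (in %) and R2 for two equal-length sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined when a true value is zero; the caller must avoid that case.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


if __name__ == "__main__":
    measured = [10.0, 20.0, 30.0, 40.0]     # made-up cell temperatures in °C
    predicted = [12.0, 18.0, 33.0, 39.0]    # made-up model outputs in °C
    print(regression_metrics(measured, predicted))
```

RMSE and MAE carry the unit of the target (here °C), whereas MAPE and R2 are dimensionless, which is why the text reports the first two in °C and the last two as a percentage and a pure number.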
Another paper where ML algorithms are used in the field of CPV is [116]. The authors developed four machine-learning algorithms (support vector machine, ANN, kernel and nearest-neighbour (k-NN), and deep learning) to predict the power output of a CPV system. They concluded that all the machine learning algorithms used in the paper can successfully predict PV module output power. The SVM algorithm performed reasonably well throughout the day. The k-NN algorithm shows a prediction trend similar to SVM at the beginning of the observations, and it gives a better result than SVM especially in the initial and final observations. The ANN, like the SVM, is successful in predicting peak points. The DL approach, on the other hand, predicted a power output higher than the measured value; the higher deviation of the DL algorithm is probably related to the availability of data.
In [117] an RBFNN is used in the field of CPV, specifically to predict the output power of a high CPV (HCPV) facility. The RBFNN has been designed using MATLAB. Two coefficients have been used to verify the accuracy of the adopted solution: the RMSE and the R2. The results were compared to those obtained by the ASTM E-2527 model using the same dataset, and are reported separately for sunny and cloudy days. With the ASTM model the RMSE is 3.3 kW for sunny days and 4 kW for cloudy days. With the RBFNN the RMSE is 1.3 kW for sunny days and 2.24 kW for cloudy days. The R2 obtained with the ASTM model is 0.322 for sunny days and 0.339 for cloudy days.
Another application in the PV field where ML has been used to improve the performance of the system is PV/T hybrid systems. They consist of conventional thermal collectors with an absorber covered by a PV layer. The PV modules produce electricity and simultaneously the absorbed thermal energy is transported away by the working fluids.
In [118] different PV/T systems (conventional PV, water-based PV/T, water-nanofluid PV/T and nanofluid/nano-PCM) have been tested under the same conditions and environment using one ANN-based MLP system. The input parameters of the neural models were quantities such as solar irradiation and ambient temperature, whereas the output parameters were PV/T current (A), PV/T voltage (V), PV/T electrical efficiency (%) and PV/T thermal efficiency (%). The MLP was trained with the backpropagation algorithm using a momentum learning function. The proposed ANN approach showed that using nanofluid/nano-PCM enhanced the electrical efficiency from 8.07% to 13.32%, while the thermal efficiency reached 72%.
In [119] the authors examined the feasibility of several ML techniques to forecast the energetic performance of a building-integrated PV/T (BIPV/T) collector. In particular, they tested the following techniques: multiple linear regression (MLR), MLP, RBF regressor, sequential minimal optimization improved support vector machine (SMO Improved), lazy.IBK, RF and random tree (RT). Moreover, the study implements the performance evaluation criteria (PEC) to evaluate the system's performance from the perspective of exergy. The results for the testing dataset showed that the RF model is superior to the other proposed models, with an RMSE equal to 0.8153 compared with an RMSE of 18
The prediction of the thermal efficiency of PV/T setups is studied in [120], with respect to input temperature, recirculation flow rate and solar irradiation, by modifying MLP-ANN, ANFIS and least-squares SVM (LSSVM) approaches. An experimental dataset of 100 data points (empirical measurements performed on a fabricated water-cooled PV/T setup) has been used. Graphical and statistical methods were employed to assess how accurately the proposed models predict the thermal efficiency. The proposed ANN model provided the best performance compared to the ANFIS and LSSVM models, with MSE and R2 values of 0.009 and 1.00, respectively.
In [121] the ML methods of ANNs (MLP-ANN and RBF-ANN), LSSVM and ANFIS have been used to develop prediction models for the thermal performance of a PV/T solar collector. As input variables for the proposed models, the authors considered inlet temperature, flow rate, heat, solar radiation and the sun heat; the electrical efficiency yield was used as the output. The data set was obtained through experimental measurements on a novel solar collector system. Results show that the proposed LSSVM model outperforms the other models, with an R2 equal to 0.987 and an MSE equal to 0.004. Further, the sensitivity analysis in [121] demonstrates that the water inlet temperature has the highest relevancy factor and is therefore the parameter that most affects the efficiency of the PV/T system.
Finally, [122] presents a comparative study of ANN-based systems for predicting PV/T output power, considering studies and data sets published in the years 2008-2017. The results show that ANN models are the most suitable for the prediction of global solar radiation. The study offers a cheap and easy method for implementing PV models and for choosing a location that provides good system performance. Several models were used to simulate and measure the energy production of solar cells, including MLP, Bayesian NN (BNN), recurrent NN (RNN), Generalized Feed-Forward (GFF), SVM, self-organizing feature map (SOFM) and LSTM. To identify the best model, several statistical coefficients were adopted to validate the accuracy of the results, such as MAPE, MSE, RMSE, MBE, Mean Percentage Error (MPE) and R2. The comparative study can clarify the best implementation method and expose the weaknesses of any of the proposed models, based on the results of scientific verification and operation.
Furthermore, in [123,124] ML is used to improve the efficiency of control algorithms used in PV-storage systems. In particular, in [123] an algorithm using ML to effectively control PV-storage systems has been developed. The algorithm uses an offline policy planning stage and an online policy execution stage. In the planning stage, a suitable machine learning technique is used to generate models that map states (inputs) and decisions (outputs) using training data. In the execution stage, the model generated by the ML algorithm is then used to generate fast real-time decisions.
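The two-stage structure described for [123] can be sketched as follows. This is only an illustration of the offline-planning/online-execution split, not the algorithm of the paper: the state variables, the rule-based `oracle_decision` standing in for the slow offline optimiser, and the 1-nearest-neighbour lookup used as the learned model are all assumptions.

```python
import random


def oracle_decision(pv_kw, load_kw, soc):
    """Stand-in for a slow offline optimiser (hypothetical rule, for illustration)."""
    surplus = pv_kw - load_kw
    if surplus > 0 and soc < 0.95:
        return "charge"
    if surplus < 0 and soc > 0.10:
        return "discharge"
    return "idle"


def plan_offline(n_samples=2000, seed=0):
    """Planning stage: label sampled states with decisions to build training data."""
    rng = random.Random(seed)
    model = []
    for _ in range(n_samples):
        # State = (PV power in kW, load in kW, battery state of charge in 0..1).
        state = (rng.uniform(0.0, 5.0), rng.uniform(0.0, 5.0), rng.uniform(0.0, 1.0))
        model.append((state, oracle_decision(*state)))
    return model


def execute_online(model, state):
    """Execution stage: fast decision via a 1-nearest-neighbour lookup."""
    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], state))
    return min(model, key=squared_distance)[1]


if __name__ == "__main__":
    policy = plan_offline()
    print(execute_online(policy, (4.0, 1.0, 0.5)))   # PV surplus, battery half full
    print(execute_online(policy, (1.0, 4.0, 0.5)))   # PV deficit, battery half full
```

The point of the split is that the expensive step (here the oracle, in practice an optimisation) runs only during planning, while the online controller performs a cheap model query.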
In [124] the authors introduce a supervised ML approach to predict and schedule the real-time operation mode of the next operation interval for residential PV/battery systems controlled by mode-based controllers. The performance of the mode-based economic model-predictive control approach is used as the benchmark. The optimal operation mode for each control interval is first derived from the historical data used as the training set. Then, four ML algorithms (i.e., ANN, SVM, logistic regression and RF) are applied. Simulation results show that the ML approach can effectively improve the performance of the mode-based control system and reduce the computational effort of local controllers, because the training can be completed on a cloud-based ML engine.

Concluding Remarks and Future Trends
In this paper, a literature review of recent (from 2018 to 2021) applications of ML methods in many different fields of PV has been carried out. The fields covered in this discussion are forecasting of PV production, anomaly detection and fault analysis, MPP tracking, PV systems efficiency optimization, PV/T and CPV system design parameter optimization and efficiency improvement, and energy management of PV storage systems. In almost all the fields reported above, ML methods have proven to be an effective and reliable solution.
The field of forecasting PV production is, by far, the most investigated one, and many ML-based models have been proposed. Most research papers in this field focus on point forecasting, though in the last several years some papers have also evaluated probabilistic forecasting which, in the authors' opinion, is the most interesting because it provides the prediction interval associated with the point forecast. Due to the rise of DL and to the availability of ML frameworks such as TensorFlow or PyTorch, to cite a few, many DL models, mainly based on LSTM architectures, have proven to provide state-of-the-art accuracy. These LSTM architectures mostly use historical values of PV production as well as environmental features and some analogue ensemble techniques. For probabilistic forecasting, methods based on some variation of quantile regression are the most common. Regarding the forecasting horizon, the short-term horizon, from one hour to a few days, is the most investigated. In this field, some ensemble methods such as LGBM have shown promising results. Only a few papers have investigated techniques such as TF and AML, which could be very useful in future applications, especially real-time ones.
Regarding anomaly detection and fault analysis, the reviewed models employ electrical features and/or images (usually IFR or EL). Typically, SNNs, mainly MLP, RBF and ELMs, are employed in the first case while, as could be expected, DNNs, mainly CNNs, are more common for the detection of permanently visible faults. The models tested in the FDD field employ, in most cases, a simulated dataset, and the most frequent faults taken into consideration are short circuits and partial shading. Regarding the metrics used to evaluate models for FDD, only a few papers correctly employ a broader range of metrics (e.g., F1, balanced accuracy and MCC) in addition to the common accuracy, which is adequate only for a balanced dataset. Transfer Learning is a useful technique that will probably be adopted more and more in FDD, at least for models employing images as features. In this field it would be advisable to promote the sharing of public datasets (a common repository for images related to faults in PV, i.e., IFR and EL images). Ensemble models such as RF have seen only a few applications in the field of FDD but seem very promising.
For all the fields analyzed here, and more generally for research papers, it would be desirable to promote reproducible research results using container-based technologies such as Docker.
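The remark above on FDD evaluation metrics can be made concrete with a small sketch. It uses the standard binary-classification definitions of accuracy, balanced accuracy, F1 and MCC; the class proportions (95 healthy samples, 5 faulty ones) and the trivial "always healthy" classifier are made up purely to show why accuracy alone is misleading on an imbalanced dataset.

```python
import math


def fdd_metrics(y_true, y_pred):
    """Binary metrics with label 1 = faulty and label 0 = healthy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when a row or column of the confusion matrix is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "balanced_accuracy": (recall + specificity) / 2,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    labels = [0] * 95 + [1] * 5        # made-up, heavily imbalanced dataset
    always_healthy = [0] * 100         # classifier that never reports a fault
    print(fdd_metrics(labels, always_healthy))
```

In this made-up case the accuracy is high although no fault is ever detected, while balanced accuracy, F1 and MCC all expose the failure.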