A New Data-Based Dust Estimation Unit for PV Panels

Abstract: Solar photovoltaic (PV) generation plays a major role in the United Arab Emirates (UAE) smart grid infrastructure. However, one of the challenges facing PV-based energy systems is dust accumulation on solar panels, which results in a high degradation in the output power. The UAE has low-intensity rainfall and low wind velocity; therefore, solar panels must be cleaned manually or using automated cleaning methods. Estimating dust accumulation on solar panels will increase the output power and reduce maintenance costs by initiating cleaning actions only when required. In this paper, the impact of natural dust accumulation on solar panels is investigated using field measurements and regression modeling. Experimental data were collected under various real weather conditions and controlled levels of dust. Moreover, this paper proposes a data-driven approach based on machine learning to estimate the accumulated dust level on solar panels. In this approach, a dust estimation unit based on a regression tree model has been developed to estimate the dust accumulation. This unit is trained using experimental records of solar irradiance, ambient temperature, and the output power generated from solar panels, as well as the amount of dust at these conditions. The proposed unit is evaluated through different case studies with a random amount of dust applied to the solar panels to demonstrate its accuracy.

dust accumulation on solar photovoltaic panels and to collect the dataset required to develop the proposed dust estimation unit. The output power of the solar PV panel was measured at different times of the day under various real environmental conditions of solar irradiance, temperature, and dust levels through the conducted experiments. Second, the collected data from the experiments are used in training different regression models. The model with the least RMSE is utilized in developing the dust estimation unit. The regression model based on the fine tree algorithm provides the least RMSE. Therefore, the dust estimation unit is developed based on the fine tree algorithm. A threshold value is selected, and once the dust level exceeds this value, which leads to high degradation in the output power from solar panels, the cleaning procedure is initiated. Various case studies are performed to assess how accurately the proposed model predicts the amount of dust accumulation on solar panels. A comparison between the performance of the proposed unit and another unit based on ANN is presented. The results reveal that the proposed fine tree dust estimation unit is able to predict the dust level with a low false alarm rate and high performance. An analysis of the economic yield derived from the results of this paper will be the core of a future paper.

and M.F.S.; writing (review and editing), A.R.A.-A., A.H.O., M.M., M.F.S. and U.T.; visualization, M.M. and M.F.S.; supervision, M.F.S. and U.T.; project administration, M.F.S.; funding acquisition, A.R.A.-A., A.H.O., M.F.S.


Introduction
Thermal power plants have a great impact on global warming. Furthermore, the fossil fuels used in such plants have an increasing cost and are limited in nature. Therefore, the need for integrating renewable energy into power grids has become an inevitable issue worldwide. Various renewable resources can be integrated into power grids, for example solar power, wind, biomass, geothermal and fuel cells.
In the Middle East, local demand for oil and gas has increased rapidly, pushing countries to look for other sources to generate electricity. Renewable energy, and solar power technologies in particular, are currently considered a feasible alternative energy source in the Middle East and other countries around the world. The Middle East region has an appropriate climate and geographical location for producing the highest solar power generation in the world [1]. Therefore, solar power is the most commonly used renewable energy source in the region, and expanding the use of solar energy technology has first priority in the strategic energy plans of many countries in the region. The United Arab Emirates (UAE) is also eager to integrate solar energy into its electric power grids. Currently, different solar projects have been tendered in both Dubai and Abu Dhabi, with growth of solar energy expected in the coming years. The UAE aims to generate 30% of its power demand from renewable sources by 2030. Dubai aims to install 1000 MW of concentrating solar power (CSP) technology and 4000 MW of solar photovoltaic (PV) technology by 2030 [2]. The UAE has high solar irradiance throughout the year; the average global solar radiation ranges from 1900 to 2300 kWh/m² [3].
Photovoltaic panels are manufactured using semiconductor materials. The inherent properties of these materials affect the efficiency of photovoltaic systems. Therefore, the output power of solar panels depends on the materials used to manufacture the panels and the coating on the glass. Furthermore, the design of the power plant also affects the output of solar panels, for example through the orientation of the installed panels, the amount of sun exposure and the sun tracking systems. Figure 1 illustrates the factors affecting the yield of solar photovoltaic panels. In this context, there are many challenges facing the spread of solar energy in the world and particularly in the Middle East. The dust effect is the most salient factor. The dust accumulation rate in the UAE is high due to the geographic location of the country. Dust accumulation on photovoltaic panels leads to high degradation in the efficiency of the solar panel and hence a significant reduction in the output power. Dust accumulation is influenced by two groups of factors: environmental conditions and dust properties. Dust properties refer to the weight, shape, and size of the particle, while environmental conditions include weather conditions and geographical location [4]. Wind is one such environmental condition. High-speed wind can remove the dust settled on solar photovoltaic panels, while areas with low wind speed suffer from the worst degradation in solar photovoltaic output power due to the high accumulation of dust. The ambient temperature and humidity also affect dust settlement. Areas with high temperature and humidity suffer from higher dust accumulation on solar panels, as the dust becomes wet and sticks to the glass of the solar panels, reducing their efficiency. Consequently, cleaning the panels on a daily basis guarantees high efficiency of solar panels. However, this procedure is not economical and may waste resources when no cleaning is actually required [5]. Therefore, detecting the dust levels and initiating the necessary cleaning actions is vital to increase the yield of photovoltaic panels and to reduce maintenance cost. Note that dust estimation can be implemented using numerical methods, analysis of satellite images, or machine learning models.
The effect of dust accumulation on solar photovoltaic panels is location-dependent, as it depends on weather conditions and dust particle size. Moreover, the response of the output power of PV panels differs with location, dust properties, and environment. Therefore, the data and results of studies done in one country cannot be generalized to other countries. The methodology applied in this paper, however, is general: the same process can be used to detect dust accumulation at any location, even though the data and results obtained here cannot be transferred. Thus, detecting the amount of dust and initiating the necessary cleaning action when needed is vital to increase the yield of photovoltaic farms and to reduce maintenance cost.
In this paper, experiments are conducted to measure the PV panels' output power at various dust accumulation levels, temperatures and solar irradiance levels. Furthermore, the outcomes obtained from the experiments are utilized in developing the dust estimation unit based on machine learning. The main contributions of this paper are as follows:

• Study the impact of dust accumulation on the output power of solar photovoltaic panels experimentally under real environmental conditions in the UAE.
• Propose a dust estimation unit based on a regression tree that estimates the amount of dust accumulated on the solar photovoltaic panel to initiate the cleaning actions.
• The detector is developed using a field measurements dataset that includes the solar irradiance, the ambient temperature, and the PV panels' output power as the main predictors, in addition to the amount of dust as the target or response variable.
• The proposed detector is evaluated through different case studies, including premeasured amounts of dust as well as random amounts of dust applied on solar panels. Moreover, the performance of the proposed dust estimation unit is compared with another unit based on an Artificial Neural Network (ANN) to demonstrate the potential of the proposed unit.

The rest of the paper is organized as follows. Section 2 provides a background and literature review. Section 3 presents the proposed methodology, while the experimental setup is presented in Section 4. The results and the different cases used to verify the proposed model are discussed in Section 5. Finally, Section 6 includes the conclusion.


Related Work
The approaches presented in the literature to study the impact of the environment on the output of solar photovoltaic systems either conduct experiments or develop prediction models. In [6], a review study was conducted to survey the impact of dust, humidity and air velocity separately and as a group. The survey determined that the efficiency of solar panels drops significantly with fine particles compared to coarse particles. Furthermore, a larger tilt angle resulted in less dust accumulation; on the other hand, humidity resulted in more dust coagulation. In [7], experiments were implemented in which the pollutant type and weight were varied to illustrate their impact on the PV panels' output power. In [8], three different artificial pollutants were applied on the solar photovoltaic panel with different masses. The pollutants used in the experiments were red soil, limestone and carbonaceous fly-ash particles. The results showed that red soil, limestone and ash resulted in a reduction of 19%, 10% and 6%, respectively, in the output power of the solar panel compared to a clean panel. A similar study was conducted on three different artificial pollutants, namely red soil, sand and ash, in [9]. In [3], a 23-day study was carried out in the UAE to study the effectiveness of a self-cleaning coating material on solar panels. The results showed that panels with and without this coating material have almost the same performance. In [10], another study was conducted in the UAE to study the dust effect on a solar photovoltaic panel. Dust amounts of 0.0063 g/m² to 0.36 g/m² were distributed on solar panel modules. The results showed that the relation between the dust weight and the drop in photovoltaic output power was linear. The authors in [11] used satellite images and a support vector machine (SVM) model to predict the solar irradiance and cloud movement.
The work in [12] presented an auto-regression model to forecast the output power from solar panels for up to 36 h in Denmark. In [13], an ANN was used to predict the global solar irradiance from a set of inputs obtained from meteorological stations. The ANN provided a Root Mean Square Error (RMSE) of less than 20%. The authors in [14] presented a regression model based on a sigmoid function to estimate the hourly diffuse solar irradiation under all weather conditions, with the clearness index and relative optical mass as the predictors. The model provided a relative RMSE in the range of 25-35%. The work in [15] presented a model based on ANN to estimate the output power for different time horizons in different seasons in Turkey. A 750 W PV panel was installed, and the RMSE obtained for each time horizon and season was recorded. In [16], four different forecasting techniques, including ANN, were developed to estimate the PV panels' output power in California. A 1 MW solar panel field was used to forecast 1 h and 2 h ahead. The ANN offered the best performance, with an RMSE of 15% compared to errors of up to 20% for the other models. However, this research did not consider the different environmental conditions affecting the output power of the system. In [17], an experiment was conducted in Spain to study the impact of rainfall in cleaning photovoltaic modules. The study showed that the energy reduction due to dust accumulation in solar photovoltaic power plants was 20% in periods without rain, compared to only 4.4% in rainy periods. In [18], another study was implemented for 4 months in Morocco to estimate the dust accumulation from correlated inputs of solar photovoltaic output power and rainfall. The rainfall data were obtained from a meteorological data center.
In [19], a high dust rate and a low frequency and intensity of rain were considered to study the effect of dust accumulation on the performance of PV panels in the Jazan Region. It was found that regular dust accumulation reduced the PV efficiency by 10%, and that lower tilt angles caused higher dust accumulation than higher tilt angles. The authors in [20] presented an image processing-based system to estimate the amount of dust accumulated on the surface of PV panels, in which a small piece of plasticised paper was used as an indicator for the image processing. Most of the existing research focused on the effect of dust accumulation on PV panels. However, only a few studies focused on estimating the amount of dust accumulation in order to initiate suitable cleaning actions.

Problem Statement and Proposed Methodology
Accurate prediction of the accumulated dust on solar panels is a vital issue for investors and grid operators. The objective of this research is to develop a Dust Estimation Unit that estimates the amount of accumulated dust on solar panels in the UAE from the measured photovoltaic output power, solar irradiance, and temperature, as shown in Figure 2. Hence, cleaning procedures can be initiated if the dust accumulation is high. Machine learning is utilized to develop this detector. The first and vital step is to collect all the required data, so that a machine learning model can learn the behavior of PV panels. Then, the collected data are utilized to train different regression models and study the accuracy of each model. Finally, the accuracy of the proposed unit is assessed through further case studies with various dust levels. The details of each step are explained in the following subsections.

Data Preparation
Data preparation is the first and vital step in the model, as shown in Figure 2, where these data are fed to the regression models so that they learn how to predict the dust level. Regression is a type of machine learning that is commonly used in forecasting and estimation problems [21]. It is a statistical approach that models the relationship between one or multiple input variables, called predictors, and a single output variable, called the response variable. Data fed to this model should be arranged in a matrix that begins with the predictors and ends with the response variable. In this research, three variables are used as predictors of a single response variable. The three predictors are solar irradiance, ambient temperature and the output power from the PV panel at these conditions, while the dust level is the response variable. In order to create the required data set that includes these readings, experiments that will be explained later were conducted at different times of the day and over multiple days to collect a wide range of data. The output power of the solar photovoltaic panel was measured with different levels of natural dust while recording the ambient temperature and solar irradiance. A total of 4800 data points were collected and divided into two sets. One set is utilized in the training stage, while the other set is used in the testing stage.

Energies 2020, 13, 3601
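The data layout described above can be sketched as follows. This is only an illustration under stated assumptions: the rows are synthetic, the toy power formula is invented for the example and is not the measured panel behaviour, and the helper names `make_synthetic_rows` and `split_rows` are hypothetical. Each row begins with the three predictors and ends with the response, and the 4800 rows are split into the 4000/800 training and testing sets reported in the paper.

```python
import random

def make_synthetic_rows(n_instants, seed=0):
    """Build rows [irradiance, temperature, power, dust]; all values are made up."""
    rng = random.Random(seed)
    dust_levels = [round(0.1 * k, 1) for k in range(10)]  # 0.0 ... 0.9 g/m^2
    rows = []
    for _ in range(n_instants):
        irradiance = rng.uniform(77.7, 650.9)   # W/m^2, illustrative range
        temperature = rng.uniform(21.8, 45.0)   # degrees Celsius, illustrative range
        for dust in dust_levels:
            # Toy power model: an assumption for illustration, not measured data.
            power = 0.45 * irradiance * (1.0 - 0.5 * dust)
            rows.append([irradiance, temperature, power, dust])
    return rows

def split_rows(rows, n_train, seed=0):
    """Shuffle a copy of the rows and split it into training and testing sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

rows = make_synthetic_rows(480)          # 480 instants x 10 dust levels = 4800 rows
train, test = split_rows(rows, 4000)     # 4000 training rows, 800 testing rows

# Predictors come first in each row, the response (dust level) comes last.
X_train = [row[:3] for row in train]
y_train = [row[3] for row in train]
print(len(train), len(test))
```

Shuffling before the split keeps every dust level represented in both sets, which matters here because the rows are generated level by level.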

Model Training
The second stage is to train the model using the experimentally collected data, so that the model learns to predict the dust level from the PV panels' output power, the ambient temperature and the solar irradiance. The collected data (predictors, response) are divided into two sets. One data set is used to develop the model in the training stage, and the other is used to validate the accuracy of the model in the testing stage. The temperature, solar irradiance and output power from the PV panels, as the three predictors, and the dust level, as the response variable, are fed to the regression models. The regression models predict the dust level, and the predicted response is compared with the actual response. The regression models utilize 5-fold cross-validation in the model training, a technique that is widely used to estimate the prediction error of regression models. In this technique, the data used in the training stage are randomly divided into five groups. Then, one group is used as testing data and the remaining four groups are used as training data. This procedure is repeated five times with a different test group each time, so that each group is used exactly once to test the model. After the training stage, the error between the predicted and actual response (dust level) is determined. Various regression models are utilized in the training stage, and each model estimates the dust accumulation on solar panels with a different RMSE value. The regression model that provides the most accurate dust estimation, i.e., the minimum RMSE, is selected. After selecting this model, it can be tested on different case studies.
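The cross-validation procedure above can be sketched in a few lines of plain Python. This is a minimal sketch, not the toolchain used in the paper: the function names are hypothetical, the placeholder model simply predicts the mean of its training responses, and averaging the per-fold RMSE is one common convention that is assumed here.

```python
import math
import random

def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def k_fold_indices(n_samples, k=5, seed=0):
    """Randomly partition the sample indices into k disjoint groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validated_rmse(X, y, fit, k=5, seed=0):
    """Train on k-1 groups, test on the held-out group, repeat for every group."""
    folds = k_fold_indices(len(y), k, seed)
    scores = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        predictions = [model(X[i]) for i in held_out]
        scores.append(rmse([y[i] for i in held_out], predictions))
    return sum(scores) / k

def fit_mean_model(X_train, y_train):
    """Placeholder model (an assumption): always predicts the mean training response."""
    mean_response = sum(y_train) / len(y_train)
    return lambda x: mean_response

# Tiny made-up data set: one predictor, dust levels cycling through 0.0 ... 0.9.
X = [[float(i)] for i in range(20)]
y = [0.1 * (i % 10) for i in range(20)]
folds = k_fold_indices(len(y))
score = cross_validated_rmse(X, y, fit_mean_model)
print(f"5-fold RMSE of the placeholder model: {score:.3f}")
```

Any candidate regression model can be passed as `fit`, so the same loop can rank several models by their cross-validated RMSE.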

Regression Models
Different models are used in the training stage to decide the best model for the proposed dust estimation unit. The regression models presented in this work are linear regression, SVM, and Decision Tree (DT).

Linear Regression
Linear regression tries to find the best possible linear fit between the predictors and the target. The estimated dust level as a function of the three predictors is expressed by Equation (1) [22], where ŷ is the predicted dust level; x = [x_1 x_2 x_3]^T is a vector containing the temperature, irradiance, and PV output power as the three predictors; and β represents the linear model coefficients to be determined. The linear model coefficients are determined such that the squared error between the actual and predicted dust levels is minimized, as written in Equation (2), where y is the actual dust level.
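Because the typeset equations are not reproduced in this text, a conventional least-squares formulation that matches the symbol definitions above is sketched below. The intercept β_0 and the sample index n = 1, ..., N are assumptions introduced for the sketch; the original equations may be written differently.

```latex
% Predicted dust level as a linear function of the three predictors
% (\beta_0 is an assumed intercept term).
\begin{equation}
  \hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3
          = \beta_0 + \boldsymbol{\beta}^{T}\mathbf{x}
\end{equation}

% Least-squares fit over N training points (y_n actual, \hat{y}_n predicted).
\begin{equation}
  \min_{\beta_0,\,\boldsymbol{\beta}} \; \sum_{n=1}^{N} \left( y_n - \hat{y}_n \right)^{2}
\end{equation}
```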

SVM Regression
In SVM regression, a line or hyperplane that fits the data is obtained by the SVM. The difference between the objective functions of linear regression and SVM regression is that, in SVM regression, the coefficients are determined by minimizing the squared norm of the coefficients rather than the squared error, subject to a constraint on the prediction error. The predicted dust levels, the objective function, and the error constraint can be written using Equations (1), (3), and (4), respectively [23].
If no function exists that satisfies these constraints for all points, a slack variable, ε_n, is defined for each point, and the objective function and its constraints are rewritten to include the slack terms, where N is the number of experimental data points and C is a regularization parameter.
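The typeset equations are missing from this text, so the block below gives the conventional ε-insensitive SVM regression formulation as a hedged reconstruction. The intercept β_0, the margin symbol ε, and the slack symbols ξ_n and ξ_n^* are assumptions; the text above denotes the slack variable ε_n, and ξ is used here only to keep it distinct from the margin.

```latex
% Hard-margin form: minimise the squared coefficient norm subject to
% every residual staying within the margin \varepsilon.
\begin{align}
  J(\boldsymbol{\beta}) &= \tfrac{1}{2}\,\boldsymbol{\beta}^{T}\boldsymbol{\beta} \\
  \text{subject to: } & \left| y_n - \left( \beta_0 + \boldsymbol{\beta}^{T}\mathbf{x}_n \right) \right| \le \varepsilon
  \quad \forall n
\end{align}

% Soft-margin form with slack variables and regularization parameter C.
\begin{align}
  J(\boldsymbol{\beta}) &= \tfrac{1}{2}\,\boldsymbol{\beta}^{T}\boldsymbol{\beta}
      + C \sum_{n=1}^{N} \left( \xi_n + \xi_n^{*} \right) \\
  \text{subject to: } & y_n - \left( \beta_0 + \boldsymbol{\beta}^{T}\mathbf{x}_n \right) \le \varepsilon + \xi_n, \\
  & \left( \beta_0 + \boldsymbol{\beta}^{T}\mathbf{x}_n \right) - y_n \le \varepsilon + \xi_n^{*}, \\
  & \xi_n \ge 0, \quad \xi_n^{*} \ge 0 \quad \forall n
\end{align}
```

A larger C penalizes points outside the margin more heavily, so the fit follows the training data more closely at the cost of a less flat function.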

DT
A DT can solve both classification and regression problems. The prediction space is split into non-overlapping regions, as shown in Figure 3. Various techniques are utilized to construct the DT and determine the number of regions, such as classification and regression tree (CART) and Iterative Dichotomiser 3 (ID3). These regions are called leaves if they are not further subdivided into other regions; otherwise they are called nodes. For each region, the predicted dust level is the average of the dust levels of all training points that fall in this region, as illustrated in Figure 3. Only two predictors are shown for simplicity of visualization, but the DT model was applied using all three predictors. The DT finds which predictor among all predictors is the most predictive of the response variable and uses this predictor as the root of the tree. At each subsequent node, the DT asks a question related to one predictor to decide which branch to follow. This procedure continues until a leaf at the end of the tree is reached [24].
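To make the region-splitting idea concrete, a minimal CART-style regression tree is sketched below in plain Python. It is an illustration under stated assumptions, not the fine tree implementation used in the paper: splits are chosen greedily to minimise the summed squared error of the two children, each leaf predicts the mean of its training responses, and the function names and the tiny data set are made up.

```python
def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the mean, used as the split criterion."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(X, y, min_leaf):
    """Return the (feature, threshold) pair that most reduces the SSE, or None."""
    best, best_cost = None, sse(y)
    for feature in range(len(X[0])):
        values = sorted(set(row[feature] for row in X))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [t for row, t in zip(X, y) if row[feature] <= threshold]
            right = [t for row, t in zip(X, y) if row[feature] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse(left) + sse(right)
            if cost < best_cost:
                best, best_cost = (feature, threshold), cost
    return best

def build_tree(X, y, min_leaf=1, depth=0, max_depth=10):
    """Recursively split the predictor space; unsplit regions become leaves."""
    split = best_split(X, y, min_leaf) if depth < max_depth else None
    if split is None:
        return {"leaf": True, "value": mean(y)}  # leaf predicts the regional mean
    feature, threshold = split
    left = [(row, t) for row, t in zip(X, y) if row[feature] <= threshold]
    right = [(row, t) for row, t in zip(X, y) if row[feature] > threshold]
    return {
        "leaf": False,
        "feature": feature,
        "threshold": threshold,
        "left": build_tree([r for r, _ in left], [t for _, t in left],
                           min_leaf, depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [t for _, t in right],
                            min_leaf, depth + 1, max_depth),
    }

def predict(tree, row):
    """Answer one question per node until a leaf is reached."""
    while not tree["leaf"]:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]

# Made-up rows: [irradiance W/m^2, temperature degC, power W] -> dust g/m^2.
X = [[200.0, 25.0, 90.0], [200.0, 25.0, 80.0], [200.0, 25.0, 70.0],
     [600.0, 40.0, 270.0], [600.0, 40.0, 240.0], [600.0, 40.0, 210.0]]
y = [0.0, 0.3, 0.6, 0.0, 0.3, 0.6]
tree = build_tree(X, y)
print(predict(tree, [600.0, 40.0, 238.0]))
```

A small `min_leaf` produces many small regions, which is the behaviour usually associated with a "fine" tree; larger values give coarser trees that generalise more smoothly.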

Dust Estimation Unit
After selecting the regression model that achieves the lowest RMSE, this model is extracted. It is then fed with the three predictors, namely the ambient temperature, the solar irradiance, and the PV panels' output power, and predicts the accumulated dust level on the solar panel. Accurate prediction of dust is essential for system operators to determine the optimal time to clean solar photovoltaic modules. A threshold value can be selected for the accumulated dust level at which the energy produced by the solar PV system degrades significantly. Hence, cleaning procedures can be initiated once the dust accumulation level exceeds this threshold value. Cleaning at the optimal time avoids the unnecessary costs of continuously cleaning the solar photovoltaic modules. The final stage is testing the dust estimation unit using a data set that the model has not seen before, in order to avoid any bias and accurately evaluate the performance of the model. The experimental data points held out from the original experimental data set and not used in the training stage are used in the testing of the proposed unit.
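The threshold logic of the unit can be sketched as below. Everything specific in this example is an assumption: `toy_model` is an invented stand-in for the trained regression tree, its formula does not come from the paper, and the threshold of 0.5 g/m² is a hypothetical value chosen only to show the decision step.

```python
def toy_model(predictors):
    """Stand-in for the trained regression model; the formula is invented."""
    irradiance, temperature, power = predictors
    expected_clean_power = 0.45 * irradiance      # made-up clean-panel output, W
    loss_fraction = max(0.0, 1.0 - power / expected_clean_power)
    return min(0.9, 2.0 * loss_fraction)          # estimated dust level, g/m^2

def dust_estimation_unit(model, irradiance, temperature, power, threshold):
    """Estimate the dust level and flag cleaning when it exceeds the threshold."""
    estimated_dust = model([irradiance, temperature, power])
    return estimated_dust, estimated_dust > threshold

CLEANING_THRESHOLD = 0.5  # g/m^2, hypothetical value for illustration

clean_panel = dust_estimation_unit(toy_model, 600.0, 40.0, 270.0, CLEANING_THRESHOLD)
dusty_panel = dust_estimation_unit(toy_model, 600.0, 40.0, 162.0, CLEANING_THRESHOLD)
print(clean_panel, dusty_panel)
```

Keeping the model as a parameter means the decision step stays the same whichever regression model is selected during training.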

Experimental Setup
A wide range of data is essential for training and testing the model, as mentioned before. Experiments were conducted outdoors on a solar photovoltaic panel under real environmental conditions, and the effect of dust on the performance of the panel was studied under these conditions. The experimental setup, consisting of a 400 W solar photovoltaic panel, was implemented in Dubai (at latitude 25°27′42.52″ N and longitude 55°40′44.06″ E). The specifications of the solar panel used in the experiment are shown in Table 1. The dust used in the experiment was collected from the same location as the experiment. Dubai was selected for the experiment as it has the largest single-site solar park in the world. The output power of the solar PV panel was measured at different times of the day under various environmental conditions of solar irradiance, temperature, and dust levels. The irradiance was measured using the RS PRO Solar Power Meter ISM400. The best accuracy the meter can provide is ±5 W/m², with a resolution of 0.1 W/m². The maximum irradiance that can be detected by this meter is 2000 W/m² [25]. A high-precision electronic scale with a precision of 50/0.001 g and an error range of ±0.003 g was used to weigh the amounts of dust used in the experiments. First, a dust amount of 0.1 g/m² was evenly distributed on the solar PV panel using a thin brush. Then, the ambient temperature, solar irradiance, and the output power from the solar panel were measured. The same experiment was repeated for dust amounts of 0.2 g/m² up to 0.9 g/m² in incremental steps of 0.1 g/m². Hence, for each instant of time, 10 data points were collected: 1 point with no dust and 9 points with various dust levels. The experiment was conducted at different times of the day for a period of 2 months (September and October).
After conducting the experiment, 4800 data points were available to be fed to the regression model for training and testing the proposed dust estimation unit. The data set was divided randomly into 4000 data points and 800 data points. The 4000 data points are utilized in the training stage and divided using 5-fold cross-validation into 4 data groups for training and 1 data group for testing, as mentioned before, whereas the remaining 800 data points are used in the testing stage to evaluate the potential of the proposed dust estimation unit. The training set has more points than the testing set to ensure that the model is more general and is trained against different conditions.

Results and Discussions
In this section, the results from different scenarios are discussed to evaluate the performance of the proposed dust estimation unit. Experiments were conducted under real environmental conditions to collect enough data for training and testing different regression models that predict the accumulated dust levels on solar photovoltaic systems. The collected data are divided into two parts: one for training, which contains 4000 data points (to choose and train the appropriate model), and another for testing, which contains 800 data points. A sample of the data collected experimentally under real environmental conditions is shown in Table 2. The experiments were implemented in different time periods and on different days to construct a database covering various ranges of environmental conditions. The measured irradiance values were between 77.7 W/m² and 650.9 W/m², while the temperature values were between 21.8 °C and 45 °C. The output power of the solar photovoltaic panel was measured and recorded with dust levels between 0 g/m² and 0.9 g/m² in steps of 0.1 g/m². The measured output power was between 2.6 W and 313 W across the different dust levels, solar irradiance values and ambient temperatures. The regression models are trained to predict the response variable from the input predictors, i.e., to predict dust accumulation levels from the experimentally measured values of temperature, solar irradiance, and output power from the solar panel. All 4000 data points from the field experiment used in the training stage are rearranged based on the dust weight, as illustrated in Figure 4; i.e., the first part of the data, points 0 to 399, consists of different combinations of PV output power, temperature, and solar irradiance, but all of these points are measured with a dust level of 0. Various regression models are trained using these data.
The first regression model used to estimate the accumulated dust level on the solar photovoltaic panel is linear regression, since the relation between dust accumulation and solar photovoltaic output power in the UAE is linear according to [10,26]. Four linear regression models are developed and utilized for dust estimation. The RMSE of the linear regression models is high, as shown in Table 3, because the linear regression model is suitable for estimating dust levels of less than 0.34 g/m² under controlled indoor environmental conditions [10]. For high amounts of dust accumulation under real environmental conditions, non-linear regression models must therefore be investigated. Table 3 shows samples of the various regression models used in training the dust estimation unit and the corresponding RMSE. Among all the nonlinear regression models used, Fine Tree Regression provides the best dust level estimation. The developed fine tree regression model provides an accurate dust prediction with RMSE = 0.026737 g/m². This error is 53.13% lower than the lowest error obtained by the other regression models (Exponential GPR = 0.057048). Figure 5 shows the original data set and the dust level predicted by the fine tree algorithm, illustrating the response of the fine tree regression model. The majority of the dust accumulation points predicted by the fine tree model are located around the horizontal line that represents zero or very close to zero error between the actual and predicted dust levels, except for a few measured points, as shown in Figure 6. Moreover, the maximum error obtained for dust levels of less than 0.6 g/m² is 0.35 g/m², whereas for dust levels of more than 0.6 g/m² the maximum prediction error is 0.2 g/m². Therefore, the model is reliable for estimating high dust accumulation, which is more vital in solar power plants.
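The 53.13% figure follows directly from the two reported RMSE values, as the short check below shows. The helper name `relative_reduction` is hypothetical; only the two RMSE values are taken from the text.

```python
def relative_reduction(reference_error, new_error):
    """Fraction by which new_error is lower than reference_error."""
    return (reference_error - new_error) / reference_error

fine_tree_rmse = 0.026737        # reported for the fine tree model
exponential_gpr_rmse = 0.057048  # reported for the next-best model

reduction_percent = 100.0 * relative_reduction(exponential_gpr_rmse, fine_tree_rmse)
print(f"{reduction_percent:.2f}%")  # prints 53.13%
```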

Case 1: Premeasured Dust Levels
Two case studies are conducted to assess the performance of the dust estimation unit based on the fine tree regression model. The first case study uses premeasured dust levels for 800 data points set aside from the original data set. The amount of dust used in this case study is between 0 g/m² and 0.9 g/m², in steps of 0.1 g/m². The three predictors (ambient temperature, solar irradiance, and PV panel output power) are fed to the dust estimation unit to predict the accumulated dust level on the solar panel. The predicted values are compared to the original dust levels obtained from the experiment to determine the error, as shown in Figure 7. Table 4 shows samples of the experimental dataset used in Case 1, together with the dust level predicted by the proposed unit. The error between the actual and predicted dust levels is higher for small amounts of accumulated dust than for large amounts. Moreover, the highest error between the measured and estimated dust levels is 0.3 g/m². Hence, the model provides accurate dust estimation, particularly at high dust accumulation, which is vital for the operation of solar plants.
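The error comparison described in this case study can be sketched as a per-level summary of the largest absolute error. The function name and the six sample values below are hypothetical and are not taken from Table 4.

```python
from collections import defaultdict


def max_abs_error_by_level(actual, predicted):
    """Largest |actual - predicted| observed at each actual dust level."""
    worst = defaultdict(float)
    for a, p in zip(actual, predicted):
        worst[a] = max(worst[a], abs(a - p))
    return dict(worst)


# Hypothetical measured and predicted dust levels in g/m^2.
actual = [0.0, 0.0, 0.3, 0.3, 0.9, 0.9]
predicted = [0.0, 0.1, 0.3, 0.5, 0.9, 0.9]

errors = max_abs_error_by_level(actual, predicted)
for level in sorted(errors):
    print(f"dust {level:.1f} g/m^2: max error {errors[level]:.2f} g/m^2")
```

Grouping by the premeasured level makes it easy to see whether low or high dust levels carry the larger error.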

Case 2: Random Dust Levels
Another case study is conducted to evaluate the accuracy of the proposed dust estimation unit based on the fine tree regression model. Experiments were performed again to measure the output power of the solar panels under different solar irradiance, ambient temperatures, and random dust accumulation amounts. In this case, random dust amounts were spread uniformly on the solar photovoltaic panel. The output power of the solar photovoltaic panel, the solar irradiance, and the ambient temperature were measured at each instant. A total of 35 different data points were collected. The experiments were conducted outdoors under real environmental conditions. A sample of the data collected from the experiments is given in Table 5. The collected data are fed into the proposed dust estimation unit to estimate the dust levels on the solar photovoltaic panel. The inputs to the proposed unit are the three predictors: ambient temperature, solar irradiance, and solar photovoltaic output power. The predicted and measured dust levels are shown in Figure 8. For high dust levels, the error is between 0.01 g/m² and 0.02 g/m². For low amounts of dust accumulation, the error is between 0.01 g/m² and 0.06 g/m². Estimating high amounts of dust is more important because of their greater impact on the solar photovoltaic output power compared to low dust accumulation. The results reveal the ability of the proposed dust estimation unit to detect the dust level with low error.

Performance Evaluation of the Proposed System
To assess the performance of the proposed dust estimation unit, a dust level of 0.6 g/m² was selected as the threshold (decision) value for initiating cleaning actions. This level was selected to represent high accumulation: at 0.6 g/m², the power is degraded by more than 60% compared to the zero-dust case under the same environmental conditions. Fifty data points with dust levels less than 0.6 g/m² and 50 data points with dust levels greater than or equal to 0.6 g/m², all collected from experiments, are fed to the proposed unit to evaluate its performance. The proposed unit determines the need for cleaning actions by comparing the predicted dust level to the threshold value. The following performance parameters are determined to illustrate the potential of the proposed dust estimation unit.
$$\text{Sensitivity (Detection Rate)} = \frac{TP}{TP + FN}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{8}$$

$$\text{Negative Predictive Value (NPV)} = \frac{TN}{TN + FN} \tag{10}$$

$$\text{False Alarm (FA)} = 1 - \text{Specificity} \tag{12}$$

where TP is the number of true positives, which means there is a need for cleaning and the proposed unit decides to start cleaning actions; FN is the number of false negatives, which means there is a need for cleaning and the proposed unit decides on no cleaning actions; TN is the number of true negatives, which means there is no need for cleaning and the proposed unit decides on no cleaning actions; and FP is the number of false positives, which means there is no need for cleaning and the proposed unit decides to start cleaning actions.

The performance of the proposed dust estimation unit based on the fine tree regression model is compared with another detector based on an ANN to demonstrate the potential of the proposed unit. We used an ANN as it is a popular prediction model in many approaches. We used a three-layer feed-forward network to predict the relationship between the target variable, dust level, and the three predictors: temperature, irradiance, and output power. There are many variants of the backpropagation algorithm for training neural networks; the Bayesian Regularization algorithm is one such method. Although it takes more time, it is characterized by good generalization for difficult, small, or noisy datasets. We chose the other parameters of the network (number of hidden nodes, etc.) using a validation set. The ANN-based detector provides an accurate estimation, with an RMSE of 0.0592. Table 6 gives the values of the performance parameters for both detectors. The results reveal the potential of the proposed unit to initiate cleaning actions only when required, with a low false alarm rate.
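The cleaning decision and the performance parameters defined in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, a dust level greater than or equal to the 0.6 g/m² threshold is treated as requiring cleaning, and the six sample values are invented for the example rather than taken from the 100 evaluation points.

```python
THRESHOLD = 0.6  # g/m^2, cleaning threshold stated in the text


def needs_cleaning(dust_level, threshold=THRESHOLD):
    """Assumed rule: cleaning is needed at or above the threshold."""
    return dust_level >= threshold


def confusion_counts(actual, predicted, threshold=THRESHOLD):
    """Count TP, FN, TN, FP for the cleaning decision."""
    tp = fn = tn = fp = 0
    for a, p in zip(actual, predicted):
        truth = needs_cleaning(a, threshold)
        decision = needs_cleaning(p, threshold)
        if truth and decision:
            tp += 1
        elif truth:
            fn += 1
        elif decision:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


def performance(tp, fn, tn, fp):
    """Sensitivity, specificity, NPV and false alarm rate as defined above."""
    specificity = tn / (tn + fp)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": specificity,
        "npv": tn / (tn + fn),
        "false_alarm": 1 - specificity,
    }


# Hypothetical measured and predicted dust levels in g/m^2.
actual = [0.2, 0.5, 0.6, 0.9, 0.7, 0.1]
predicted = [0.2, 0.6, 0.6, 0.9, 0.5, 0.1]

counts = confusion_counts(actual, predicted)
metrics = performance(*counts)
print(counts, metrics)
```

A missed cleaning (FN) lowers the sensitivity and the NPV, while an unnecessary cleaning (FP) raises the false alarm rate.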

Conclusions
This paper presents a dust estimation unit that estimates the dust accumulation level on solar panels and, hence, initiates cleaning actions. First, experiments were conducted under real environmental conditions to investigate the effect of dust accumulation on solar photovoltaic panels and to collect the dataset required to develop the proposed dust estimation unit. The output power of the solar PV panel was measured at different times of the day under various real conditions of solar irradiance, temperature, and dust level. Second, the data collected from the experiments were used to train different regression models. The regression model based on the fine tree algorithm gave the lowest RMSE, so the dust estimation unit was developed based on this algorithm. A threshold value is selected, and once the dust level exceeds this value, which corresponds to high degradation in the output power of the solar panels, the cleaning procedure is initiated. Various case studies were performed to assess the accuracy of the proposed model in predicting the amount of dust accumulated on solar panels. A comparison between the performance of the proposed unit and that of another unit based on an ANN is presented. The results reveal that the proposed fine tree dust estimation unit is able to predict the dust level with a low false alarm rate and high performance. An analysis of the economic yield derived from the results obtained in this paper will be the core of a future paper.