1. Introduction
Renewable energy sources such as solar energy are important for addressing many of the environmental problems caused by harmful practices such as coal mining, fracking, and nuclear fission [1]. Solar has become one of the fastest-growing renewable energy sources. It offers a good response to the problem of diminishing, limited resources. It is collected from our most abundant resource, the sun, which means that solar will remain a sustainable energy option for as long as the sun exists. With decreasing prices, increasing demand, and environmental benefits, solar energy has become a viable alternative to traditional forms of energy. According to the Solar Energy Industries Association (SEIA), the cost to install solar has dropped by more than 70% over the last decade, leading the industry to expand into new markets and deploy thousands of systems nationwide [2]. Many federal and state-level incentives are available to homeowners who wish to switch to solar energy. For example, the federal residential solar energy credit is a tax credit that gives consumers an incentive by deducting a percentage of the cost of a solar photovoltaic (PV) system from their federal taxes. The credit applies dollar for dollar, which means that a user who claims a USD 1000 federal tax credit reduces their federal taxes by USD 1000. At the state level, many states have their own incentive programs. For example, the State of Illinois (IL) is committed to producing 25 percent of its electricity from renewable resources by 2025. To reach this ambitious goal, the state created an incentive program called Illinois Shines to support the development of new solar energy generation in the state. The Illinois Shines program offers many incentives to help middle-class families go solar. As a result, now is a favorable time for residential homeowners to switch to solar energy.
Solar panel location assessment is usually a laborious process. Automating it is important because the results can be used to prioritize buildings according to how likely they are to benefit from switching to solar energy. Many criteria should be taken into consideration before deciding to switch to solar energy. For example, the surrounding environment, roof surface area, and slope could affect the house’s solar potential [3]. Modeling the built environment and the building location is essential in evaluating a building’s potential for rooftop solar photovoltaic (PV) energy generation.
Advances in convolutional neural network (CNN) approaches have facilitated automated information extraction techniques from image data. Deep learning, a branch of the wider machine learning research field, is a modern approach to data classification, which has shown recent advancements in unsupervised image classification, in some cases producing results with higher accuracy than humans [4].
In this paper, an automatic assessment of photovoltaic system locations is conducted to fill a gap in the available information about the potential of such systems. This research presents a CNN-based model that can automatically assess the potential of a building’s location to harvest solar energy.
2. Literature Review
Many researchers have used convolutional neural networks (CNNs) to optimize energy consumption. Abadi et al. (2016) developed a CNN model to determine daylight illuminance for office buildings. A three-layer feed-forward CNN model was constructed with one output variable (illuminance) and, as input variables, two time variables, five weather determinants, and six building parameters. A sensitivity analysis was performed on the model to determine the effect of each input variable on the output variable. From the tests, the authors concluded that the model gives satisfactory predictions of daylight illuminance and offers a less time-consuming way to provide feedback information for existing buildings [5].
Jeffrey Ignatius Kindangen indicates that neural networks can be used to predict the effects of openings such as windows on interior air motion. Like the other authors reviewed here, Kindangen used a model with input, hidden, and output layers to predict the effects of window size and location on interior air motion. A total of 11 parameters were considered; the inputs included the wind direction and the openings of the building, and the output was the velocity coefficient. A comparison of the numerical simulation results with the CNN results showed that the CNN results were reliable, indicating that neural networks are well suited to predicting the effects of opening configuration on interior air motion. The author mentions that further study is needed to predict more architectural parameters [6].
Mubiru and Banda (2007) conducted a study to explore the possibility of developing a prediction model using convolutional neural networks (CNN) to estimate the monthly average daily global solar irradiation on a horizontal surface from latitude, longitude, and altitude parameters. A CNN model was created in which the output layer used linear transfer functions and the hidden layers used tangent sigmoid and log sigmoid functions. This CNN model gave the lowest mean absolute percentage error values, making it superior to the empirical model, as it can reliably capture the nonlinear nature of solar radiation [7].
Tam et al. (2003) used a CNN model to investigate and analyze the relationship between key storage areas and the tower crane. A genetic algorithm (GA) was used to determine the locations of the tower cranes by analyzing the transportation times and costs. The authors evaluated the GA-CNN model using practical examples. They used a multilayer feed-forward network with input, output, and hidden layers. More than one hidden layer resulted in more than one output layer with different views of the data for better prediction. These CNN prediction models were applied to construct a GA model for optimizing the locations of a generic tower crane and supply points. The results of the example were very promising and demonstrated the application value of the models [8].
Rehman et al. conducted a study to estimate the global solar radiation (GSR) from air temperature and relative humidity using the CNN method. Three feed-forward CNNs were trained. The first had only the day of the year and the daily maximum temperature as inputs and GSR as output. In the second, the inputs were the day of the year and the daily mean temperature, and the output was GSR. In the third, GSR was predicted using the day of the year, the daily average temperature, and the relative humidity. After the tests, the authors concluded that neural networks can estimate GSR from temperature and relative humidity. The third case outperformed the other two, with a mean percentage error of 4.49%. The method can predict GSR for a given location; its only limitation is that it depends on the availability of data such as temperature and humidity. The authors suggested that meteorological parameters should be considered in future studies for diffuse and normal incident solar radiation on horizontal surfaces [9].
Maja et al. (2007) determined location in an indoor environment based on received signal strength (RSS) fingerprinting using CNN. A multilayer perceptron feed-forward CNN was used, with three inputs (the RSS from three different access points), a single hidden layer, and two outputs representing the two-dimensional location (x,y). The authors used the CNN as a pattern-matching algorithm for the proposed system, which offered the advantage of flexible modeling and generalization capabilities that perform well on unknown test data. There were no convergence or stability problems, as the training was performed off-line. The accuracy in determining location was promising, with an error of 1.79 m for the proposed system [10].
Chen et al. (2003) conducted a study using CNN to predict the pressure coefficients on roofs of low buildings. Feed-forward three-layer networks trained with the back propagation algorithm were employed. The networks take a large number of variables, such as wind direction, roof height, and normalized roof coordinates, and predict pressure statistics with good accuracy for any combination of these variables. The CNN model was trained with wind tunnel experimental data from generic buildings of similar plan dimensions but varying roof heights. The results show that CNN is capable of generalizing functional relationships with a large number of variables. The authors concluded that the CNN approach could be used to expand aerodynamic databases to a larger variety of geometries and increase their practical feasibility [11].
Vincenzo and Infield (2010) conducted a study to predict the output power from a photovoltaic array in the case of partial shading using CNN. A two-layer feed-forward network was used. In the first test, the inputs were selected from simulation results, and five hidden neurons were needed to obtain precise results; in the second test, the inputs were the same as in the first, but there were only four neurons in the hidden layer. The first network gave a good approximation but failed in the case of partial shading. The second network worked well for both partial shading and uniform radiation, although it failed under fast irradiance changes because of the delay between the measurements [12].
Mekki et al. (2016) used CNN to assess the output photovoltaic current and voltage under partially shaded conditions. The authors measured the data of a PV module installed on a rooftop, and the experimental setup was run under real conditions for several days to record the meteorological data and the corresponding electrical output. The CNN model used was a multilayer perceptron with inputs such as temperature and irradiance and with current and voltage as outputs. Several shading configurations were investigated to check the efficiency of the proposed method. The results show that the designed method can accurately estimate the output and detect any decrease in the output power. The authors also concluded that the designed CNN model does not require any complex calculations or mathematical models, though it must be trained to achieve good accuracy, and that the proposed method can be implemented on a low-cost microcontroller for real-time applications. Future study was recommended for large-scale PV plants [13].
Sun et al. (2018) used convolutional neural networks (CNN) to relate PV output to contemporaneous images of the sky (a “now-cast”). The CNN achieved test-set relative root-mean-square error (rRMSE) values of 26.0% to 30.2% when applied to power outputs from two solar PV systems. The study investigated the sensitivity of the model to a range of CNN structures, with different widths, depths, and input image resolutions [14].
House et al. (2018) conducted research to propose a method of automating lead generation by applying deep neural networks (DNNs). A semantic segmentation network (SegNet) is utilized, with a database of satellite images and corresponding pixel label images. The SegNet is used to identify whether buildings have pre-existing solar installations from satellite imagery, using a cascaded convolutional neural network (CNN). Transfer learning on the CNN is used to classify roofs of buildings into two categories: having solar PV installed and not having solar PV installed [15].
Huang et al. (2019) suggested a deep neural network model named PVPNet to forecast PV system output power. The model takes meteorological information, such as temperature and solar radiation, together with historical PV system output data as inputs and generates 24 h probabilistic and deterministic forecasts of PV power. Mean absolute error (MAE) and root mean square error (RMSE) are used to quantify the forecasting accuracy [16].
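Since MAE and RMSE recur as evaluation metrics throughout the studies reviewed here, a minimal Python sketch of both may be helpful. It follows the standard textbook definitions; the function names and the sample values in the usage note are illustrative assumptions and are not taken from any of the cited papers.

```python
import math

def mae(predicted, observed):
    """Mean absolute error: the average of |prediction - observation|."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)

def rmse(predicted, observed):
    """Root mean square error: the square root of the mean squared difference."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))
```

For example, `mae([1, 2, 3], [2, 2, 5])` averages the absolute errors 1, 0, and 2, while `rmse` on the same inputs penalizes the largest error more heavily because the differences are squared before averaging.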
Li et al. (2020) proposed the use of convolutional neural network (CNN) and multilayer perceptron (MLP) neural network models to predict the IV curve of photovoltaic modules. The results of the study indicated that the prediction accuracy of the CNN and MLP neural network models is substantially better than that of the conventional equivalent circuit models [17].
Nie et al. (2020) proposed a two-stage classification–prediction model using two different classifiers to predict current PV power output from sky images (a so-called “nowcast”) and compared it with an end-to-end convolutional neural network (CNN). The first step is classification, in which input images are classified based on different sky conditions using a CNN-based classifier trained on clear sky index (CSI)-labeled sky images. In the second step, the classified images are taken as input for a sub-model to accurately predict PV power output using a physics-based non-parametric classifier based on a threshold of fractional cloudiness of sky images. Additionally, different numbers of classification categories are considered and examined. The results suggest that cloud-based classifiers are more effective than CSI-based classifiers for the framework. The three-class classification (i.e., sunny, cloudy, overcast) is the pragmatic choice [18].
Malof et al. (2017) developed CNN-based algorithms that automatically identify small-scale solar photovoltaic arrays in high-resolution aerial images. These algorithms can offer a quick and inexpensive way to collect information about small-scale photovoltaic (PV) arrays, such as their location and energy production. The results indicated that, in comparison to previous results, the CNN gave promising results and improved performance [19].
Aurangzeb et al. (2019) proposed an effective framework by applying a multiheaded convolutional neural network (CNN) model to accurately predict energy produced by renewable energy resources (RERs). They incorporated an energy storage system (ESS) and RERs with smart homes. After simulation, the results were promising, and with the proposed framework consumers’ energy bills were drastically reduced [20].
Millet et al. (2010) proposed a multilayer perceptron (MLP) model based on an artificial neural network (ANN) to forecast the solar irradiance with reference to grid-connected photovoltaic plants (GCPV). The researchers based their 24 h forecast on the contemporaneous values of the mean daily solar irradiance and air temperature. An experimental dataset gathered in Trieste, Italy, was used for this purpose. K-fold cross-validation was carried out to test the results. The results show that the proposed model has a high degree of accuracy, with a correlation coefficient in the range 98–99% for sunny days and in the range 94–96% for cloudy days [21].
Mijanur et al. (2021) proposed an ANN approach that has a high degree of accuracy in the prediction of renewable energy. Various neural network models, such as the multilayer perceptron (MLP), recurrent neural network (RNN), convolutional neural network (CNN), and long short-term memory (LSTM) models, were considered for the refinement of the predicted results. These models use prior information to inform future predictions. The results indicated optimal performance for short-term time series prediction [22].
Ahmad et al. (2020) proposed a forecasting framework using three classes of models: (i) machine learning algorithms; (ii) ensemble-based approaches; and (iii) artificial neural networks, taking data from wind, solar, and geothermal energy. The forecasting intervals are further subdivided into (i) short-term; (ii) medium-term; and (iii) long-term. Comparing these different models can help researchers and professionals choose a relevant model for their forecasting requirements and desired task [23].
Vanting et al. (2021) proposed a multivariable forecasting framework based on a hybrid model consisting of both convolutional and recurrent neural networks. Combining the two allows the model to learn as many features and patterns in the data as possible. Parameters such as historical consumption, weather, and day features are taken as input variables for the model. The proposed model is applicable to both demand-side and supply-side management for accurate forecasting [24].
Agarwal (2021) proposed that aggregated one-dimensional convolutional neural networks can be successfully modified to predict household consumption one week ahead with greater accuracy than a basic one-dimensional convolutional neural network model or a classical autoregressive integrated moving average model. The proposed aggregated convolutional neural network model was tested with a 4-year historical consumption dataset of a household. The analysis of the results showed that the proposed model gave a very promising reduction in root mean square error [25].
Yan et al. (2020) proposed a framework to predict the energy structure of China using a convolutional neural network (CNN) integrated with long short-term memory (LSTM). Historical data were the input for the CNN and were then encoded by the LSTM layers. To check the efficacy of the model, the results were compared with data from previous years, from 1965 to 2018. According to the analysis, the results were not only superior but also accurately predicted the information for the next decade [26].
Agga et al. (2021) proposed hybrid models that can effectively predict the power production of a self-consumption PV plant. The hybrid models are CNN-LSTM and ConvLSTM. An LSTM model was used as a reference for checking the efficiency of the models. The models were trained on two separate datasets: (1) a univariate dataset, which contained the power output of previous days, and (2) a multivariate dataset with weather features that impact the production of the PV plant. The results indicated that the proposed approach has a high degree of accuracy compared to a standard LSTM model [27].
Rai et al. (2020) proposed a hybrid model for accurate midterm solar radiation prediction. This model consists of a convolutional neural network (CNN) and bidirectional long short-term memory (BiLSTM). In the first step, features of solar radiation are captured by the CNN; in the second step, the BiLSTM uses time series data. The proposed model was tested on data from three separate locations and compared with other deep learning models. Measures of the error distribution, such as skewness and kurtosis, were also considered in assessing the distribution of predicted solar radiation. The results show that the proposed model is more accurate than other recently proposed deep learning models [28].
Rajugukguk et al. (2020) proposed a hybrid model based on recurrent neural network (RNN), long short-term memory (LSTM), gated recurrent unit (GRU), and convolutional neural network-LSTM (CNN–LSTM). These models were selected based on their performance with regard to the input data, accuracy, forecasting horizon, type of weather, and training time. The analysis revealed that this model has pros and cons, but it generated significant results in terms of the root-mean-square error (RMSE) evaluation metric. The authors suggest using RMSE as the evaluation metric when comparing accuracy between different studies [29].
Zhang et al. (2018) trained a deep learning model that learns the relationship between sky appearance and future photovoltaic power output. Several models, such as CNN, LSTM, and MLP, were trained on datasets gathered in Kyoto, Japan. The model takes previous photovoltaic power values and aerial imagery as input and predicts photovoltaic power in the short-term future. The results indicated that an LSTM-based model that can learn temporal features outperforms the rest of the deep learning models, achieving an RMSE skill score of 21% [30].
Table 1 summarizes the literature review.
The literature review shows that the application of CNN to automatically assess building location fitness for solar panels is still underdeveloped. This research is an attempt to cover the gap in this area.
3. Model Design and Methodology
The proposed model performs object detection by passing an image once through a convolutional neural network. It is intended to identify the fitness of a building’s location for the installation of solar panels. The model is divided into five stages:
3.1. Stage One: Selecting the Appropriate Media for Collecting Data
Stage one consists of selecting the appropriate media for collecting data. Several options for digital media collection can be considered, such as drones and satellite images. Additionally, the choice between starting from single-frame images or from a video as the basis for data labeling may depend on the unique challenges presented by the region of interest (ROI). Although the presented algorithm can be trained using single images or videos, the researchers focus on using drones to capture videos.
3.2. Stage Two: Data Capturing and Image Annotation
To collect data, the researchers suggest flying a drone over the area of interest for 30–60 min to capture a high-resolution video of the buildings under study. MATLAB’s Video Labeler app, which is available in the MATLAB Computer Vision Toolbox, will be used to extract images from the video. Once the drone captures the video, the next step is to import the video into the video labeling tool to create the labels for the classification. A point tracking algorithm follows the labeled points from frame to frame, so the ground truths are tracked and saved for each video frame. Once the entire video has been annotated, the next step is to extract each frame with its corresponding labels; the locations of the labels and ground truths are then written to a text file with the same name as the corresponding image frame.
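The last step of this stage, writing one label file per frame, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the paper does not specify the layout of the text file, so the sketch assumes the common YOLO convention of one line per object (`class x_center y_center width height`, normalized by the image size), and it assumes pixel boxes in `[x, y, width, height]` form. The function names and the class ids (0 for shaded, 1 for unshaded) are hypothetical.

```python
from pathlib import Path

def to_yolo_line(class_id, box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, width, height) to a normalized label line.

    Assumed layout: "class x_center y_center width height", all relative to image size.
    """
    x, y, w, h = box
    x_center = (x + w / 2) / img_w
    y_center = (y + h / 2) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {w / img_w:.6f} {h / img_h:.6f}"

def write_labels(frame_path, annotations, img_w, img_h, out_dir):
    """Write the labels of one frame to <out_dir>/<frame name>.txt.

    `annotations` is a list of (class_id, box) pairs for that frame.
    The text file shares its base name with the image frame.
    """
    out_path = Path(out_dir) / (Path(frame_path).stem + ".txt")
    lines = [to_yolo_line(c, b, img_w, img_h) for c, b in annotations]
    out_path.write_text("\n".join(lines) + "\n")
    return out_path
```

With this layout, a frame saved as `frame_0001.jpg` is paired with `frame_0001.txt`, which lets a training script match images and labels by file name alone.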
3.3. Stage Three: Transfer Learning
Transfer learning, in which knowledge stored while solving one problem is applied to another problem, is a core component of the current study. In this stage, transfer learning on the CNN is used to classify roofs of buildings into two categories: shaded and not shaded. You only look once (YOLO) has already been trained as a simple classifier to predict whether an image contains a dog or a cat; the researchers reuse the knowledge the model gained during its initial training to recognize other objects, such as shaded roofs. With the training data created, transfer learning can begin, starting with the choice of batch size, which is the number of training examples used in one iteration. In this methodology, the images are sent through a convolutional neural network (CNN) in batches of 32, and one epoch is completed each time the model has iterated through the entire dataset. The number of epochs depends on how many times the CNN needs to see the entire training dataset. The epochs should run until the model begins to overfit or the loss drops below 5%.
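The relationship between batch size, iterations, and epochs, together with the stopping rule, can be sketched in Python as follows. This is a sketch under stated assumptions rather than the training code used in the study: the text does not say how overfitting is detected, so the sketch assumes a common heuristic (the validation loss has not improved for `patience` consecutive epochs). The names `iterations_per_epoch`, `should_stop`, and `patience` are hypothetical.

```python
import math

def iterations_per_epoch(n_images, batch_size=32):
    """Number of batches needed to pass over the whole training set once."""
    return math.ceil(n_images / batch_size)

def should_stop(train_losses, val_losses, loss_target=0.05, patience=3):
    """Return True when training should end.

    Rule 1: the latest training loss is below the target (5% in the text).
    Rule 2 (assumed overfitting test): the validation loss has not improved
    on its earlier best value during the last `patience` epochs.
    """
    if train_losses and train_losses[-1] < loss_target:
        return True
    if len(val_losses) > patience:
        best_before = min(val_losses[:-patience])
        if min(val_losses[-patience:]) >= best_before:
            return True
    return False
```

For a training set of 1000 images, for instance, `iterations_per_epoch(1000)` gives 32 iterations, the last of which holds a partial batch.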
3.4. Stage Four: Model Validation
In stage four, the model is tested to determine whether sufficient prediction accuracy has been obtained. Model performance is measured by comparing the trained model’s identification of shaded roofs with a human’s identification of shaded roofs. The predictions are validated by randomly selecting 100 locations containing all classes and having the model make predictions. A human is shown the same 100 randomly selected locations and makes their own predictions. Comparing the model’s predictions with the human’s predictions demonstrates the accuracy of the model’s computer vision and machine learning. If the model does not reach human-level accuracy in the validation test, the subsequent step is to save the data and labels and create more training data to perform more transfer learning (go back to Stage one). If the model’s predictions are sufficient, the model moves to Stage five for deployment.
3.5. Stage Five: Model Deployment
Once the model has performed sufficient predictions, it is deployed to make predictions in the stochastic environment. Inevitably, the model will run into data that are incorrectly identified, or adversarial data may reduce the accuracy and prediction confidence. The model is designed to systematically capture new data, augment and label them, and perform transfer learning in the cloud, with the ability to transfer the updated weights and biases to the model.
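The comparison between the model and the human reviewer can be sketched in Python as follows. The sketch assumes that both sets of predictions are stored as lists of class labels in the same order as the sampled locations; the function names and the fixed random seed are hypothetical and are used only to make the sample reproducible.

```python
import random

def sample_locations(location_ids, k=100, seed=0):
    """Draw k distinct locations at random for the validation test."""
    rng = random.Random(seed)  # fixed seed so the same sample can be shown to the human
    return rng.sample(list(location_ids), k)

def agreement_rate(model_labels, human_labels):
    """Fraction of sampled locations where the model and the human agree."""
    if len(model_labels) != len(human_labels) or not model_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(m == h for m, h in zip(model_labels, human_labels))
    return matches / len(model_labels)
```

If the agreement rate falls short of the level the researchers regard as human-level, the workflow returns to data collection; otherwise the model proceeds to deployment.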
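One way to capture the data that feed the next round of transfer learning is to set aside low-confidence predictions for relabeling. The Python sketch below illustrates this idea only; the paper does not describe its capture rule, so the confidence threshold of 0.5, the function name `triage`, and the tuple layout `(image_id, label, confidence)` are all assumptions.

```python
def triage(predictions, min_confidence=0.5):
    """Split predictions into accepted results and a review queue.

    `predictions` is an iterable of (image_id, label, confidence) tuples.
    Items below `min_confidence` (an assumed threshold) are queued for
    human labeling and later transfer learning.
    """
    accepted, review_queue = [], []
    for image_id, label, confidence in predictions:
        target = accepted if confidence >= min_confidence else review_queue
        target.append((image_id, label, confidence))
    return accepted, review_queue
```

The review queue then plays the role of the newly captured data: once labeled, it is added to the training set before the weights are updated.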
Figure 1 shows a flowchart of the presented model.
4. Case Study
To illustrate an implementation of the presented model, the model has been applied to a selected area in Hurricane, WV. Stage one of the proposed model is to collect data. For this research, the research team used a drone to fly over and record a video of the region of interest (ROI). The researchers used a DJI Matrice 300 RTK drone equipped with a Zenmuse H20T camera that can zoom up to 20X, as shown in Figure 2.
Stage two is to upload the video recorded by the drone to MATLAB’s Video Labeling Tool, as shown in Figure 3.
Figure 3 shows the labeled houses. The blue labels are the shaded houses, and the orange labels are the unshaded houses.
For this study, the images are extracted from the recorded video based on two categories: shaded houses and unshaded houses, as shown in Figure 4. A house on a hill or in a less shaded area will receive more direct sunlight and generate more solar energy than a house in a shaded area. If 20% or more of the house roof is shaded, the house is labeled as shaded. If less than 20% of the roof is shaded, the house is labeled as unshaded.
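The 20% labeling rule can be expressed as a short Python function. This is a minimal sketch that assumes the shaded fraction of the roof has already been estimated as a number between 0 and 1; the paper does not say how that fraction is measured, and the function name `label_roof` is hypothetical.

```python
def label_roof(shaded_fraction, threshold=0.20):
    """Label a roof from the fraction of its area that is shaded.

    A roof with 20% or more of its area shaded is "shaded";
    anything below the threshold is "unshaded".
    """
    if not 0.0 <= shaded_fraction <= 1.0:
        raise ValueError("shaded_fraction must be between 0 and 1")
    return "shaded" if shaded_fraction >= threshold else "unshaded"
```

Note that the boundary case belongs to the shaded class: a roof that is exactly 20% shaded is labeled "shaded".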
In Stage three, the proposed CNN model was trained to predict shaded and unshaded houses. Figure 4A shows a shaded house, where the many trees surrounding the house reduce the amount of solar energy that a PV installation could collect. Figure 4B shows an unshaded house, where no tree covers the rooftop, which means more solar energy will be collected. Each prediction comes with an inference on whether a house is a good candidate for solar energy production.
The proposed YOLOv3 has 75 convolutional layers, including skip connections and up-sampling layers. It is a fully convolutional network that passes an image once through the model, with the output being the predicted classifications for object detection. The prediction accuracy is only as good as the data that are provided to train the model. If the training data are inadequate in size, training will not converge, and the loss will not drop below 5%.
Figure 5 shows the proposed YOLOv3 model summary.
The researchers were able to extract 8847 images from the video captured by the drone. In total, 6052 (68%) of the images were used for training, and 2795 (32%) were used for validation. The model was trained for a few hours, and after 1875 iterations, the model’s loss dropped to 0.05, as shown in Figure 6.
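The split percentages, and a rough estimate of how many passes over the training set 1875 iterations represent, can be checked with a few lines of Python. The image counts and the iteration count come from the case study; the batch size of 32 is the value proposed in Stage three and is assumed here to apply to this training run. The function names are hypothetical.

```python
def split_percentages(n_train, n_val):
    """Return the training and validation shares as whole percentages."""
    total = n_train + n_val
    return round(100 * n_train / total), round(100 * n_val / total)

def passes_over_training_set(iterations, n_train, batch_size=32):
    """Approximate number of epochs: images processed divided by set size.

    Assumes every iteration processes a full batch of `batch_size` images.
    """
    return iterations * batch_size / n_train

train_pct, val_pct = split_percentages(6052, 2795)    # 6052 + 2795 = 8847 images
epochs_seen = passes_over_training_set(1875, 6052)     # 60000 images / 6052
```

Under these assumptions, the reported 1875 iterations correspond to a little under ten passes over the 6052 training images.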
In Stage four, the researchers tested the model accuracy by comparing the model’s predictions with a human’s predictions. A sample of the same images that had been used to train the model was sent to a graduate student who had never seen the images. The student went through 250 randomly selected photos and categorized them as shaded or unshaded. The researchers compared the student’s predictions with the model’s results. The model’s accuracy is found by dividing the number of correct predictions by the total number of predictions. The model accuracy is 0.91 (91%), that is, 91 correct predictions for every 100 examples. Out of the 91 unshaded houses, the model correctly identifies 90 as unshaded.
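The accuracy calculation, along with a per-class breakdown of the kind implied by the count of correctly identified unshaded houses, can be sketched in Python as follows. The sketch treats the human labels as the reference; the function names are hypothetical, and the short label lists in the usage note are illustrative toy data, not the study’s results.

```python
from collections import Counter

def confusion_counts(predicted, actual):
    """Count (actual, predicted) label pairs."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("label lists must be non-empty and of equal length")
    return Counter(zip(actual, predicted))

def accuracy(counts):
    """Correct predictions divided by the total number of predictions."""
    total = sum(counts.values())
    correct = sum(n for (a, p), n in counts.items() if a == p)
    return correct / total

def recall(counts, label):
    """Share of houses with reference class `label` that the model identified."""
    actual_total = sum(n for (a, _), n in counts.items() if a == label)
    return counts[(label, label)] / actual_total
```

As a toy example, with reference labels `["unshaded", "unshaded", "unshaded", "shaded", "shaded"]` and model labels `["unshaded", "unshaded", "shaded", "shaded", "shaded"]`, four of five predictions match, and two of the three unshaded houses are identified as unshaded.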
Once the model’s accuracy was established, the researchers moved to Stage five. In Stage five, the model is deployed to be tested on data unseen by the model. Here, 2795 new photos (never seen by the model) were used to validate the model.
Figure 7 shows the region of interest for the new data. The model systematically captured the new data, labeled them, and performed transfer learning in the cloud. The accuracy of the model performance in a stochastic environment has been tested, and Figure 8 shows the model accuracy.