Texture Is Important in Improving the Accuracy of Mapping Photovoltaic Power Plants: A Case Study of Ningxia Autonomous Region, China

Photovoltaic (PV) technology is becoming more popular in response to climate change because it can replace fossil-fuel power generation and thereby reduce greenhouse gas emissions. Consequently, many countries have been building PV power plants to generate electricity over the last decade. Monitoring PV power plants with satellite imagery, machine learning models, and cloud-based computing systems, which together can locate plants rapidly and precisely and report their current status on a regional basis, is crucial for environmental impact assessment and policy formulation. The effect of fusing spectral features, textural features with different neighbor sizes, and topographic features on machine learning accuracy has not yet been evaluated for mapping PV power plants. This study mapped PV power plants using a random forest (RF) model on the Google Earth Engine (GEE) platform. As input variables, we combined textural features calculated from the Grey Level Co-occurrence Matrix (GLCM); reflectance and thermal spectral features; the Normalized Difference Vegetation Index (NDVI), Normalized Difference Built-up Index (NDBI), and Modified Normalized Difference Water Index (MNDWI) from Landsat-8 imagery; and elevation, slope, and aspect from the Shuttle Radar Topography Mission (SRTM). We found that the textural features from GLCM prominently enhance the accuracy of the random forest model in identifying PV power plants, with a neighbor size of 30 pixels giving the best model performance. The addition of texture features can improve model accuracy from a Kappa statistic of 0.904 ± 0.05 to 0.938 ± 0.04 and overall accuracy from 97.45 ± 0.14% to 98.32 ± 0.11%. The topographic and thermal features contribute only a slight improvement in modeling. This study extends the knowledge of the effect of various variables in identifying PV power plants from remote sensing data. The texture characteristics of PV power plants at different spatial resolutions deserve attention.
The findings of our study have great significance for collecting the geographic information of PV power plants and evaluating their environmental impact.


Introduction
Solar energy is the most commonly available renewable energy source with a great potential to replace fossil fuels while reducing greenhouse gas (GHG) emissions to limit climate change [1,2]. Photovoltaic (PV) technology can convert solar energy directly into electricity with large arrays of solar panels [3]. With PV technology and industry development, the cost of electricity generated by PV power plants has declined to the same level as that generated by traditional fossil-fuel power plants [4]. According to the International Energy Agency (IEA), the global installed PV capacity has increased from about 1.25 GW in 2000 to more than 627 GW in 2019. With the establishment of the carbon neutrality goal of the majority of countries worldwide, the generation capacity of PV power its cloud computing ability. The RF is an ensemble learning method that uses a set of decision trees for classification and regression tasks with advantages of high precision, efficiency, and stability [53]. The L-8 is a medium-resolution multispectral satellite that has the advantage of providing free images worldwide with spatial resolution at 30 m, a revisiting period of 16 days, and multispectral information simultaneously to map PV power plants.
Consequently, the main goals of this research were to (1) evaluate the effect of textural variables of different neighborhood sizes on the RF model's performance, (2) evaluate the effect of topographic and thermal spectral variables on the RF model's performance, and (3) map the utility-scale photovoltaic power plants based on the optimized model. The main novelty of our work is evaluating textural, topographic, and thermal variables in an RF model to classify PV power plants with medium-resolution multispectral satellite imagery. Our study has great significance for collecting the geographic information of PV power plants and evaluating their environmental impact.

Study Area
This study classified utility-scale PV power plants, defined as those with a generation capacity over 1 MW or an area exceeding 0.21 km² [17], in the Ningxia autonomous region (hereafter Ningxia), China. Ningxia is located in the arid area of northwest China, has abundant solar energy resources (Figure 1a), and is at the forefront of installed PV power plant capacity. The elevation difference within Ningxia is about 1000 m. The diverse landscape types, such as mountain, river, desert, plain, forest, shrubland, grass, and farmland, also make Ningxia an ideal region to test the model's performance for the classification of PV power plants.

Landsat-8 Surface Reflectance Imagery and Composite Image
The L-8 surface reflectance (SR) product was used in this study [54]. The total number of L-8 scenes used in this study is 234. The Operational Land Imager (OLI) and the Thermal Infrared Sensor (TIRS) are the two instruments aboard L-8. The OLI has seven reflective bands with a 30 m spatial resolution and a panchromatic band with a 15 m spatial resolution. The TIRS provides two thermal infrared bands at a spatial resolution of 100 m. In this study, we used six reflective bands, namely blue, green, red, near-infrared (NIR), and two shortwave infrared bands (SWIR1 and SWIR2), and two thermal infrared, or brightness temperature (BT), bands, which are named B2, B3, B4, B5, B6, B7, B10, and B11 in L-8 imagery, respectively. L-8 SR products have been atmospherically and topographically corrected. Using the pixel quality control band integrated with the product, we removed the pixels contaminated by clouds and shadows in each image (only keeping the pixels with a quality control value equal to 0). We then composited the L-8 images by taking the per-pixel median of each of the six reflective bands and two thermal bands over 2020. Because it draws on all scenes acquired in one year, the composite image is robust against extreme values and provides enough information [55].
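The masking and compositing step described above can be illustrated with a minimal sketch in plain Python. It is not the GEE workflow used in the study: the function name `median_composite`, the list-based data layout, and the toy reflectance values are assumptions made only for illustration. The sketch keeps observations whose quality flag equals 0 and takes the per-pixel median of what remains.

```python
from statistics import median


def median_composite(scenes, qa_flags):
    """Per-pixel median over scenes, using only observations flagged clear.

    scenes   : list of scenes, each a list of pixel values for one band
    qa_flags : same shape as scenes; 0 marks a clear observation
    Returns one value per pixel, or None where no clear observation exists.
    """
    n_pixels = len(scenes[0])
    composite = []
    for p in range(n_pixels):
        # Keep only the clear observations of this pixel across all scenes.
        clear = [scene[p] for scene, qa in zip(scenes, qa_flags) if qa[p] == 0]
        composite.append(median(clear) if clear else None)
    return composite


# Toy data (hypothetical): three scenes, four pixels, one band.
scenes = [
    [0.10, 0.20, 0.90, 0.30],
    [0.12, 0.22, 0.31, 0.95],
    [0.11, 0.80, 0.29, 0.32],
]
qa_flags = [
    [0, 0, 1, 0],  # third pixel cloudy in scene 1
    [0, 0, 0, 1],  # fourth pixel cloudy in scene 2
    [0, 1, 0, 0],  # second pixel cloudy in scene 3
]
composite = median_composite(scenes, qa_flags)
```

In this toy case the bright, contaminated values (0.90, 0.95, 0.80) are excluded before the median is taken, which is why the composite stays close to the clear-sky reflectance of each pixel.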

Google Earth Engine Cloud Computing Platform and Random Forest Classification
Identifying the fast growth of PV power plants on a regional scale requires extensive computing and storage resources. The Google Earth Engine (GEE), a cloud geospatial computing platform with flexible programming that supports massive remote sensing data and multiple machine learning methods [56], is an appropriate tool to overcome these computing and storage difficulties. With GEE's support, researchers in the remote sensing community have completed numerous classification works on continental to planetary scales [48,57-66]. As a result, we used GEE to train and evaluate the model and classify the PV power plants in this study.
We used a pixel-based RF algorithm on the GEE to map the PV power plants in this study. The RF classifier is an ensemble classifier that predicts with a set of decision trees and offers high precision, efficiency, and stability; it is also less sensitive than other machine learning classifiers to the quality of training samples and to overfitting [47,53]. The RF classifier has also been shown to outperform other commonly used machine learning classifiers on the GEE platform [67,68]. In this study, we set the number of trees to 500 for the RF classifier [69,70]. We left the remaining parameters at their default settings on GEE.

Training and Validation Samples Collection
Training an applicable RF model requires massive training samples to cover as much of the system parameter space as possible. The RF classifier is sensitive to the sampling design [53]. Suitable training samples could ensure classification accuracy and stable performance of an RF-trained model. In this study, we collected and labeled data as PV region and non-PV region. We primarily collected the PV sample dataset from Dunnett's dataset in Ningxia, a global solar plants dataset annotated by volunteers [26]. The pixels on the edge of the PV power plants mixed by PV panels and non-PV features could weaken the classification model and affect the result. As such, we further manually modified this dataset by visual interpretation with Google Earth's background in 2017 to ensure the PV samples were located inside the PV power plants.
We manually selected and edited the extent of some extra PV power plants that were not annotated in Dunnett's dataset by visual interpretation against the Google Earth background. We stored this dataset as polygon vectors and then sampled points from the polygons. We collected non-PV region samples from (1) regions adjacent to PV power plants within five-kilometer buffers, (2) manually selected typical land types, including cropland, forest, water, urban area, and barren area, and (3) the whole Ningxia autonomous region. In total, we prepared 4000 points labeled as PV region and 20,000 points labeled as non-PV region in this study (Figure 1b). Finally, we randomly chose 75% of the total points as the training set and the remaining 25% as the validation set. We repeated this hold-out selection of training and validation sets ten times to eliminate the impact of sampling differences on model assessment [71,72].
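The repeated hold-out design can be sketched as follows. This is an illustrative simplification, not the study's GEE code: the function name `repeated_holdout`, the fixed seed, and the use of simple (unstratified) random shuffling are assumptions, since the text does not state how the random draw was implemented.

```python
import random


def repeated_holdout(n_points, train_fraction=0.75, repeats=10, seed=0):
    """Return `repeats` random (train_indices, validation_indices) pairs."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    indices = list(range(n_points))
    n_train = int(round(train_fraction * n_points))
    splits = []
    for _ in range(repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # First 75% of the shuffled points train the model, the rest validate it.
        splits.append((shuffled[:n_train], shuffled[n_train:]))
    return splits


# 4000 PV points + 20,000 non-PV points, as reported in the text.
labels = [1] * 4000 + [0] * 20000
splits = repeated_holdout(len(labels))
train_idx, valid_idx = splits[0]
```

Each of the ten splits yields 18,000 training points and 6,000 validation points; accuracy statistics reported as mean ± spread would then be computed across the ten repetitions.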

Variables Estimation
We separated the variables into four groups: reflectance spectra (G1), gray texture (G2), topography (G3), and thermal infrared spectra (G4). The variables from reflectance spectra include six bands and three calculated indices. The six bands were blue, green, red, near-infrared, and two shortwave infrared bands. The three indices were the Normalized Difference Vegetation Index (NDVI) [73], the Normalized Difference Built-up Index (NDBI) [74], and the Modified Normalized Difference Water Index (MNDWI) [75] (Table 1). The NDVI, NDBI, and MNDWI are sensitive to variations of vegetation, buildings, and water, respectively, and are commonly used as variables in RF models to classify land cover types [76-78]. We computed the texture variables from the Gray Level Co-occurrence Matrix (GLCM) of the gray image scaled from the NIR band. The NIR band could help recognize the texture of PV power plants because vegetation and sand have high reflectance in the NIR band, whereas solar panels have low reflectance [31]. The GLCM is a matrix that tallies how frequently pairs of gray values co-occur among neighboring pixels within a neighborhood and normalizes these counts to probabilities, from which various statistical texture measures can be calculated. We set the neighborhood sizes of GLCM to 1, 5, 10, 15, 20, 25, 30, 35, and 40 pixels in this study. We selected eight variables from GLCM, including angular second moment, contrast, correlation, entropy, variance, inverse difference moment, sum variance, and dissimilarity [39,41].
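To make the GLCM idea concrete, the following is a minimal pure-Python sketch, not the GEE routine used in the study. It assumes a single horizontal offset, a symmetric matrix, and an image already quantized to a small number of gray levels; how the platform quantizes the NIR band and combines directions is not detailed in the text. The function names `glcm` and `texture_measures` are hypothetical, and only five of the eight selected measures are shown.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_measures(p):
    """A few standard texture statistics from a normalized GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "asm": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "idm": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }


# Toy 4 x 4 patches with four gray levels (hypothetical values).
striped = [[0, 3, 0, 3]] * 4  # regular alternating pattern
flat = [[1, 1, 1, 1]] * 4     # homogeneous surface
striped_stats = texture_measures(glcm(striped, levels=4))
flat_stats = texture_measures(glcm(flat, levels=4))
```

The regularly striped patch gives a high contrast and a lower angular second moment than the homogeneous patch, which is the kind of difference the RF model can exploit when a window covers a regularly laid-out site.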
Topography variables included elevation, slope, and aspect calculated from the Shuttle Radar Topography Mission (SRTM) DEM [79].
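The three spectral indices in the reflectance group (G1) share one normalized-difference form. The sketch below assumes the standard band pairings from the literature (NIR and red for NDVI, SWIR1 and NIR for NDBI, green and SWIR1 for MNDWI); the exact expressions used in the study are those of Table 1. The pixel values are hypothetical, chosen to resemble a low-reflectance, panel-like surface.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 when both inputs are zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def spectral_indices(green, red, nir, swir1):
    """NDVI, NDBI, and MNDWI from surface reflectance (standard pairings assumed)."""
    return {
        "ndvi": normalized_difference(nir, red),
        "ndbi": normalized_difference(swir1, nir),
        "mndwi": normalized_difference(green, swir1),
    }


# Hypothetical reflectance values for a dark, panel-like pixel.
pv_like = spectral_indices(green=0.04, red=0.04, nir=0.05, swir1=0.10)
```

For this illustrative pixel the NDVI is low, the NDBI is positive, and the MNDWI is negative, a combination that differs from dense vegetation (high NDVI) and open water (positive MNDWI).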

Model Assessment
We evaluated the pixel-based RF model by out-of-bag error (OOB). OOB is a method of measuring the internal prediction error of RF utilizing bootstrap aggregating (bagging) [80].
We also evaluated the variable importance in the model measured by the mean decrease in Gini (MDG). The Gini index measures the node impurity, and the MDG measures the average of the total decrease in node impurities from splitting on the variable [81-83]. The variable importance measures are used as variable selection criteria to reduce variables and improve classification results.
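The quantity that the MDG accumulates can be sketched for a single split. This is a simplified illustration rather than the importance computation of any particular RF implementation: in a full forest, such decreases are weighted by node size and combined over every split on a variable and over all trees. The function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(parent, left, right):
    """Impurity decrease achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


parent = [1, 1, 0, 0]                                 # two PV, two non-PV samples
perfect = gini_decrease(parent, [1, 1], [0, 0])       # split separates the classes
useless = gini_decrease(parent, [1, 0], [1, 0])       # split separates nothing
```

A variable whose splits resemble the first case accumulates a large decrease and ranks high; a variable whose splits resemble the second case contributes almost nothing.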
We further evaluated the model performance by classifying the validation set with the trained RF model. From the confusion matrix of classified versus labeled points in the validation set, we calculated the kappa coefficient, overall accuracy (OA), producer's accuracy (PA), and user's accuracy (UA) to assess the performance of a model [84]. The kappa coefficient measures the agreement between classification and reference beyond what is expected by chance and is widely used to evaluate model performance. The overall accuracy is the proportion of all validation samples that are classified correctly. The producer's accuracy is the proportion of reference samples of the target class that are correctly classified as that class. The user's accuracy is the proportion of samples classified as the target class on the map that truly belong to that class.
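These four measures follow directly from the confusion matrix, as the following sketch shows. The function name `accuracy_metrics`, the row/column convention (rows are reference classes, columns are classified classes), and the counts in the example matrix are assumptions for illustration; the counts are not results from this study.

```python
def accuracy_metrics(matrix):
    """OA, kappa, PA, and UA from a square confusion matrix.

    matrix[i][j] = number of validation points whose reference class is i
                   and whose classified class is j.
    """
    k = len(matrix)
    n = sum(map(sum, matrix))
    row_totals = [sum(matrix[i]) for i in range(k)]                       # reference
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # classified
    correct = sum(matrix[i][i] for i in range(k))

    oa = correct / n
    # Chance agreement expected from the marginal totals.
    pe = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (oa - pe) / (1 - pe)
    pa = [matrix[i][i] / row_totals[i] for i in range(k)]
    ua = [matrix[i][i] / col_totals[i] for i in range(k)]
    return {"oa": oa, "kappa": kappa, "pa": pa, "ua": ua}


# Hypothetical validation result: class 0 = PV, class 1 = non-PV.
example = accuracy_metrics([[900, 100],
                            [50, 4950]])
```

With unbalanced classes such as these, OA is dominated by the large non-PV class, whereas kappa and the PV producer's accuracy respond more strongly to missed PV pixels.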
We also used the McNemar test to assess whether the difference in classification results is significant [85]. The McNemar test is based on a chi-square (χ²) statistic calculated from a 2 × 2 matrix of the correctly and incorrectly classified pixels of the two classification results, computed as follows:

$$\chi^2 = \frac{(f_{12} - f_{21})^2}{f_{12} + f_{21}}$$

where $f_{12}$ is the number of pixels correctly classified by method one while incorrectly classified by method two, and $f_{21}$ is the number of pixels correctly classified by method two while incorrectly classified by method one.
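A minimal sketch of the test is given below, assuming the uncorrected form χ² = (f12 − f21)² / (f12 + f21) with one degree of freedom; whether a continuity correction was applied in the study is not stated. The study ran the test in R, so this Python version is only an illustration, and the function name and example counts are hypothetical.

```python
import math


def mcnemar(f12, f21):
    """McNemar chi-square statistic (no continuity correction) and its p-value.

    f12 : points correct under method one but wrong under method two
    f21 : points correct under method two but wrong under method one
    """
    if f12 + f21 == 0:
        return 0.0, 1.0  # the two methods never disagree
    chi2 = (f12 - f21) ** 2 / (f12 + f21)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical disagreement counts between two variable sets.
chi2, p_value = mcnemar(f12=40, f21=15)
```

Only the points on which the two classifications disagree enter the statistic, so two models with similar overall accuracy can still differ significantly if their errors fall on different points.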
The workflow of our methodology is shown in Figure 2. The L-8 and DEM datasets, raster calculation, GLCM calculation, RF classifier, kappa, OA, PA, and UA calculated from the confusion matrix are available on the GEE platform. The McNemar test was calculated in the R language.

The Effect of GLCM Neighbor Sizes on the Performance of the Random Forest Model
By comparing the model trained with variables from reflectance to the model trained with additional textural variables, we discovered that the additional textural variables positively impact the model's performance in identifying PV power plants (Figure 3). The additional textural variables with different neighbor sizes, except the size of 1, significantly improved the model's performance (p < 0.01) to different degrees, decreasing the OOB of the model and increasing kappa values and accuracies from validation sets. The variations in OOB were consistent with the variations in kappa and OA among the models with different variable sets: the OOB decreased as the kappa coefficient and overall accuracy increased. The model's kappa values and OA increased as the neighbor sizes increased from 1 to 30, peaked at 30, and then remained relatively constant or even dropped as the neighbor sizes exceeded 30. Similar to the variations of kappa and OA, the UA and PA of PV power plants and non-PV power plants all reached their maximum values at neighbor sizes of 30 to 40 (Figure 3d-g). Given the extra computation required by larger neighbor sizes, we deemed that the textural variables with a neighbor size of 30 fitted the model best. Compared with the model trained with only variables from reflectance (G1), the additional textural variables (G2) with the neighbor size of 30 decreased the model's OOB by 0.78%, from 2.47% to 1.69%, increased the model's kappa by 0.034, from 0.904 to 0.938, and increased the model's OA by 0.88%, from 97.45% to 98.33% (Table 2).
Note to Table 2: Out-of-bag error (OOB), kappa coefficient, overall accuracy (OA), producer's accuracy (PA), and user's accuracy (UA), Reflectance (G1), Texture (G2), Topography (G3), Thermal (G4). The detailed information of variables can be found in Table 1.
We further compared the variable importance in the models trained with textural variables of different neighbor sizes. The result showed that the summed importance of the textural variables increased from 38.06% to 45.37% as the neighbor size increased from 1 to 40 (Figure 4b). Among the textural variables, the sum average from GLCM consistently ranked at the top, whereas the ranks of the other textural variables changed slightly from one neighbor size to another. However, the importance of every textural variable increased as the neighbor size increased from 1 to 40.

The Effect of Topography and Thermal Spectra on the Performance of the Random Forest Model
After determining the best neighbor size of GLCM textural variables, we further evaluated the effect of topography on the model's performance, starting from the reflectance variables and the textural variables with a neighbor size of 30. The result showed that the topographic variables further improved the performance of the model: they significantly decreased the model's OOB by 0.11%, from 1.69% to 1.58% (p < 0.01), increased the model's kappa value by 0.004, from 0.938 to 0.942 (p < 0.01), and increased the model's OA by 0.11%, from 98.33% to 98.44% (p < 0.01). It is worth noting that the improvement of kappa and OA could be ascribed to the improvement of the producer's accuracy of PV power plants, which increased by 0.77%, from 91.37% to 92.14% (p < 0.01).
We also evaluated the effect of the thermal spectra (the BT bands of L-8) on the model's performance, since the land surface temperature (LST) of PV power plants differs from that of their adjacent regions [13,52]. However, the extra variables from thermal spectra did not significantly improve the classification performance of the model. Figure 5 shows the variable importance in the model trained with variables from reflectance spectra, texture (neighbor size of 30), topography, and thermal spectra. We found that NDBI, NDVI, and MNDWI ranked as the 1st, 2nd, and 6th most important variables, respectively. Apart from the three indices, the top three variables from individual reflectance spectra were NIR, SWIR1, and green, ranked 4th, 12th, and 13th, respectively. Among the textural variables, sum variance, variance, and correlation ranked 5th, 8th, and 9th of all variables. The importance of the topographic variables differed widely: elevation ranked 3rd, while aspect ranked at the bottom of all variables. Lastly, the two thermal bands ranked 14th and 16th.

Classified Map of PV Power Plants
We mapped the PV power plants in the Ningxia autonomous region with the pixel-based RF model with different variable sets (Table 2) from L-8 composite images. We used the training set with the best performance from the ten-time random hold-out sampling. We also calculated the areas mapped by the model with different variables and applied a McNemar test between different variable sets using the validation dataset. The classified area of PV power plants in the Ningxia autonomous region ranged from 343.70 to 432.80 km² (Table 3). Based on the McNemar test, Table 4 shows that the classified results based on extra G2 variables were significantly different from those based on only G1 variables. The classified results based on extra G3 or G3 and G4 variables were not significantly different from the classified result based on G1 and G2 variables. We also calculated the area of PV power plants labeled in Dunnett's harmonized global dataset [26], which covered 71.76 km². Our RF model can provide more information about the distribution of PV power plants than the published dataset. We developed an interactive online app on the GEE platform to show our classified results in China's Ningxia autonomous region (Figure 6). The app is available at https://xunhezhang.users.earthengine.app/view/ningxia-pv-power-plants (accessed on 29 September 2021).

The Importance of Textural Variables in the RF Model to Map PV Power Plants
Reflectance spectra are the essential information used to extract ground features from remote sensing. PV panels are made of monocrystalline or polycrystalline silicon, and their surface is covered by transparent materials such as glass and ethylene-vinyl acetate (EVA). Because of the characteristics of these materials, PV panels have a relatively low reflectance (less than 0.05) in the visible and near-infrared portion of the spectrum (0.45 to 0.90 µm). They have relatively higher reflectance of about 0.10 and 0.07 at the wavelengths of 1.6 µm and 2.2 µm, respectively [31]. The spectral characteristics of PV power plants are quite different from those of other ground features, such as desert, water, crops, and buildings. Thus, spectra are the primary variables for identifying PV power plants. The reasonably good performance of the RF model trained with variables from reflectance (G1) also suggests that spectral information is essential for identifying PV power plants.
However, PV power plants are a mixture of PV arrays, shadows, and different types of soil or vegetation [14]. The mixture inevitably leads to PV power plants having spectra similar to those of other objects at the pixel scale over large regions [27,31]. As a result, features beyond the spectrum are also essential for improving the accuracy of mapping PV power plants. Spatial autocorrelation commonly exists among the pixels of ground features. Texture, as a pattern of spatial autocorrelation, is crucial for recognizing ground features in remote sensing [38]. The regular spatial distribution of PV arrays, roads, generation facilities, and even construction scale produces texture features in a PV power plant (Figure 7b). To test the texture of PV power plants in L-8 imagery, we examined the effect of eight texture variables calculated from GLCM with different neighbor sizes on the RF model. The statistics derived from texture are closely related to the neighborhood or window size in GLCM techniques. A window that is too large or too small fails to make full use of texture information to improve the model's performance [42,86]. The acquisition of texture information is also related to the resolution of remote sensing images [86]. In our study, the pixel resolution of L-8 imagery is 30 m. The texture arising from the width of PV panels and the distance between them, which are only a few meters, can hardly be detected by the L-8 sensor. The result that texture with a neighbor size of 1, equal to a moving window of 3 by 3 pixels, has little effect on the model's accuracy indicates that the resolution of L-8 is too coarse to capture the texture produced by the PV panels and the spaces between them. In our study, the model's best neighbor size for GLCM texture was about 30 pixels, equal to a moving window of 61 by 61 pixels, or a square of about 1830 m by 1830 m at the 30 m pixel size.
The texture of PV power plants that the L-8 sensor can detect can only be produced on a similar scale. This texture comes from the regular distribution of the roads and generation facilities, such as buildings for inverters and controllers (Figure 7b). Our result showed that PV power plants have prominent texture characteristics over hundreds of meters that a satellite platform can detect with medium-resolution imagery. The additional textural variables with a neighbor size of 30 increased the model's overall accuracy from 97.45% to 98.33%. Our finding demonstrates the importance of textural variables in recognizing PV power plants. Nevertheless, although textural variables from such a large moving window increase the overall accuracy, they may not help identify PV power plants that lack texture information because their construction areas are much smaller than the moving window.
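The relation between GLCM neighbor size and ground footprint used in this section (a neighbor size of 1 gives a 3 by 3 window, and 30 gives 61 by 61) can be checked with a short sketch. The function name `glcm_window` is hypothetical, and the 30 m default is the L-8 pixel size stated in the text.

```python
def glcm_window(neighbor_size, pixel_size_m=30):
    """Window width in pixels and in meters for a given GLCM neighbor size."""
    width_px = 2 * neighbor_size + 1  # neighbors on both sides plus the center pixel
    return width_px, width_px * pixel_size_m


# Footprints for the neighbor sizes tested in the study.
footprints = {n: glcm_window(n) for n in (1, 5, 10, 15, 20, 25, 30, 35, 40)}
```

For a neighbor size of 30 the window spans 61 pixels, or 1830 m on a side, which is far larger than the spacing of individual panel rows and closer to the scale of roads and facility layouts.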

The Effect of Topographic Variables and Thermal Infrared on the RF Model to Map PV Power Plants
Theoretically, topographic variables do not improve classification accuracy in regions with homogeneous topography, such as plains. Additionally, construction standards for PV power plants are less specific about elevation than about slope and aspect [21,49]. PV power plants are mainly built in areas with gentle slopes. In hilly or mountainous areas of the northern hemisphere, the terrain chosen for PV power plants should also face south where possible to receive more solar energy.
However, the topographic variables positively affect the model's accuracy to map the PV power plants. The importance of elevation is much higher than that of the slope and aspect from variable importance. The result is related to the local topography and distribution of the PV power plants. In Ningxia, the PV power plants are typically built in low-elevation and flat areas rather than high-elevation and steep areas. As a result, the model also tended to identify these pixels under similar topographic conditions as PV power plants in Ningxia based on the training samples. The elevation of the PV power plants in other regions varies greatly [13]. As a result, the variable of elevation may even decrease the generalization ability in other regions. The impact of topographic variables on the accuracy of the RF model used to classify PV power plants in various regions warrants further investigation. Additionally, due to the requirements of the design standards for PV power plants on topographic factors, setting the threshold of topographic factors may be more effective in excluding non-PV areas.
The thermal bands, which record BT and can be used to retrieve LST, also provide important information for classifying various land covers [87]. The land surface temperature of PV power plants is different from that of the surrounding features [13,52]. However, our results suggest that the thermal bands from L-8 imagery have little effect on the model's accuracy in Ningxia. The surface temperatures of the PV power plants and other ground objects are more likely to be similar over a large region. Compared with other ground features, PV power plants are less distinctive in their thermal characteristics than in their reflectance characteristics. This result may also be due to the coarser spatial resolution of the thermal infrared bands (100 m) compared with the other six spectral bands (30 m), which weakened the surface temperature information.

The Different Platforms to Map the PV Power Plants
Nowadays, some satellite and sensor platforms, such as Sentinel-2 and Worldview-3, can also provide multispectral images that are comparable to those of Landsat-8 but have higher spatial resolution [88,89]. These sensors can observe more details inside the PV power plants and capture various textural features at different spatial resolutions. However, acquiring higher spatial resolution images entails higher costs for data storage and analysis. High spatial resolution images have the potential to identify small PV arrays, such as the distributed PV arrays on the roofs of buildings in urban areas.
Additionally, synthetic aperture radar (SAR) sensors, such as Sentinel-1, which can acquire imagery globally regardless of the weather, can potentially identify PV arrays [90,91]. The difference between PV arrays and other ground features in how they reflect electromagnetic waves in different directions can provide additional information for the machine learning model. As a result, these remote sensing data sources are worth exploring in future studies.

Conclusions
With global climate change, PV power plants are rapidly expanding. Rapid and accurate mapping of solar power facilities is critical for policy management and environmental assessment. This study evaluated the effect of textural variables of different neighborhood sizes and topographic and thermal spectral variables on an RF model's performance to identify the PV power plants with L-8 imagery. We demonstrate that the variables of texture positively affect the RF model's ability to identify PV power plants. The textural variables with a neighbor size of 30 fitted the model best with the L-8 imagery. The effect of topographic variables on the model's ability to classify PV power plants in various regions warrants further investigation. The extra variables from thermal spectra had little effect on the performance of the model.
Our study extends the knowledge of the effect of various variables in identifying PV power plants. We also provide an example of using the GEE platform to evaluate the RF model's performance to classify ground features in large regions. Our research is of great significance for collecting the geographic information of PV power plants and further evaluating their environmental effects.