4.1. Coordinate Transformation Formula
In the solar location part of the algorithm (see Section 3.1), the coordinate transformation from the horizontal coordinates to the pixel coordinates was established. The time and location information were used to calculate the SZA and SAA, and the coordinate transformation was then used to obtain the solar position in the images.
The coordinate transformation was constructed by polynomial surface fitting. The candidate polynomials were determined from the row and column scatter plots so as to cover every combination of SZA and SAA orders from the first to the fifth. Two evaluation indicators, namely the root mean square error (RMSE) and the adjusted coefficient of determination (Adjusted R-square, $R^2_{adj}$), were selected to find the best polynomials and were defined as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$R^2_{adj} = 1 - \frac{(n-1)\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{(n-p-1)\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

where $y_i$ is the actual value of the i-th sample, $\hat{y}_i$ is the predicted value of the i-th sample, $\bar{y}$ is the average of the actual values, p is the number of explanatory variables, and n is the sample size. The RMSE measures the dispersion of the samples about the fitted surface, and a value close to zero indicates a good fit. The Adjusted R-square typically ranges from 0 to 1, and a value closer to 1 indicates a better fit of the model to the samples. In this work, the RMSE and the Adjusted R-square described the fitting error of the row and column pixel coordinates. The units of row and column are pixels.
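The two indicators can be sketched in a few lines of NumPy. This is only an illustration that assumes the standard definitions (mean squared residual under the root, and the usual (n − 1)/(n − p − 1) correction); the function names and the sample values are hypothetical and are not taken from Table 1 or Table 2.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def adjusted_r2(y_true, y_pred, p):
    """Adjusted R-square for a model with p explanatory variables."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return float(1.0 - (ss_res / ss_tot) * (n - 1) / (n - p - 1))


# Hypothetical row coordinates in pixels: marked versus fitted positions.
rows_marked = [100.0, 150.0, 200.0, 250.0, 300.0]
rows_fitted = [101.0, 149.0, 202.0, 249.0, 299.0]
print(rmse(rows_marked, rows_fitted))
print(adjusted_r2(rows_marked, rows_fitted, p=2))
```

Because the row and column coordinates are in pixels, the RMSE returned here is also in pixels, which is what allows it to be read as a typical positional error.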
The results of the row coordinate transformation for the various order combinations are listed in Table 1. The row transformation performed best with a fourth-order SZA and a fifth-order SAA (see the bold numbers in Table 1). In terms of RMSE, the first-order SAA gave the worst results for every SZA order, and by a large margin, which means that the dependence of the data on SAA was not first order. The fifth-order SAA gave the best results, indicating that a fifth-order polynomial in SAA suited the data. The Adjusted R-square was extremely poor for the combination of first-order SZA and first-order SAA, which may be attributed to the difference between a curved surface and a plane in three dimensions. Whenever the SAA was not first order, the Adjusted R-square exceeded 0.9. Although the Adjusted R-square of the fourth-order SZA and fifth-order SAA combination was not unique, it was still the best value.
Table 2 shows the results of the column coordinate transformation. A comprehensive analysis shows that the combination of the fifth-order SZA and the fifth-order SAA was the best (see the bold numbers in Table 2). The RMSE varied widely across the combinations, and the best value can be understood as the fitted solar position lying, on average, 1.505 pixels from the marked position. The Adjusted R-square of every combination reached 0.9 or higher. Therefore, an appropriate column coordinate transformation was fitted.
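The search over order combinations can be illustrated with an ordinary least-squares surface fit. The sketch below is not the fitting procedure of this work. It assumes a polynomial whose terms are SZA^i · SAA^j with i ≤ m, j ≤ n and i + j ≤ max(m, n), uses synthetic data in place of the marked solar positions, and selects the combination with the highest Adjusted R-square. The helper names `design_matrix` and `fit_surface` are hypothetical.

```python
import numpy as np


def design_matrix(sza, saa, m, n):
    """Polynomial terms sza**i * saa**j with i <= m, j <= n, i + j <= max(m, n)."""
    terms = [(i, j) for i in range(m + 1) for j in range(n + 1)
             if i + j <= max(m, n)]
    return np.column_stack([sza**i * saa**j for i, j in terms])


def fit_surface(sza, saa, target, m, n):
    """Least-squares fit; returns (coefficients, RMSE, adjusted R-square)."""
    A = design_matrix(sza, saa, m, n)
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    resid = target - A @ coef
    k = target.size
    p = A.shape[1] - 1                      # explanatory terms, constant excluded
    rmse = np.sqrt(np.mean(resid**2))
    r2 = 1.0 - np.sum(resid**2) / np.sum((target - target.mean())**2)
    adj_r2 = 1.0 - (1.0 - r2) * (k - 1) / (k - p - 1)
    return coef, float(rmse), float(adj_r2)


# Synthetic stand-in for the marked solar positions (NOT the data of this work).
rng = np.random.default_rng(0)
sza = rng.uniform(0.1, 1.4, 300)            # solar zenith angle, radians
saa = rng.uniform(-np.pi, np.pi, 300)       # solar azimuth angle, radians
row = 240.0 + 150.0 * sza * np.cos(saa) + rng.normal(0.0, 1.0, 300)

# Evaluate every combination of first- to fifth-order SZA and SAA.
results = {(m, n): fit_surface(sza, saa, row, m, n)[1:]
           for m in range(1, 6) for n in range(1, 6)}
best = max(results, key=lambda mn: results[mn][1])   # highest adjusted R-square
print(best, results[best])
```

Ranking by Adjusted R-square rather than by raw RMSE penalizes the extra terms of the higher-order surfaces, which matters here because the lower-order term sets are subsets of the fifth-order one.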
The method for locating the sun provides a solution in the case where the information is insufficient and where the equipment cannot be tested experimentally. This situation is quite common. For example, when the data are obtained from a public platform, a research institution, a university, or an enterprise, the software interface of the equipment cannot be examined, and there is also no way to use the OcamCalib toolbox to calibrate the camera. Our method can locate the sun in the image using only the image, time, and device location.
4.2. Parameters for Sky State
Depending on the state of the clouds and sun, the images were divided into four categories, as shown in Figure 8. The first category, overcast sky, is shown in Figure 8a, where the sun was not observed and the sky was generally covered by clouds. The second category, cloudy without sun, is shown in Figure 8b, where there were both clouds and clear skies, but the sun was blocked by the clouds. Figure 8c shows an image with clouds and clear skies that was placed in the cloudy with sun category because the sun was visible. The final category is clear sky, as shown in Figure 8d, where the sun was extremely clear and few clouds appeared.
To determine the sky state, two parameters, namely SI and SD, were proposed (see Section 3.2.2). The SI was obtained by calculating the average intensity of the pixel block at the center of the solar disk. The statistical results are listed in the SI column of Table 3. The SIs of the cloudy without sun and overcast images were lower than those of the other image categories, and their values were quite different. Moreover, the images in the overcast category had a lower SI than the images in the cloudy without sun category. To identify whether an image had sunlight interference, the overcast sky without interference had to be distinguished from the other categories, and the threshold was set to 180.
Given the layered region division, the average saturation (S) was calculated cumulatively, from the first layer alone to the first four layers together, and the ranges are listed in the S1–S4 columns of Table 3. For every category, the saturation mostly either increased or decreased from the first layer to the first-to-fourth layers, but in some images the saturations of the first-to-second and first-to-third layers fluctuated around those of the first and first-to-fourth layers. Of fifty randomly selected images, 90% showed saturations that were either increasing or decreasing, and 10% showed irregular changes. The first layer and the first four layers were therefore sufficient to represent the changes in saturation. For the first layer, the saturations of the images in the cloudy and clear sky categories covered a large range of values because of the sunlight and clouds. The saturation of the clear sky was generally relatively large, but images taken in the morning and evening had low values. For the first-to-fourth layers, the saturation increased sequentially from overcast to clear skies. The difference between the saturation of the first-to-fourth layers and that of the first layer indicates whether the sky state of the first layer is consistent with that of the entire image, and it distinguishes overcast skies from the other categories (see the SD column in Table 3). The SD of overcast skies was less than 0.08, and the threshold was set to 0.1.
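The two thresholds can be combined into a simple decision rule. The sketch below is an interpretation, not the implementation of this work. It assumes that an SI at or above 180 means the sun is visible, that SD is taken as the absolute difference between the saturation of the first-to-fourth layers and that of the first layer, and that an SD at or above 0.1 means sunlight interference. The block half-width `half` and all function names are hypothetical.

```python
import numpy as np

SI_THRESHOLD = 180.0   # intensity threshold stated in the text
SD_THRESHOLD = 0.1     # saturation-difference threshold stated in the text


def solar_intensity(gray, sun_row, sun_col, half=2):
    """Mean intensity of the pixel block centred on the located sun.

    The block size (2 * half + 1 pixels per side) is an assumption.
    """
    r0, r1 = max(sun_row - half, 0), sun_row + half + 1
    c0, c1 = max(sun_col - half, 0), sun_col + half + 1
    return float(gray[r0:r1, c0:c1].mean())


def saturation_difference(s_first, s_first_to_fourth):
    """SD taken as an absolute difference (assumption about the sign)."""
    return abs(s_first_to_fourth - s_first)


def sky_state(si, sd):
    """Flags derived from the two thresholds."""
    return {
        "sun_visible": si >= SI_THRESHOLD,
        "sunlight_interference": sd >= SD_THRESHOLD,
    }


# Hypothetical grayscale image with a bright block where the sun was located.
gray = np.full((11, 11), 100.0)
gray[3:8, 3:8] = 250.0
si = solar_intensity(gray, sun_row=5, sun_col=5)
sd = saturation_difference(s_first=0.15, s_first_to_fourth=0.45)
print(sky_state(si, sd))
```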
4.3. Analysis of Channels and Features
To distinguish clouds from clear skies, the channels were analyzed. The RGB values were extracted separately along the red stripe in Figure 8a, and the distribution of the values representing the clouds and clear skies is presented in Figure 8b. The clear sky pixels were distinctly observed at pixel positions 0–75 and 180–240, and cloud pixels were distributed over positions 75–180. Bright cloud pixels were mainly distributed over 75–130, and dark cloud pixels over 130–180. The two kinds of clouds clearly differed in that the channel values of the bright cloud pixels were larger than those of the dark cloud pixels. It is important to mention that the blue component of the bright cloud pixels was easily saturated because this component was limited to the range of 0–255. Of course, it is also possible that all three channels were saturated. The blue component was greater than the green and red components, and the values of the cloud pixels were higher than those of the clear sky pixels in each color component.
An unexpected but vital clue was contained in the rule of stable centralization of the green component, which was a ubiquitous phenomenon in the images used in our work. For further analysis, the chromaticity coordinates were defined as:

$$r = \frac{R}{R+G+B}, \qquad g = \frac{G}{R+G+B}, \qquad b = \frac{B}{R+G+B}$$

Because r + g + b = 1, this expression reflects that r and b were presented in axisymmetric form with g as the axis, as shown in Figure 8c. The g was stable at the value of 0.33, verifying the rule. If two of the three channel values are known, the remaining channel value can be approximated. This implies that the complete pixel state can be expressed with only two channels. The
r and b values of the cloud pixels were similar to each other, and those of the clear sky pixels were quite different. This phenomenon can be expressed in a more intuitive manner through the distance between r and g.
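The chromaticity coordinates and the two-channel argument can be checked numerically. The following is a minimal sketch with hypothetical pixel values chosen only for illustration; it assumes the usual normalization by R + G + B, and the function name `chromaticity` is not from this work.

```python
import numpy as np


def chromaticity(rgb):
    """Return (r, g, b) = (R, G, B) / (R + G + B) along the last axis."""
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum(axis=-1, keepdims=True)
    safe_total = np.where(total == 0.0, 1.0, total)   # black pixels stay at 0
    return rgb / safe_total


# Hypothetical pixel values, not taken from the images of this work.
pixels = np.array([[70, 120, 170],     # bluish pixel
                   [200, 205, 210]])   # greyish pixel
rgb_chroma = chromaticity(pixels)
r, g, b = rgb_chroma[:, 0], rgb_chroma[:, 1], rgb_chroma[:, 2]

# Since r + g + b = 1, any one coordinate follows from the other two.
b_from_two = 1.0 - r - g
print(r, g, b, b_from_two)
```

In both hypothetical pixels g equals one third, so r and b lie at equal distances on either side of it, which is the axisymmetric behavior described above.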
To date, RBR has been widely used:

$$\mathrm{RBR} = \frac{R}{B} = \frac{r}{b}$$
The r and b of the cloud pixel were close to each other, and the ratio of r to b was large. The r of the clear sky pixel was far from b, and the ratio of r to b was small. The same conclusion can be obtained by observing r and g.
The NRBR is meant to represent the blueness of the sky and improves image contrast and robustness to noise:

$$\mathrm{NRBR} = \frac{B-R}{B+R} = \frac{1-\mathrm{RBR}}{1+\mathrm{RBR}}$$

The NRBR is the normalization of RBR and is a monotonically decreasing function of the RBR. The RBR range was theoretically (0, 255) and the NRBR range was (−1, 1) (Figure 10a). However, the RBR range in the sky image was (0, 1), as was the NRBR range. Figure 10b shows the relationship between the two features. The NRBR was converted to the chromaticity coordinates:

$$\mathrm{NRBR} = \frac{b-r}{b+r} = \frac{1-g-2r}{1-g}$$

When the blue channel was not saturated, g was approximately 0.34 and NRBR was approximately a function of r alone.
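The relationship between the two features can be verified numerically. The sketch below assumes the common definitions RBR = R/B and NRBR = (B − R)/(B + R); the small constant `eps` is a hypothetical guard against division by zero, and the channel values are illustrative.

```python
import numpy as np


def rbr(red, blue, eps=1e-6):
    """Red-blue ratio R / B."""
    red = np.asarray(red, dtype=float)
    blue = np.asarray(blue, dtype=float)
    return red / np.maximum(blue, eps)


def nrbr(red, blue, eps=1e-6):
    """Normalized red-blue ratio (B - R) / (B + R)."""
    red = np.asarray(red, dtype=float)
    blue = np.asarray(blue, dtype=float)
    return (blue - red) / np.maximum(blue + red, eps)


# Hypothetical channel values with increasing red and constant blue.
red = np.array([50.0, 100.0, 200.0])
blue = np.array([200.0, 200.0, 200.0])
ratio = rbr(red, blue)
normalized = nrbr(red, blue)

# NRBR equals (1 - RBR) / (1 + RBR), so it falls as RBR rises.
print(ratio, normalized, (1.0 - ratio) / (1.0 + ratio))
```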
The S feature was designed to identify the saturation of the sky, and its theoretical basis is consistent with the features discussed above. According to the definition of the chromaticity coordinates, S can be derived as a function of r that coincides with the approximation function of the NRBR.
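The agreement between S and the NRBR approximation can be made explicit with a short derivation. It is a sketch under three assumptions that are not stated in this section: S is the HSI saturation, the red component is the smallest of the three, and g is taken as exactly 1/3.

```latex
\begin{align}
\mathrm{NRBR} &= \frac{b-r}{b+r} = \frac{(1-g-r)-r}{1-g} = \frac{1-g-2r}{1-g}
  \;\approx\; \frac{\tfrac{2}{3}-2r}{\tfrac{2}{3}} = 1-3r
  && \left(g \approx \tfrac{1}{3}\right) \\
S &= 1-\frac{3\min(R,G,B)}{R+G+B} = 1-3\min(r,g,b) = 1-3r
  && \left(r=\min(r,g,b)\right)
\end{align}
```

With g = 0.34 instead of 1/3, the first line gives approximately 1 − 3.03r, so the two expressions stay close but are no longer identical.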
To numerically understand the three features, the feature values on the red stripe in Figure 8a are plotted in Figure 9d. The trends of the three features were observed to be similar. The results of RBR and NRBR were fully compatible with the relationship shown in Figure 10b. Compared to RBR, although the nonlinear compression of NRBR reduced noise, it tended to reduce the contrast between clouds and clear skies. The consistency of the chromaticity coordinate function of S with the approximation function of NRBR was verified. The analysis showed that RBR and NRBR measured the difference between r and b, and that S and NRBR could be regarded as measures of r. A measure of the difference between r and g yields similar results. Thus, the ARGD feature was proposed. ARGD was not only consistent with the theory, but also corrected for the saturation of the blue component. The computational cost was minimal, and the adjustable weight was highly suitable for cloud detection considering the interference of sunlight.
4.4. Comparison and Analysis
Our algorithm was compared to the algorithms using RBR with a fixed threshold, S with a fixed threshold, and HYTA, which flexibly uses fixed and minimum cross entropy (MCE) adaptive thresholds on the NRBR feature images.
ARGD adopted a constant threshold of zero, while the fixed thresholds of the other features were statistically estimated as described in [22]. A quarter of the image set was used to calculate the mean μ and the standard deviation δ of the cloud pixels, and the threshold was finally set as T = μ ± 3δ based on the probability theory of the Gaussian distribution, giving T_RBR = 0.36, T_S = 0.639, and T_HYTA = 0.452. The clouds had different properties under the various features. Therefore, the pixels whose feature values were smaller than the threshold were classified as clouds for S and HYTA, whereas cloud pixels had feature values greater than the threshold for RBR.
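The fixed-threshold rule can be sketched as follows. The choice of sign in T = μ ± 3δ is an assumption here: the minus sign is used when clouds lie above the threshold (as for RBR) and the plus sign when they lie below it (as for S and NRBR), so that the threshold sits at the edge of the cloud distribution. The function names and the sample feature values are hypothetical; only the two thresholds come from the text.

```python
import numpy as np


def three_sigma_threshold(cloud_values, clouds_are_high):
    """T = mu - 3*delta if clouds have high feature values, else mu + 3*delta."""
    cloud_values = np.asarray(cloud_values, dtype=float)
    mu = cloud_values.mean()
    delta = cloud_values.std()
    return float(mu - 3.0 * delta if clouds_are_high else mu + 3.0 * delta)


def classify_clouds(feature, threshold, clouds_are_high):
    """Boolean cloud mask for a feature image and a fixed threshold."""
    feature = np.asarray(feature, dtype=float)
    return feature > threshold if clouds_are_high else feature < threshold


T_RBR = 0.36    # threshold reported in the text
T_S = 0.639     # threshold reported in the text

# Hypothetical feature values for two pixels.
rbr_values = np.array([0.20, 0.80])
s_values = np.array([0.20, 0.80])
print(classify_clouds(rbr_values, T_RBR, clouds_are_high=True))
print(classify_clouds(s_values, T_S, clouds_are_high=False))
```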
The numerical analysis of RBR, S, NRBR, and ARGD and the image comparison of the corresponding algorithms are illustrated under different sky conditions (Figure 11). A linear histogram was plotted to analyze in detail how the pixels on the red line in the sky image behaved under the different features. For better observation, ARGD was displayed in normalized form.
Figure 11a shows the experimental results for a clear sky. In the linear histogram, pixels 0 to 58 were identified as clouds by RBR with the fixed threshold, and pixels 0 to 49 by S with the fixed threshold. Because the influence of sunlight made the standard deviation of the sky large, the HYTA algorithm applied the MCE adaptive threshold (0.529) to the NRBR and identified pixels 0 to 67 as clouds. The values of ARGD were negative along the entire red line, so the clear-sky pixels were correctly identified at the zero threshold. The misrecognition of clear-sky pixels as clouds by the other features was attributed to the interference of sunlight. Two parameters, namely SI and SD, were calculated to determine the sky state. The SI indicates that the brightness of the sun was high, so the sun appeared in the image. The SD indicates that the sky saturation was unevenly distributed, so the image was disturbed by sunlight. In this case, different layers had different weights. The detection results of the four algorithms are also shown: RBR detected more clouds than S but fewer than HYTA, while ARGD removed most of the sunlight interference and showed excellent performance.
The image analyzed in Figure 11b contained both clouds and clear skies and suffered from interference by sunlight. For the RBR in the linear histogram, cloud pixels were identified over the ranges of 0–64, 66–132, and 134–224, with the rest of the pixels identified as clear skies. For S, cloud pixels were identified over the ranges of 0–64, 66–124, and 134–217. HYTA applied an MCE adaptive threshold of 0.6 to the NRBR, and pixels 0–228 and 232–237 were identified as clouds. For ARGD, clouds were identified over 10–63 and 136–217, which was consistent with the actual sky conditions. Misrecognition due to the interference of sunlight was still present. The parameters of the sky state indicate that the sun appeared in the image and that the image had sunlight interference, so the weights were determined accordingly. The detection results of the four algorithms coincide with the cloud distribution of each feature in the linear histogram. The ARGD result was clearly better than those of the other features when the image was affected by sunlight, which indicates that ARGD successfully reduced solar interference.
The overcast sky is analyzed in Figure 11c. All four algorithms used the fixed threshold, and the pixels on the entire red line were correctly identified as clouds. The SI indicates that the brightness of the sun was low, so the sun did not appear in the image. The SD indicates that the sky saturation was evenly distributed, so the image was not disturbed by sunlight. In this case, the weights of the layers were the same. The detection results of the algorithms correctly reflect the actual situation of the sky. Overall, ARGD performed well under the three sky conditions and outperformed the other algorithms when the image was affected by sunlight.
A classic confusion matrix evaluation method [24] was used to assess the accuracy of detection. The four statistics of the matrix are shown in Table 4. True positive (TP) and true negative (TN) represent the correct detection of clouds and clear skies, respectively. False negative (FN) and false positive (FP) represent incorrect detection. Several classical evaluation indices, such as accuracy, recall, and precision, can be obtained from these four statistics. Accuracy, which is defined as the percentage of correct detections, is one of the most widely used evaluation indicators and is used in many studies as the sole criterion for evaluating algorithm performance. In this work, recall can be interpreted as the proportion of actual cloud pixels that were correctly detected, and precision is the fraction of the pixels predicted as clouds that are in fact cloud pixels. Although a higher accuracy indicates better performance of the algorithm, it should be accompanied by balanced values of the other criteria.
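The three indices follow directly from the four statistics. The sketch below uses the standard definitions, with clouds as the positive class; the masks are hypothetical ten-pixel examples and the function names are not from this work.

```python
import numpy as np


def confusion_counts(predicted, truth):
    """Return (TP, TN, FP, FN) for boolean cloud masks (True = cloud)."""
    predicted = np.asarray(predicted, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(predicted & truth))
    tn = int(np.sum(~predicted & ~truth))
    fp = int(np.sum(predicted & ~truth))
    fn = int(np.sum(~predicted & truth))
    return tp, tn, fp, fn


def scores(tp, tn, fp, fn):
    """Accuracy, recall, and precision from the four statistics."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
    }


# Hypothetical ground truth and detection result for ten pixels.
truth = np.array([1, 1, 1, 1, 0, 0, 0, 0, 1, 0], dtype=bool)
predicted = np.array([1, 1, 1, 0, 0, 0, 1, 0, 1, 0], dtype=bool)
counts = confusion_counts(predicted, truth)
print(counts, scores(*counts))
```

Reporting recall and precision alongside accuracy exposes the case where a detector labels nearly everything as cloud: recall stays high while precision drops.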
The confusion matrix of ARGD is listed in Table 5, with the statistics expressed as percentages. The probability that clear skies were misclassified as clouds was as low as 1.34%, suggesting that the algorithm reduced the interference of sunlight. The cases in which clouds were misclassified as clear skies were attributed to dark clouds in the area affected by sunlight. Overall, the correct classification rate of 98.02% is quite satisfactory, demonstrating the effectiveness of the algorithm.
The evaluation results for the four algorithms are presented as a histogram (see Figure 12). The recall of RBR was 99.42%, the highest among the four features, indicating that the majority of actual clouds were identified. The precision indicates that 9.17% of the pixels determined to be clouds were in fact clear skies, and the accuracy was 92.98%, ranking third among all features. The accuracy and precision of S were 94.04% and 92.29%, respectively, and the recall was 6.97% higher than the precision. The three statistics for S were the second highest among the four algorithms.
The accuracy of HYTA was the lowest at 91.29%. The recall of 97.97% was not low, but the precision of 89.78% was not good, so a tradeoff between these criteria was observed. In this algorithm, the thresholding method depends on whether the image is unimodal or bimodal. The images of cumuliform and cirriform clouds were bimodal and were segmented with adaptive thresholds, and the images of stratiform clouds and clear sky were unimodal and were segmented with fixed thresholds. Unlike in the work on HYTA, the images in this paper were not image patches containing only a single cloud type. Clouds of different types inevitably appeared in an image to be processed, so the detection accuracy decreased. The sky within a small area was uniform in color, but the clear sky as a whole was affected by the uneven sunlight, so the unimodal result was detected. This is one of the reasons why HYTA performed poorly in this case. In addition to the interference of sunlight, the presence of multiple peaks also gave rise to errors in the MCE threshold.
Compared to the other algorithms, ARGD had the highest accuracy of 98.02%, and its recall and precision were the most balanced. Although the recall of 99.02% was the lowest, it was not much different from the recall values of the other algorithms. Overall, ARGD obtained outstanding detection results. This performance was due to the weakening of the effect of sunlight, demonstrating that interference by sunlight is the difficulty that must be overcome to improve accuracy. In summary, ARGD reduced the interference of sunlight and achieved better performance. Therefore, it is the most promising approach.