1. Introduction
Cloudiness is one of the most important parameters in astroclimatological studies [1,2]. Essentially, an astronomical site needs cloud-free skies to operate successfully [3]. Therefore, any potential astronomical site requires long-term observations of cloud distribution so that variations in sky conditions can be monitored over the years.
It is important to understand that astronomers and meteorologists observe sky conditions with quite different aims. Astronomers generally estimate sky cloudiness for astronomical observation purposes, whereas meteorologists monitor sky conditions to observe and study the physics, chemistry and dynamics of the Earth's atmosphere [4]. In addition, cloud coverage plays an important role in determining certain meteorological parameters such as solar incidence [5,6,7,8]. Clouds diffuse the direct solar radiation from the Sun, which affects the performance of solar-based technologies on the ground [9].
Until the middle of the 20th century, daily observation of cloud distribution was carried out with the naked eye. Astronomers would stay outside the observatory to evaluate cloudiness and determine sky conditions [10]. Meteorologists, meanwhile, would measure cloud distribution by estimating the fraction of the sky covered by clouds, expressed in oktas. The okta scale ranges from zero to eight, where zero represents a clear sky and eight represents an overcast sky; each okta represents one-eighth of the hemispherical sky. However, this direct observation approach has several disadvantages. Since the observation is performed with the naked eye, the measurement is highly subjective, depending strongly on the experience of the observer. For example, nighttime cloud measurements by two observers may differ because clouds are most difficult to identify with the naked eye during that period [11]. The location and method of observation also affect the readings.
Owing to the rapid development of computers and digital imaging in the second half of the 20th century, cloud distribution measurements improved greatly. One improvement was the use of a wide-angle imager to capture digital imagery of the sky. The first successful observation using an all-sky imager was performed by Koehler and his team in 1988 [12]. Following that, more sophisticated versions of the all-sky imager with higher-resolution image output were developed; the Whole Sky Imager series (WSIs) and the Total Sky Imager (TSI) are two examples dedicated to cloud distribution measurement [13]. The use of digital imagery for cloud coverage measurement at ground-based stations and observatories later led to space-based measurement via satellites [5,6]. On some occasions, cloud coverage can also be determined through clear-sky diffuse illuminance in solar radiation modeling [14].
As the imagers changed with time, algorithms for cloud detection also evolved. Most cloud detection methods used today are based on the red/blue ratio of the digital color channels. In a digital color image, a clear blue sky shows a higher blue channel intensity than the green and red channels, so the sky appears blue; clouds, however, appear white or grayish since they scatter the red, green and blue channels similarly [15]. The red/blue color ratio has been widely used in previous studies [12,15,16,17]. However, the color ratio method is unable to detect thick clouds. This limitation was addressed by Heinle et al., who proposed a new cloud detection method called the Red-Blue (R-B) Color Difference [18]. Later, many cloud detection methods with better results were developed, such as the multicolor criterion method, Green Channel Background Subtraction Adaptive Threshold (GBSAT), the Markov Random Fields method and the application of Euclidean geometric distance, although these require high computational power and time [15,19,20].
In this paper, we propose a new color difference method called the Blue-Green (B-G) Color Difference. The objective of this study was to test the effectiveness of this simple yet effective cloud detection method and to compare it with the commonly used R-B Color Difference method. Our analysis focused on the daytime sky only, as the nighttime sky has different characteristics and requires a different approach.
2. Instrumentation
This study utilized images captured by the all-sky imager at PERMATApintar Observatory, located in the state of Selangor, Malaysia (2°55′2.32″ N, 101°47′17.44″ E). In this section, we present details of the all-sky imager, the software and the associated images.
The all-sky imager used in this study was developed by Moonglow Technologies. This imager was initially installed to monitor sky conditions from within the observatory. However, since it is capable of capturing images of the sky continuously, we further utilized its images for the cloud detection study.
Figure 1 shows the all-sky imager at the PERMATApintar Observatory and Table 1 provides details of its specifications.
The all-sky imager is controlled using the ASC Uploader software, also provided by Moonglow Technologies. Figure 2 shows a snapshot of the software. All settings of the imager, including the exposure time, are controlled automatically by the software. The archiving interval (i.e., the time interval between successive images) can be set, for example, to 1 min, 15 min or even an hour or more.
The outputs of the all-sky imager are in 24-bit JPEG format at a resolution of 576 × 720 pixels. JPEG is not a preferred format because of the data loss caused by compression; however, we still utilized JPEG images in this study since the output format of the imager cannot be changed. The all-sky image database at PERMATApintar Observatory consists of images taken since April 2013. The archiving interval was initially set on an hourly basis, meaning that 24 all-sky images were taken and stored every day. An example of an all-sky image, taken on 17 May 2013 at 1900 LT, is shown in Figure 3a.
3. Pre-Processing
This section covers the required initial processes before any cloud detection can be performed.
The image processing algorithm treated each all-sky image as a large matrix and analyzed it pixel by pixel. The algorithm scanned every pixel in the image regardless of what it represented (sky, cloud, tree or building). Thus, the image had to be masked first so that the algorithm would read pixel values only from the relevant pixels.
The algorithm can be divided into three main parts: color channel splitting, masking and image processing. The splitting and masking parts are explained in the following subsections, while the image processing part is discussed in detail in Section 4.
3.1. Color Channel Splitting
The raw all-sky images were in RGB color. Before any image processing could be performed, each raw image had to be split into its red, green and blue color channels. The algorithm read the RGB color image as a 3D matrix in which each layer represented red, green and blue, respectively. It returned the image data as an m-by-n-by-3 array, where 'm' and 'n' represent the rows and columns of a matrix, respectively, and the third dimension indexes the three matrices: '1' for red, '2' for green and '3' for blue. By assigning each matrix to its respective channel, three grayscale versions of the raw image, one each for the red, green and blue channels, were obtained.
Figure 3 shows an example of a raw all-sky image taken from PERMATApintar Observatory and its color channels.
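As an illustrative sketch of this step, assuming a Python workflow with NumPy and Pillow (the paper does not name its implementation, and the file name here is hypothetical):

import numpy as np
from PIL import Image

# Load the 24-bit JPEG all-sky image as an m-by-n-by-3 array.
# A signed dtype avoids overflow in the later channel subtractions.
img = np.asarray(Image.open("allsky_20130517_1900.jpg"), dtype=np.int16)

# Axis 2 indexes the channels: 0 = red, 1 = green, 2 = blue.
# Each slice is a grayscale version of the raw image for that channel.
red   = img[:, :, 0]
green = img[:, :, 1]
blue  = img[:, :, 2]

Note that NumPy indexes the channels from 0 rather than from 1 as in the matrix description above.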
3.2. Masking
Before cloud detection could be performed on the all-sky images, the images had to be processed to remove unnecessary pixel value readings. These readings were mostly due to the Sun's glare reflected by the all-sky imager's acrylic dome (refer to Figure 4). Although most of these pixels were located outside the region of interest of the image, their existence could still affect the cloud coverage percentage since the algorithm went through every pixel in the image. Moreover, since the field of view of the all-sky imager is hemispherical, the image inevitably includes horizon objects such as buildings, trees, hills or mountains, and the ground. The percentage of cloud coverage would be affected if their effect were not eliminated from the analysis. Thus, to overcome these problems, we applied a masking technique to the raw images to produce clean images before carrying out further analysis. It is important to point out that the clean image in our study does not include the removal of solar glare within the sky region, owing to the lack of a solar obscuration in our all-sky system. Nevertheless, the clean image is still acceptable and valid, as used in a previous study such as that of Heinle et al. [18]. However, the clear sky percentage has to be revised specifically for each all-sky system and site.
Generally, masking is a method of obscuring certain undesirable regions, thus allowing us to deal only with the pixels inside the chosen region of interest (ROI). In our study, the mask was a binary image of the same size as the raw all-sky image, i.e., 576 × 720 pixels. Two stages of masking were involved. The first stage involved creating a basic mask for each type of unnecessary pixel; two basic masks were created based on the problems identified in the raw images. The first basic mask, the Non-ROI mask, masked the whole image except the ROI and eliminated all pixel values caused by sunlight reflected outside the ROI. The second, the Horizon mask, handled the pixels within the ROI that did not represent the sky region. In this stage, a pixel value of one represented the ROI while a pixel value of zero represented the Non-ROI and horizon objects. The second stage produced a mastermask by combining the Non-ROI and Horizon masks into one mask. The mastermask was then applied to the raw all-sky image to produce a clean image.
We applied the mastermask by multiplying it with the raw image. Since each ROI pixel value was multiplied by one, its value remained the same; the rest became zero after being multiplied by zero. As a result, we obtained pixel values only from the pixels inside the ROI.
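A minimal sketch of this multiplication step, continuing the assumed Python workflow above (the mask file name is hypothetical, and how the mastermask was originally drawn is not specified in the paper):

# The mastermask is a binary image of the same height and width as the
# raw image: 1 inside the ROI, 0 for Non-ROI and horizon objects.
mastermask = np.asarray(Image.open("mastermask.png").convert("1"),
                        dtype=np.int16)

# Broadcasting the mask across the three color channels keeps ROI
# pixel values unchanged (x * 1) and zeroes out everything else (x * 0).
clean = img * mastermask[:, :, np.newaxis]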
4. Cloud Detection Method
The cloud detection method used in this study is based on the Color Difference method. Instead of using the conventional R-B Color Difference, we explored cloud detection using the B-G Color Difference. The method can be divided into two main stages: threshold determination and cloud coverage measurement.
4.1. All-Sky Images
A total of 15 all-sky images were chosen, consisting of five images for each sky condition (clear, partially cloudy and overcast) within the period May 2013 to May 2016. All of the images went through the pre-processing phase before thresholding or cloud coverage measurement was applied.
4.2. Threshold
In order to obtain the B-G threshold value, we compared pixel values from 30 raw all-sky images. For each all-sky image, ten pixels were randomly chosen from the sky regions. The pixel values of the red, green and blue color channels were extracted for each pixel, and the difference between the blue and green channel values was calculated for each of the randomly selected pixels. The threshold value was then estimated as the average of these difference values over all of the selected pixels.
We also had to obtain the R-B threshold value for comparison purposes, using the same procedure as for the B-G threshold. However, to simplify the calculation and the algorithm, we computed the R-B Color Difference as the blue pixel value minus the red pixel value (i.e., B-R), so that, as with B-G, sky pixels yield positive values.
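The threshold estimation can be sketched as follows (an illustration only: the helper function, the seeding and the uniform random sampling are our assumptions, since the paper does not state how its random sky pixels were drawn):

def estimate_threshold(images, n_pixels=10, seed=0):
    """Average color difference over randomly sampled pixels.

    `images` is a list of m-by-n-by-3 integer arrays. In the paper the
    ten pixels per image are drawn from known sky regions; here they
    are drawn uniformly over the image for brevity.
    """
    rng = np.random.default_rng(seed)
    diffs = []
    for img in images:
        h, w, _ = img.shape
        rows = rng.integers(0, h, n_pixels)
        cols = rng.integers(0, w, n_pixels)
        for r, c in zip(rows, cols):
            diffs.append(int(img[r, c, 2]) - int(img[r, c, 1]))  # B - G
    return float(np.mean(diffs))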
4.3. Cloud Coverage
By using the B-G threshold value, we were able to determine what each pixel represented: a pixel whose B-G value is higher than the threshold is a sky-pixel; otherwise, it is a cloud-pixel. Cloud coverage can then be represented by the Cloud Coverage Ratio (CCR) given in Equation (1):

CCR = N_cloud / (N_cloud + N_sky),   (1)

where N_cloud is the number of cloud-pixels and N_sky is the number of sky-pixels.
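Continuing the sketch above, the per-pixel classification and the CCR of Equation (1) might look as follows (the zero-valued-pixel test is our assumption about how masked pixels are skipped; the default threshold of 30 is taken from Section 5):

def cloud_coverage_ratio(clean, threshold=30):
    """Classify ROI pixels by their B-G difference and return the CCR."""
    # Masked (Non-ROI/horizon) pixels were zeroed by the mastermask,
    # so any pixel whose channels are all zero is excluded here.
    roi = clean.sum(axis=2) > 0
    bg = clean[:, :, 2].astype(int) - clean[:, :, 1].astype(int)  # B - G
    n_sky = np.count_nonzero(roi & (bg > threshold))
    n_cloud = np.count_nonzero(roi & (bg <= threshold))
    return n_cloud / (n_cloud + n_sky)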
5. Results and Discussion
Based on the comparison of B-G values between sky-pixels and cloud-pixels done in a previous study, we chose B-G = 30 as the threshold value for the current study [22]. Any pixel with a B-G value higher than 30 was therefore considered a sky-pixel, whereas pixels with lower values were considered cloud-pixels. We used the same method to determine the threshold value for R-B, which was also 30. This value is acceptable since it is within the optimal range stated by Heinle et al. [18]. Determining this value was necessary for the comparison between B-G and R-B, since the threshold value can differ depending on the location.
Figure 5 shows the comparison between R-B and B-G Color Difference for clear, partially cloudy and overcast sky conditions. Overall, both Color Difference methods detect thick clouds, as seen in the images of the partially cloudy and overcast sky conditions. However, the new B-G Color Difference method appears to detect thin, high clouds better than the R-B Color Difference. In the clear and partially cloudy images, a number of pixels or regions covered by a thin layer of cloud were classified by R-B Color Difference as sky-pixels, whereas B-G Color Difference detected the thin cloud layer clearly. The ability to detect even a thin cloud layer is crucial in astronomical observations, because a thin layer of cloud can severely affect the outcome of observations such as photometry, especially during long-exposure imaging.
In terms of cloud coverage percentage, the B-G Color Difference mostly gave higher percentages of cloud-covered sky, as seen in the comparison between B-G and R-B Color Difference presented in Table 2. The percentage differences between them are 25.76%, 20.92% and 0.02% for the clear, partially cloudy and overcast sky conditions, respectively. The large differences for the clear and partially cloudy conditions reflect a wide gap in cloud-pixel counts between the two methods caused by thin cloud detection: they indicate the ability of the B-G Color Difference method to detect both thick and thin clouds, which increases the number of detected cloud-pixels significantly, and conversely the failure of the R-B Color Difference to detect thinner clouds. For the overcast sky condition, both methods were able to detect the clouds equally well, with a very small percentage difference between the two readings.
To further check the effectiveness of the B-G Color Difference in detecting thin clouds, we compared the cloud detection of both methods using cropped all-sky images of a clear blue sky and of a thin cloudy region. The resulting cloud coverage percentages are shown in Table 3. For the clear-sky cropped image, both methods show zero percent cloud coverage; for the thin-cloud cropped images, however, the B-G method detected more thin cloud than the R-B method.
One problem needs to be highlighted in this study: the lack of a solar obscuration over the all-sky imager to block direct, blinding sunlight. Without a solar obscuration, a blooming effect occurs in the image, especially in images taken under clear sky conditions. Figure 6a shows an example of this effect, while Figure 6b shows a solar obscuration blocking the glaring sunlight. It was very difficult to identify the pixels of the overflowing limb caused by the blooming effect, since their Color Difference mostly approached zero and they would thus be considered cloud-pixels. The all-sky imager installed at PERMATApintar Observatory is not equipped with a solar obscuration, as the system was not originally intended for this study. One possible solution to the problem is to introduce another pixel characterization criterion: the average of the R, G and B channels of a pixel. Theoretically, a thick cloud should have a darker gray color as it absorbs sunlight, resulting in a lower average value. However, other conditions produce almost the same attributes as the blooming effect, such as the bright white sides of cumulus clouds due to sunlight reflection and the bright light scattering by cirrus or cirrostratus clouds during sunrise, also known as the whitening effect. Hence, future study is needed to differentiate and characterize these pixels in order to increase the effectiveness of this type of cloud detection method.
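As a purely illustrative sketch of this proposed criterion (the combined rule and both cutoff values below are hypothetical placeholders chosen for demonstration; the paper does not define them):

def classify_pixel(pixel, bg_threshold=30, brightness_cutoff=230):
    """Combined rule: a pixel with a low B-G difference is counted as
    cloud only if its mean R, G, B brightness stays below a blooming
    cutoff; otherwise it is flagged as possible solar blooming.
    The cutoff of 230 is an assumed placeholder, not from the paper."""
    r, g, b = (int(v) for v in pixel)
    if b - g > bg_threshold:
        return "sky"
    if (r + g + b) / 3 >= brightness_cutoff:
        return "possible blooming"
    return "cloud"

Such a rule would still misclassify the bright cumulus and whitening-effect pixels described above, which is precisely why further characterization work is needed.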