Daytime Cloud Detection Algorithm Based on a Multitemporal Dataset for GK-2A Imagery

Cloud detection is an essential and important process in remote sensing when surface information is required for various fields. For this reason, we developed a daytime cloud detection algorithm for GEOstationary KOrea Multi-Purpose SATellite 2A (GEO-KOMPSAT-2A, GK-2A) imagery. For each pixel, the filtering technique using angular variance, which denotes the change in top of atmosphere (TOA) reflectance over time, was applied, and filtering technique by using the minimum TOA reflectance was used to remove remaining cloud pixels. Furthermore, near-infrared (NIR) and normalized difference vegetation index (NDVI) images were applied with dynamic thresholds to improve the accuracy of the cloud detection results. The quantitative results showed that the overall accuracy of proposed cloud detection was 0.88 and 0.92 with Visible Infrared Imaging Radiometer Suite (VIIRS) and Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO), respectively, and indicated that the proposed algorithm has good performance in detecting clouds.


Introduction
Clouds are visible masses of condensed water vapor, ice crystals, or other particles in the atmosphere [1]. Clouds are efficient modulators of Earth's radiative budget and control many other aspects of the climate system [2] because clouds reflect solar radiation back to space and restrict the emission of thermal radiation from Earth [3]. Clouds are affected by the presence of aerosols and modify the atmospheric composition in several ways, including the depletion of ozone when they form in the polar stratosphere [4]. Therefore, the Global Climate Observing System (GCOS) has selected cloud properties, which are considered suitable for global climate observation and have a significant impact on the needs of the United Nations Framework Convention on Climate Change (UNFCCC) and other stakeholders, as Essential Climate Variables (ECVs) [4]. On the other hand, because roughly 50% of the surface is covered by clouds at any time, cloud detection is an essential and important process of remote sensing data when surface information is required for various fields, e.g., crop and drought monitoring, land use and land cover (LULC) classification, surface temperature retrieval, and snow-covered area retrieval [5][6][7][8]. For these reasons, several institutions around the world are generating cloud products using satellite images.
For cloud detection using satellite data, several types of retrieval algorithms, including spectral-based, multitemporal image-based, and machine learning-based methods, have been developed [5,[8][9][10][11]. Mahajan and Fataniya [1] provided algorithms for cloud detection from satellite imagery classified by the existence of clouds, cloud types, snow/ice detection, and cloud/cloud shadows from 2004 to 2018.
Spectral-based methods use the different signals of individual spectral bands. Specifically, clouds reflect more shortwave radiation and are colder than the surface, as measured by thermal infrared (TIR) [12]. Spectral-based methods mainly use thresholds of individual et al. [10] developed the cloud segmentation CNN (CS-CNN) with Spinning Enhanced Visible and InfraRed Imager (SEVIRI) on Meteosat Second Generation (MSG) satellite. The CS-CNN was based on the U-Net architecture in terms of the composition of a sequence of downsampling layers, followed by another sequence of upsampling layers. The true (reference) dataset was the CLoud property dAtAset using SEVIRI-Edition 2 (CLAAS-2), and this dataset was generated by using spectral-based method.
The National Meteorological Satellite Center (NMSC) of the Korea Meteorological Administration (KMA) developed a cloud detection algorithm using GEOstationary KOrea Multi-Purpose SATellite 2A (GEO-KOMPSAT-2A, GK-2A), which was launched on 4 December 2018. The cloud detection algorithm developed by NMSC was mainly a spectral-based method, which was calculated by comparison with the results of Radiative Transfer for TIROS Operational Vertical Sounder (RTTOV) and the fine adjustment process by experts [16]. In addition to the spectral-based method, tests of spatial uniformity, inversion layer correction, and usage of top of atmosphere (TOA) reflectance in clear-sky conditions were executed and used to improve the accuracy [16]. However, in the case of TOA reflectance under clear-sky conditions, the probability of acquiring the minimum TOA reflectance under clear-sky conditions is low, especially in the summer season, because there are many periods of precipitation [16]. Furthermore, thresholds are sometimes unstable because of the effect of changes in atmospheric components and the composition of complex land surfaces.
Therefore, to improve the performance of cloud detection in the daytime, we propose a multitemporal image-based cloud detection algorithm by combining filtering techniques, which consist of angular variation and minimum TOA reflectance, and dynamic thresholds with near-infrared (NIR) and normalized difference vegetation index (NDVI) images. This study is organized as follows: Section 2 presents the input, comparison, and validation data used in this study, Section 3 describes the methodology of the cloud mask using GK-2A multitemporal images, and Section 4 provides the validation results (against Visible Infrared Imaging Radiometer Suite (VIIRS) and Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) cloud products with the GK-2A spectral-based algorithm made by NMSC). Section 5 presents a discussion of the proposed algorithm, and Section 6 presents a summary and conclusion.

GK-2A
GK-2A is the second geostationary meteorological satellite for meteorological missions and space weather observation tasks in the Rep. of Korea. GK-2A was launched on 4 December 2018. GK-2A has a high-performance meteorological sensor named the Advanced Meteorological Imager (AMI). The AMI sensor has four visible channels, two shortwave infrared channels, four midwave infrared channels, and six thermal infrared channels. In addition, the AMI has a spatial resolution from 0.5 km to 2 km, and it captures imagery every 2 min around the Korean Peninsula and every 10 min for the full disk (FD) [15]. In this study, we used visible and near-infrared (VNIR) channels observed in East Asia that were clipped from the FD dataset, as shown in Figure 1.

GK-2A Cloud Mask
The GK-2A cloud mask developed by NMSC, as shown in Figure 2a, utilizes the conventional spectral-based algorithm by using a single channel and the difference between channels. A single channel-based threshold detects thick clouds by using shortwave channels (0.6, 0.8, 1.3, and 1.6 µm) and a longwave channel (10.4 µm). In the case of shortwave channels, the reflectance in clear-sky conditions is derived from the minimum value of the stacked dataset for 15 days. The clouds can be detected with a defined threshold calculated by the difference between the target TOA reflectance and clear-sky reflectance. In the case of a difference between channels, BTD is based on spectral characteristics by combining the channels from 3.9 µm to 13.3 µm [16], e.g., the BTD between 10.4 µm and 12.3 µm, which is organized by window channels, is for cirrus cloud detection [16]. When a cirrus cloud exists in the pixel, the value of BTD between 10.4 µm and 12.3 µm is higher than when there is a clear sky or thick cloud. Similar to the BTD between 10.4 µm and 12.3 µm, eight threshold tests using BTD were executed and used. Thresholds are defined by using RTTOV, and the analyst adjusts the threshold by checking the result [16]. In addition, a spatial homogeneity test is also used, and discontinuities are resolved considering temporal variability [16]. The spatial homogeneity test is based on the standard deviation in 3 × 3 pixels [16]. If 3 × 3 pixels are filled with only clouds or only clear-sky observations, the standard deviation is lower. However, when clouds and clear-sky observations are mixed in pixels, the standard deviation is higher. Therefore, the spatial homogeneity test is useful for checking cloud edges [16]. The GK-2A cloud mask has a problem, in which the probability of acquiring the minimum TOA reflectance under clear-sky conditions is low, especially in the summer season, because there are many periods of precipitation [16]. For this reason, we used the GK-2A cloud mask as a comparison dataset to demonstrate that the proposed algorithm improves the accuracy by solving this problem. The GK-2A cloud mask is available via NMSC webpage [30].

Suomi-NPP
The Suomi National Polar-orbiting Partnership (Suomi-NPP) was launched in October 2011 to demonstrate the performance of environmental data records and to provide continuity for data series observations initiated by NASA's EOS series missions (Terra, Aqua and Aura) [31]. The VIIRS equipped in Suomi-NPP has sixteen moderate resolution bands, five imaging bands, and a day-night band from 0.41 µm to 12.5 µm to observe the environment [15]. We used the VIIRS cloud mask (VCM) 'CLDMSK_L2_VIIRS_SNPP' as the validation dataset, which is available on the Earthdata webpage [32]. The VCM algorithm consists of various cloud detection tests grouped by surface type and solar illumination condition [33]. Each cloud detection test employs three thresholds: high cloud-free confidence, low cloud-free confidence, and a midpoint threshold [33]. Assuming three tests are applied to pixels, the overall cloud confidence is based on the cube root of the product of the probabilities for these three tests [33]. Based on this overall cloud-free probability, the VCM categorizes a pixel as confidently cloud, probably cloud, probably clear, or confidently clear. The VCM has a spatial resolution of 750 m and a temporal resolution of twice a day. Only daytime data were used among the provided VCM, and the spatial resolution was resampled to 2 km, which is in the same resolution as GK-2A cloud detection data, with nearest neighbor resampling method.

CALIPSO
CALIPSO was launched in April 2006 to observe the global distribution and properties of aerosols and clouds and is now flying in formation with the A-train constellation of satellites [34,35]. CALIPSO carries an Imaging Infrared Radiometer (IIR), a moderate spatial resolution Wide Field of-view Camera (WFC), and a Cloud-Aerosol LIdar with Orthogonal Polarization (CALIOP). Among the sensors, CALIOP has a pulse width of 20 ns and a pulse repetition rate of 20.25 Hz using a unique Nd:YAG laser pulse. CALIPSO detects the backscattered signal at wavelengths of 532 nm and 1064 nm [36]. In this study, we used cloud layer data with 1 km spatial resolution (Figure 2b), which is named 'CAL_LID_L2_01kmCLay-Standard, version 4.20 and is available on the Earthdata webpage [32]. In the dataset, cloud pixels in feature classification flags were used to validate the proposed algorithm. We made a collocation dataset between the GK-2A and CALIPSO cloud datasets via a K-dimension tree (KDTree) and nearest neighbor search (NNS) within 0.03 • , as shown in Figure 2b. The red line in Figure 2b is an example of the CALIPSO dataset.

Methodology
Owing to the high temporal resolution of the GK-2A, there is a large opportunity to obtain clear-sky reflectance even if there is an influx of clouds. In addition, variations in TOA reflectance due to cloud inflow and the occurrence of convective clouds can be identified. Since these advantages are useful for cloud masks, we developed a four-step cloud detection algorithm, as shown in Figure 3.
We primarily used 0.86 µm in the processes of filtering techniques and dynamic thresholds because 0.86 µm is widely used in various daytime cloud detection algorithms [13,16,37,38]. The other channels of 0.64, 1.38, and 1.6 µm were additionally used in the dynamic threshold section to improve the accuracy of the cloud mask. In case of 0.64 µm, we used this channel to calculate NDVI. All the imagery was resampled to 2 km using linear spline interpolation.

Preprocessing
The digital number (DN) of the GK-2A observation dataset was converted to TOA reflectance to facilitate identification of the data in each pixel and to minimize the effect of the solar zenith angle.
where Re f denotes the TOA reflectance, i and j are the pixel coordinates, λ is the wavelength, d is the distance between Earth and Sun in astronomical units, a and b are the gain and offset to convert DN to radiance, respectively, ESUN is the mean solar exoatmospheric irradiance, and SZA is the solar zenith angle. ESUN can be calculated by integrating the solar spectral irradiance IRR and the GK-2A spectral response function spectral response function (SRF). The solar spectrum to derive solar spectral irradiance (Wm −2 µm −1 ) is acquired from the National Renewable Energy Laboratory (NREL) webpage [39][40][41]. We used the synthetic Gueymard spectrum, which covers the spectral region from 0.4 µm to 1.7 µm in 1 nm steps and from 1.7 µm to 4 µm in 5 nm steps [40]. Because a difference in resolution between the solar spectrum and GK-2A SRF exists, we used linear interpolation.
The ESUN λ values of 0.6, 0.8, 1.3, and 1.6 µm were calculated and are shown in Table 1.

Filtering Technique by Angular Variation
Reflectance represents the ratio of reflected radiance to the incident flux at a specific surface. Reflectance under clear-sky conditions is almost unchanged if the effect of the change in solar zenith angle is corrected and the effects of atmospheric components (scattering, absorption) are weak. As shown in Figure 4a, red and blue lines (dot) indicate the variation in reflectance observed under clear-sky conditions, and orange and cyan lines (square) indicate that clouds are approaching or flowing away from a specific pixel. When clouds approach a specific pixel, the reflectance of the pixel gradually increases. By using these characteristics, we propose the time-variant filtering technique by using angular variation to detect and mask cloud pixels. If there is a pixel located in a clear sky, change in reflectance over time should be approximately 45 degrees. However, since there are changes in reflectance due to variations in surface characteristics and atmospheric components that cause the scattering and absorption of radiance, the angle of reflectance varies within a certain range. As shown in Figure 4b, the upper figure shows the variation in angle in clear-sky reflectance, which is located in specific pixels of land and sea. The lower figure presents the change in angle caused by clouds. Regular angle variation can be checked in clear-sky observations through Figure 4b. The angle variation is more stable on land than on sea. Therefore, we set the threshold from 44 • to 46 • for both land and sea, and the range of thresholds was selected by finding the angular change in several pixels in clear-sky observations. The filtering technique is inspired by Graham's scan algorithm, which is a method to find the convex hull of finite points on a plane.
where θ denotes an angle, t is the acquisition time of imagery, ∆t is the time step, and we set 10 min as the time step.

Filtering Technique Using Minimum TOA Reflectance
If clouds continue to flow into the area, the filtering technique of angular variation cannot detect and mask the clouds. This is because high reflectance in the pixel is maintained. In other words, even though the pixel is located in clouds, the angle obtained as the time changes remains within 44 • to 46 • , similar to the clear-sky observation. Therefore, we solve this problem by using minimum TOA reflectance [42,43]. Specifically, we calculated the minimum TOA reflectance with a stacked dataset for seven days, and we calculated the difference between the target and minimum TOA reflectance. The cloud pixels were detected and masked by using a threshold calculated in advance. In this process, if only the minimum TOA reflectance was used to filter the clouds without using a threshold, the variation in reflectance due to the changes in surface characteristics and atmospheric components could not be considered. To prevent this problem, we used a threshold. To calculate the threshold, we need to fully separate cloud pixels from clear-sky observations in time series graphs. Therefore, we revised and used the cloth simulation filtering (CSF) method [44]. CSF was originally developed to find the shape of terrain, which is a digital surface model (DSM), by dropping a virtual cloth down on an inverted (upside-down) point cloud [44]. CSF consists of two steps: 'displacement by gravity' and 'internal force'. Of these steps, we only used the displacement by gravity.
where X t denotes displacement by gravity, Re f min represents the minimum TOA reflectance, and it is calculated by accumulating all imagery for seven days before the target date; m is the mass of the particle, which is usually set to 1, and G is the gravitational constant. We reverse the dataset by using (1 − Re f ) × 10 and use Equation (4). The red dots shown in Figure 5 are the results of the calculated displacement by gravity. As shown in Figure 5a,c, if clouds which were not masked by filtering by angular variation exist, the difference from the minimum TOA reflectance (highest point, blue dot) is higher when using displacement by gravity than when using only the observed reflectance. Therefore, displacement by gravity used in this study amplifies the difference between the minimum and target TOA reflectance, as shown in Figure 5. To calculate the threshold, we stacked the TOA reflectance dataset for 30 days in June 2020 and used the Otsu multilevel threshold method with three classes, as shown in Figure 6 [45,46].
where k denotes the clusters, ω i indicates the cumulative probability, µ k indicates the mean gray level for each cluster, µ T indicates the mean intensity of the whole image, and σ 2 B denotes the between-class variance. The optimal threshold can be determined by maximizing the between-class variance [46]. Among the thresholds, the closest to the cloud pixel value of 4.237, as shown in Figure 6, was used.

Dynamic Threshold
Because of atmospheric blocking and the occurrence of stationary fronts due to the expansion of the North Pacific high, pixels where clouds are constantly maintained exist in the image. As shown in Figure 7a, the pixels with remained clouds that are not masked when using the proposed filtering techniques mentioned above appear to have higher TOA reflectance than other pixels. Since these cloud pixels are less distributed among the whole image, we set the threshold value through the Otsu multilevel threshold method [45,46], which is used in Section 3.3. The number of classes was set to 4, which was determined empirically. Additionally, to improve the accuracy of the proposed algorithm, the NIR and NDVI images applied with cloud mask, which was result of the proposed filtering techniques, were also used in this study with the Otsu method, as shown in Figure 7. Among the thresholds, the closest to the cloud pixel value was used. When a dynamic threshold was applied to the channels, we divided the land and sea areas by using annual International Geosphere-Biosphere Programme (IGBP) classification data of MCD12Q1 [47], which is the MODIS Terra and Aqua combined land cover product. After reprojection of MCD12Q1 data to Lambert Conic Conformal (LCC), only the data over the water bodies were used.

NDVI
The NDVI is a frequently used index that expresses the status of plant health. In particular, the NDVI has been useful in analyzing temporal changes in vegetation [45]. The range of the NDVI is from −1 to 1. If the pixel value approaches 0, there is a small difference between the observed radiances of the red and NIR channels. A pixel value near zero indicates sand, clouds, and rocks [48][49][50]. Therefore, if the NDVI is applied to land to classify cloud pixels, it is difficult to be used because the TOA reflectance of cloud is similar to that of the desert [51]. For this reason, only the NDVI in the sea was used in this study.
where R 0.86 and R 0.64 indicate the TOA reflectance of 0.86 µm and 0.64 µm, respectively, which are calculated by Equation (1).

Near-Infrared
We used 1.38 µm and 1.6 µm in the dynamic threshold part. The 1.38 µm (from 1.36 µm to 1.39 µm) channel is located in strong atmospheric water absorption. It is observed by radiance mainly reflected by the troposphere because the upward radiance from the surface or the lower-middle atmosphere is blocked by water vapor in the upper atmosphere [52]. Therefore, 1.38 µm has been used for decades in MODIS to detect cirrus clouds even over polar regions where the atmosphere is extremely dry [13,23]. The threshold to detect thick cirrus clouds is set to 0.02 globally in MODIS and Landsat-8 cloud products [23]. In the case of the MODIS cloud product, separate thresholds are divided according to the existence of snow and confidence levels [12,23]. In this study, we set the initial global threshold to 0.02 and then adjust the threshold by applying the Otsu method because the use of the global threshold misses some thin cirrus clouds.
In the case of 1.6 µm, because this channel has low reflectance in snow and ice particles, it can be used to discriminate between snow/ice and water [25,53,54]. In this study, 1.6 µm, which is clipped on the sea, was used only to improve the classification accuracy between water in the sea and ice clouds.

Results
As the solar radiance reaching Earth gradually increases from June to August, the North Pacific high shifts to the north, and low pressure occurs in East Asia [55]. On the other hand, dry air from the north acts as a kind of 'invisible wall' that blocks the movement of moist air, and as a result, a large amount of water vapor in the atmosphere falls as rain [55]. Therefore, in general, precipitation increases in southeastern China and Japan in mid-June, increases in South Korea in late June, and then moves north to North Korea [55]. In July 2020, the extreme values of average temperature, maximum temperature, and the number of heatwave days were updated in the Rep. of Korea. At the end of June, since the North Pacific high was located in the southeastern Rep. of Korea, the low pressure that developed in southern China approached, the southwest wind strengthened, and a large amount of water vapor was introduced.
As the North Pacific high is maintained until the first half of September, it shows a form of an atmospheric pressure pattern in summer, causing a temporary high temperature phenomenon [56]. In October, a migratory anticyclone develops in inland of China and moves eastward, generating clear and dry air [56]. Therefore, cirrus, cirrostratus, and cumulus clouds are primarily created. In case of October 2020, the seven typhoons were occurred in East Asia due to continuance of La Niña [57]. This is the unusual natural phenomena compared with the occurrence of three typhoon in October of normal year (1981 to 2010) [57]. Therefore, we select June and October 2020 as the date to compare and validate the proposed algorithm because various cloud types can be found.
We conducted a qualitative comparison and quantitative validation with GK-2A and LEO cloud products. For the qualitative comparison, the true color image and natural color image, which were made by using the RGB recipe of NMSC, were used [52]. For the quantitative validation, we used the confusion matrix as shown in Figure 8, and the precision, recall, accuracy (hit rate), false positive rate (FPR), and F1score were calculated to compare the proposed algorithm with LEO products. The precision and FPR are also referred to as the probability of detection (POD) and false alarm rate (FAR), respectively [15,58].
where true positive (TP) means the number of pixels when validation data was cloud, and proposed algorithm was set to cloud. False positive (FP) means the number of pixels when validation data was clear observation, and proposed algorithm was set to cloud. False negative (FN) means the number of pixels when validation data was cloud, and proposed algorithm was set to clear observation. True negative (TN) means the number of pixels when validation data was clear observation, and proposed algorithm was set to clear observation, as shown in Figure 8. The precision indicates how accurate the masked cloud pixel from GK-2A is. In other words, it shows how many pixels that are actual clouds are included in the GK-2A cloud detection results. The recall indicates how well the cloud was found without dropping it. The accuracy is an indicator of how accurately the proposed algorithm detects clouds. The FPR is the probability of falsely rejecting the null hypothesis. The F1score calculated by the harmonic mean represents the difference between precision and recall. If the difference between the two values increases, the F1score increases, and if the difference between the two values decreases, the F1score decreases.

Qualitative Comparison
As a qualitative comparison of the proposed algorithm, the true color and natural color images, which are made by GK-2A images, are used. The true color RGB is designed close to the human eye and is made by using visible channels (0.47 µm, hybrid green, 0.64 µm) applied with histogram equalization. Hybrid green is calculated by using 0.51 and 0.86 µm [52]. The natural color RGB is designed to distinguish ice from the water phase and is made by using visible channels (0.64 µm, 0.86 µm) and a NIR channel (1.6 µm) applied with histogram equalization [52].
The only 3 dates of June 2020 were used as shown in Figure 9. The area colored grey in Figure 9c,i is removed by the restriction of the solar zenith angle. In the case of 1 June 2020 00 UTC, as shown in Figure 9a-c, the proposed algorithm can detect convective clouds occurring over the Shandong Peninsula (37.175 • N, 121.297 • E) of China and cirrus clouds over the Yellow Sea and East Sea of the Rep. of Korea, in addition to detecting the overall clouds. In the case of 08 June 2020 03 UTC, as shown in Figure 9d-f, the stationary front located at 20 • N is well detected, and the cirrus clouds located on land (40 • N, 90 • E) are also well detected. As shown in Figure 9g-i, most of the clouds, including cirrus, were well detected. However, the cloud detection accuracy is decreased in the area where sunglint occurs (Figure 9c,f). When sunlight is reflected off the sea at the same angle as the satellite sensor detects the surface, sunglint occurs. Therefore, the area where sunglint occurs varies over time. Since the proposed algorithm is based on multitemporal images, error occurs due to phenomena that change with time, such as sunglint. The only 3 dates of October 2020 were used as shown in Figure 10. In the case of 2 June 2020 01 UTC, as shown in Figure 10a

Validation with the VIIRS Cloud Product
The VIIRS cloud product was collected for 60 days from 1 to 30 in June and October 2020 to validate the proposed algorithm. The VIIRS cloud product distinguishes between confident and probably clouds according to the threshold test. The VIIRS cloud mask sets three thresholds between high confidence cloudy and high confidence clear and uses these thresholds to determine the confidence flag [13]. According to the validation score of the VIIIRS cloud mask, the overall hit rate (%) was 88.4% in July 2007 and 89% in October 2007 [59].
In this study, confident and probably clouds were classified and validated with the proposed algorithm and GK-2A cloud mask. The GK-2A cloud mask was originally developed by NMSC and is available via webpage [60]. The GK-2A cloud mask also provides clouds as confident and probably by using threshold tests similar to the VIIRS cloud detection algorithm. Therefore, when only confident clouds in VIIRS products were used, only confident clouds in GK-2A cloud masks were used, and the same was applied when using probably clouds.
The result of validation with the VIIRS cloud product is shown in Table 2, and when VIIRS with confident and probably clouds was used, the accuracy of both the multitemporal and spectral-based algorithms was higher. In the case of the spectral-based algorithm, the precision, recall, accuracy, and F1score significantly increased. However, when compared with the proposed multitemporal algorithm, the degree of agreement with the VIIRS cloud product is significantly different as difference of the accuracy is 0.15, and the F1score is 0.1 as mean value between June and October. In the case of the recall value in the comparison between the proposed algorithm and VIIRS with confident and probably clouds, as shown in Table 2, the result was close to 1. The VIIRS cloud data and the proposed GK-2A cloud detection algorithm are almost identical. In addition, since the F1score of the proposed algorithm represents 0.92 (June and October), the proposed algorithm has good performance in detecting clouds.

Validation with the CALIPSO Cloud Product
In this section, we compare the CALIPSO cloud product and proposed algorithm, as shown in Table 3. The CALIPSO cloud product was collected for 60 days from 1 to 30 June and October 2020 to validate the proposed algorithm. In the case of the GK-2A (NMSC) cloud mask, when the dataset used both confident and probably clouds, the accuracy was slightly higher in Section 4.2. Therefore, we used both confident and probably clouds in this section. As a result of validation with CALIPSO, the proposed algorithm showed a high accuracy of 0.92, which is mean value ( Table 3). The all of precision, recall, and F1score also showed high values of about 0.9 or higher (Table 3). Compared to GK-2A (NMSC) cloud detection, the proposed algorithm is superior based on the overall indexes. Therefore, these results indicated that the proposed algorithm detects most of the clouds with good performance. Additionally, we conducted an accuracy assessment according to cloud types, which is provided from 'Feature_Classification_Flags' of the CALIPSO cloud product. Eight cloud types (feature subtype) were classified in the CALIPSO cloud dataset. We evaluated whether the proposed algorithm detects each cloud type well, as shown in Figure 11. For validation by cloud types, we used only the recall because the pixels, which are located in clear-sky conditions or classified by aerosols, were set to not-a-number (NaN) values in the cloud product. As the results of validation by cloud types, recall values of 0.96 or higher are shown in Figure 11. Among the eight cloud types, low (broken cumulus) and altocumulus (transparent) had the lowest recall values of 0.96. Since the two types of clouds are smaller than the other clouds, the proposed algorithm classified a few pixels with these two types of clouds as clear sky. However, since the overall result had a very high score, there seems to be no loss of accuracy depending on cloud types.

Discussions
The proposed algorithm has four steps to detect clouds. Each step must be in harmony with the others to derive the results shown in Tables 2 and 3. The harmony of each step can be checked with a simple accuracy test for each step. We divided algorithms (1) with all steps, (2) that used only filtering techniques by using angular variation and minimum TOA reflectance, and (3) that used only the dynamic threshold for each GK-2A channel. The test dataset to check the accuracy of each step is set to 30 days (1 to 30 June 2020) of collocated images between GK-2A and VIIRS, as shown in Table 4. When only filtering techniques were used, the accuracy and F1score decreased by approximately 0.05 and 0.03, respectively. Additionally, when using only the dynamic threshold for GK-2A NIR and NDVI images, the accuracy was decreased by approximately 0.4. These results mean that all the steps are operating harmoniously. As mentioned above, if only filtering techniques were used, abnormal TOA reflectance was revealed due to the existence of clouds on stacking days. In addition, if images not to be applied with filtering methods were used in the dynamic threshold step, the threshold values in each channel used in this study were irregular. This is because the Otsu threshold method found the point of inflection in each image histogram. In other words, when an image is applied with filtering techniques, the rest of the cloud can be easily found because there is a gap between the surface and the remaining cloud. Regarding thresholds used in filtering techniques of proposed algorithm, it is effective even if the thresholds were applied to images of other seasons. In other words, the thresholds were calculated from images acquired for one month of June 2020. When the thresholds were applied in other images acquired in October 2020, the satisfactory results were obtained as shown in Tables 2 and 3. It can be shown as an advantage of proposed algorithm, because the problems that arise when applying this algorithm in other seasons can be flexibly captured through the fusion of filtering techniques and dynamic threshold.
Although the accuracy of the proposed algorithm of the GK-2A cloud mask was found to be reliable according to statistical values, this algorithm involves some issues due to sunglint and desert areas. In case of sunglint, Figure 9c,f show the areas misclassified as clouds, although they are under clear-sky conditions. As mentioned above, the problem was a change in sunglint area. When the sunglint area moved from east to west over the daytime, the reflectance on the sea fluctuated. Since it is caused by angular variation in specific pixels, error in the sunglint area occurs. In case of desert, some area was misclassified as clouds because of dust storm and the effect of solar zenith angle in desert, as shown in Figure 10f. Therefore, additional approaches of GK-2A multitemporal cloud detection should be considered.

Summary and Conclusions
We propose a cloud detection algorithm with GK-2A multitemporal and multispectral images during the daytime. The steps are to (1) find and mask the cloud pixels if TOA reflectance changes greatly as time changes; (2) find and mask the remaining cloud pixels by using the difference with minimum TOA reflectance; and (3) find and mask the remaining cloud pixels by additionally using NIR and NDVI images applied with a dynamic threshold.
All the steps in this study operate in harmony. We evaluated the reason why three steps are needed to detect clouds. In addition, we evaluated whether the proposed algorithm is very well suited to classify clouds, as validated with frequently used CALIPSO and VIIRS cloud products. The validation result indicates that the proposed cloud detection algorithm detects most of the clouds with good performance. Although the cloud detection algorithm obtains satisfactory results, the misclassification occurs in sunglint and desert areas. Because the proposed cloud detection algorithm is based on multitemporal data, the variance in TOA reflectance affected by sunglint and the remained effect of solar zenith angle in desert area cause misclassification. In the future, we will try to develop an additional algorithm to solve the misclassification.