Article

Cloud-Type Classification for Southeast China Based on Geostationary Orbit EO Datasets and the LightGBM Model

1
Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters, CMA Key Laboratory for Aerosol-Cloud-Precipitation, Nanjing University of Information Science & Technology, Nanjing 210044, China
2
School of Atmospheric Physics, Nanjing University of Information Science & Technology, Nanjing 210044, China
3
Department of Geography, Harokopio University of Athens, El. Venizelou 70, Kallithea, 17671 Athens, Greece
4
Guangxi Meteorological Observatory, Nanning 530022, China
5
Qinghai Provincial Meteorological Service Centre, Xining 810001, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(24), 5660; https://doi.org/10.3390/rs15245660
Submission received: 21 October 2023 / Revised: 1 December 2023 / Accepted: 4 December 2023 / Published: 7 December 2023
(This article belongs to the Section Atmospheric Remote Sensing)

Abstract

The study of clouds and their characteristics provides important information for understanding climate change and its impacts, as well as for weather analysis and forecasting. In this study, Earth observation (EO) data from the FY-4A AGRI and Himawari-8 CLP products were used to classify and identify distinct cloud types over southeastern China. To reduce the impact of parallax between the geostationary satellites, we adopted a sliding detection method for quality control of the cloud-type data. In addition, Bayesian optimization was employed to tune the hyperparameters of the LightGBM model. Our results demonstrate that Bayesian optimization significantly increased model performance, resulting in successful cloud-type classification and identification. The simultaneous use of visible and shortwave infrared channels and brightness temperature difference channels enhanced the model's classification performance; these channels accounted for 43.79% and 21.84% of the overall feature importance, respectively. The model in this study also outperformed the traditional thresholding method (TT), support vector machine (SVM), and random forest (RF): its prediction accuracy of 97.54% was higher than that of TT (51.06%), SVM (96.47%), and RF (97.49%). Additionally, the model's Kappa coefficient of 0.951 indicates that its classification results were highly consistent with the true values, again surpassing TT (0.351), SVM (0.929), and RF (0.950).

1. Introduction

Clouds play a significant role in the evolution of weather, climate, the global water cycle, and the global radiative budget, covering on average more than 60% of the Earth annually [1,2,3,4,5]. Different cloud types provide crucial cues for weather analysis and forecasting by reflecting various atmospheric states and circulation scenarios, and they give clear indications of current weather conditions and impending changes. For example, stratocumulus and nimbostratus are low clouds that clearly indicate stratiform precipitation, cumulus and cumulonimbus are effective indicators of convective precipitation, and cirrus clouds are generally not accompanied by precipitation [6]. In addition, cloud classification products are important for improving the accuracy of retrievals of meteorological parameters such as cloud base height and precipitation intensity [7,8,9].
The southeastern region of China lies in a pronounced monsoon zone. Abundant water vapor, together with the dynamic and thermal effects of the Tibetan Plateau, provides favorable conditions for the formation of clouds and precipitation [10,11,12]. The strength of the East Asian summer monsoon circulation strongly affects the weather and climate of the region: the seasonal rain belts, the beginning and end of the rainy season, and the structure of rain patterns are all closely related to the East Asian summer monsoon [13]. Droughts and floods triggered by changes in regional precipitation seriously affect the production, daily life, and economic activities of local people. Furthermore, strongly convective clouds may trigger severe convective weather such as thunderstorms and gales, short-term heavy precipitation, and hail [14], resulting in serious economic losses and even casualties. It is therefore of great significance to achieve accurate cloud-type classification over southeastern China.
In meteorological practice, clouds are classified into three families and ten genera based on macro-structural features, including cloud base height, morphology, and structure; this is considered the most effective way to distinguish cloud types manually [6]. However, manual identification is inherently subjective, which affects identification accuracy, and it cannot provide continuous observation over a wide area. Geostationary satellites can observe the same target area uniformly and continuously over long periods, making them an effective means of cloud observation. However, the traditional classification standard for ground-based cloud observations is not suitable for satellite data because of the different observation angles. The International Satellite Cloud Climatology Project (ISCCP) therefore proposed a set of cloud classification standards [15], which classifies clouds into nine categories according to cloud top pressure and optical thickness. The cloud classification product of the Himawari-8 geostationary satellite is based on the ISCCP standard.
Currently used cloud detection and classification algorithms fall into three categories: simple methods, statistical methods, and artificial intelligence methods [16]. Threshold-based classification is the most commonly used simple method [17,18,19]; it occupies few computational resources but has low accuracy. Statistical methods, such as K-means [20] and maximum-likelihood estimation [21], achieve higher accuracy than simple methods but require more computational power. With recent technological developments, more and more research focuses on the use of artificial intelligence for mapping clouds and their properties. Artificial intelligence uses computers and machines to mimic the way the human brain solves problems and makes decisions, and machine learning is a subfield of artificial intelligence. Classical machine learning methods train models to classify cloud types from manually selected feature parameters, such as brightness temperature and texture. Such methods, for example support vector machines [22] and random forests [21,23], have demonstrated that machine learning can be successfully applied to cloud detection and classification tasks.
The FY-4A geostationary satellite has no product based on the ISCCP cloud classification standard, but some researchers have already studied FY-4A using the cloud-type classification product of Himawari-8 [24,25]. Nevertheless, they mostly mapped the Himawari-8 data directly onto FY-4A data without considering the inter-satellite parallax effect, which may lead to incorrect matching of cloud-type labels. For supervised learning, mislabeled samples strongly degrade model accuracy. It is therefore imperative to perform quality control on the labeled data before constructing the dataset, so as to minimize matching errors and the impact of data quality on the model.
Among machine learning algorithms, LightGBM has become a research hotspot in recent years owing to its fast training speed, low resource consumption, strong generalization ability, and robustness to overfitting. LightGBM is a decision-tree-based gradient-boosting framework proposed by Microsoft in 2017 [26]. It is widely used in fields such as geomatics [27], Earth observation (EO) [28], and meteorology [29].
In view of the above, the present study proposes a cloud-type classification retrieval based on the LightGBM model. Several studies have developed AGRI cloud classification products, but most do not consider the effect of parallax; this is where the contribution of this study lies. A sliding detection method was used for quality control of the cloud-type data to reduce the effect of parallax on the model, Bayesian optimization was used to tune the model hyperparameters, and comparisons were made with other cloud classification methods to assess the accuracy of the retrieval model presented in this paper.

2. Data and Methods

2.1. Data

2.1.1. FY-4A Dataset

FY-4A is a Chinese new-generation geostationary meteorological satellite launched on 11 December 2016. The on-board Advanced Geostationary Radiation Imager (AGRI) has 14 channels and provides minute-level observations. Compared with the VISSR on the previous-generation Fengyun-2 (FY-2) series of geostationary meteorological satellites, AGRI is greatly improved in spatial and temporal resolution and in the number of observation channels [30]. The observation parameters of each channel are shown in Table 1.
The AGRI full-disk data from June to August 2022 were used in this paper, with a temporal resolution of 15 min and a spatial resolution of 4 km. The data were provided by the National Meteorological Satellite Centre (http://satellite.nsmc.org.cn/ (accessed on 13 March 2023)). Since the Himawari-8 CLP product is only available during daytime hours, we selected only the on-the-hour and half-hour observations within 00:00–09:00 UTC. The study area is the southeast coastal region of China, spanning 105°–123°E, 15°–35°N, as shown in Figure 1.

2.1.2. Himawari-8 Dataset

The Himawari-8 satellite, launched on 7 October 2014, is a new-generation geostationary satellite operated by the Japan Meteorological Agency (JMA). The Advanced Himawari Imager (AHI), one of its main payloads, has 16 bands with a spatial resolution of 0.5–2 km and a temporal resolution of 10 min, providing observational support for Japan, East Asia, and the entire western Pacific region. The Japan Aerospace Exploration Agency (JAXA) generates a wide range of geophysical products from Himawari-8 standard data, and the cloud-type data used in this paper were provided by JAXA's P-tree system (https://www.eorc.jaxa.jp/ptree/index.html (accessed on 23 March 2023)) for model parameter acquisition and accuracy validation. The CLP product consists of parameters such as cloud optical thickness, cloud effective particle radius, cloud top temperature, cloud top height, ISCCP-defined cloud type, and quality control information, at a resolution of 0.05°, and is in good agreement with the MODIS product [31]. According to the ISCCP standard, the cloud types comprise nine categories: cirrus (Ci), cirrostratus (Cs), deep convection (Dc), altocumulus (Ac), altostratus (As), nimbostratus (Ns), cumulus (Cu), stratocumulus (Sc), and stratus (St), as shown in Figure 2. The Himawari-8 cloud-type product is shown in Figure 3.

2.2. Data Pre-Processing

2.2.1. AGRI Data Preprocessing

To facilitate the calculation, the L1 data required pre-processing. First, the data were radiometrically calibrated: DN values were converted to albedo or cloud-top brightness temperature using the CAL channel calibration table. The FY-4A satellite uses the nominal geostationary projection defined by the CGMS LRIT/HRIT global specification, so latitude and longitude were computed on the WGS84 reference ellipsoid after radiometric calibration. The data were then interpolated onto a 0.05° × 0.05° equal latitude-longitude grid, consistent with the resolution of the CLP product. Because different illumination conditions at different times cause inconsistent brightness in the visible (VIS) imagery, the solar altitude angle was corrected so that pixel radiances are comparable across solar altitude angles. In addition, since the value ranges of the channels differ, the data of each channel were normalized to [0, 1] with min-max scaling to facilitate training. The calculation is as follows:
$x' = \frac{x - \min}{\max - \min}$
where $x'$ is the normalized value, $x$ is the original value of the channel, and $\min$ and $\max$ are the minimum and maximum values of that channel.
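As a generic illustration of the per-channel scaling (our own sketch, not the authors' processing code):

```python
def minmax_normalize(values):
    """Scale one channel's values into [0, 1] using min-max normalization."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant channel: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Example: brightness temperatures (K) for one channel
bt = [220.0, 260.0, 300.0]
print(minmax_normalize(bt))  # -> [0.0, 0.5, 1.0]
```

In practice the per-channel minimum and maximum would be fixed from the training set so that the test data are scaled identically.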

2.2.2. Cloud Type Label Matching

The parallax effect must be taken into account when observations from the two geostationary satellites, FY-4A and Himawari-8, are used jointly. Parallax is the deviation between the target position observed by the satellite and its true position; it is prevalent in non-vertical observations [33]. When the viewing angle is large, the cloud position measured by the satellite is not the true cloud position but is shifted away from the sub-satellite point, and this error is especially noticeable for samples far from the sub-satellite point. As shown in Figure 4, when AGRI and AHI both observe the same target position, point P, O1 is the observed cloud position for AGRI and O2 is the observed cloud position for AHI; the parallax between the two satellites is d. In this case, the radiance obtained at point P corresponds to the observation at O1, while the cloud-type label obtained at point P corresponds to the observation at O2. To reduce such matching errors, the CLP product required quality control: only samples inside the cloud body were considered, and samples at cloud edges were ignored, rejecting incorrectly matched cloud labels.
In Figure 4, α and θ are the zenith and azimuth angles of AGRI, β and γ are the zenith and azimuth angles of AHI, a is the parallax of AGRI, b is the parallax of AHI, d is the parallax between the two satellites, and h is the cloud height. From the latitude and longitude of O1 and O2 and the positions of the satellites' sub-satellite points, the zenith and azimuth angles of each point can be calculated. With the cloud height known, d can be calculated [33]:
$a = h \tan\alpha$
$b = h \tan\beta$
$\angle O_1 P O_2 = \gamma - \theta$
$d = h \sqrt{\tan^2\alpha + \tan^2\beta - 2 \tan\alpha \tan\beta \cos(\gamma - \theta)}$
It can be observed from the equation that inter-satellite parallax depends on the zenith angle, azimuth angle, and cloud top height. When FY-4A and Himawari-8 observe jointly, the parallax between the two satellites can be calculated by knowing the cloud top height, satellite zenith angle, and satellite azimuth angle. The cloud top heights in the southeast region of China are relatively low, mostly medium-low clouds, and the average cloud top heights are mostly between 7–10 km [30]. According to Equations (2)–(5), the actual parallax between the two satellites can be up to 12 km (about 3 pixels), and thus the 7 × 7-pixel matrix sliding detection method was used to cull the data. The step size was 1. When all cloud types within 7 × 7 pixels were consistent, it was considered that the cloud type label of the center pixel could be matched correctly; otherwise, the data of this center pixel point was rejected.
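As a minimal sketch (our own illustration, with angles in radians and cloud-type labels given as a 2-D list), the parallax formula and the sliding-detection test can be written as:

```python
import math

def inter_satellite_parallax(h, alpha, beta, gamma, theta):
    """Parallax d between the two satellites' apparent cloud positions,
    following Equations (2)-(5). h is the cloud height; alpha/beta are the
    satellite zenith angles and gamma/theta the azimuth angles (radians)."""
    ta, tb = math.tan(alpha), math.tan(beta)
    return h * math.sqrt(ta ** 2 + tb ** 2 - 2 * ta * tb * math.cos(gamma - theta))

def center_label_reliable(window):
    """7 x 7 sliding-detection test: the center pixel's cloud-type label is
    kept only if all labels in the window are identical."""
    first = window[0][0]
    return all(label == first for row in window for label in row)
```

When the two satellites view a cloud from the same azimuth, the expression reduces to d = h·|tan α − tan β|, and the parallax vanishes for identical viewing geometry, as expected.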
Ensuring temporal and spatial correspondence between the input feature variables and the target labels is necessary for model training and testing. Therefore, after quality control of the cloud-type data, we performed spatio-temporal matching between AGRI images and CLP products. During preprocessing, the AGRI data were reprojected onto a 0.05° × 0.05° equal latitude-longitude grid, consistent with the spatial resolution of the CLP product, so we selected data files with the same observation time and directly mapped the processed CLP product onto the AGRI image to complete the matching. Taking 1 July 2022 at 06:00 UTC as an example, Figure 5 shows the ISCCP cloud types and the corresponding AGRI cloud images after spatio-temporal matching. Finally, pairs with incomplete features or missing cloud-type labels were removed when constructing the dataset.
Ultimately, the AGRI and AHI dataset covering June–August 2022 was created, labeling over 9,900,000 pixels. We randomly selected 80% of the data in the dataset as the training set and the remaining 20% as the testing set. They were used for model training as well as validation, respectively.

2.3. LightGBM

LightGBM is a distributed boosting framework based on decision trees and an efficient implementation of the gradient boosting decision tree (GBDT) algorithm. GBDT is widely used for classification, regression, ranking, and other tasks owing to its efficiency, accuracy, interpretability, and strong generalization ability, and it is not prone to overfitting. It uses the negative gradient of the loss function as an approximation of the residual of the current ensemble, fits a new decision tree to this residual, and accumulates the outputs of all decision trees as the final prediction. The objective function can be expressed as:
$\mathrm{Obj}^{(t)} = \sum_{i=1}^{n} L\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \sum_{i=1}^{t} \Omega(f_i)$
$\Omega(f_i) = \gamma T + \frac{\lambda}{2} \sum_{j=1}^{T} w_j^2$
where $L$ is the loss function, which measures the difference between the true value $y_i$ and the predicted value, $\hat{y}_i^{(t-1)}$ is the prediction after the (t − 1)-th iteration, $\Omega(f_i)$ is a regularization term, $\gamma$ controls the number of leaf nodes, $T$ is the number of leaf nodes, $\lambda$ is the regularization coefficient, and $w_j$ is the weight of leaf node $j$. The goal is to find a tree $f_t$ that minimizes the objective function.
We used the Taylor formula to expand the objective function to the second order:
$\mathrm{Obj}^{(t)} = \sum_{i=1}^{n} \left[ L(y_i, \hat{y}_i^{(t-1)}) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \sum_{i=1}^{t} \Omega(f_i)$
$g_i = \frac{\partial L(y_i, \hat{y}_i^{(t-1)})}{\partial \hat{y}_i^{(t-1)}}$
$h_i = \frac{\partial^2 L(y_i, \hat{y}_i^{(t-1)})}{\partial \left(\hat{y}_i^{(t-1)}\right)^2}$
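To make the boosting recursion concrete, the following toy sketch implements the core idea for squared-error loss, where the negative gradient $g_i$ reduces to the residual $y_i - \hat{y}_i$ and $h_i = 1$. Depth-1 "stumps" stand in for the trees; this is a didactic illustration of gradient boosting, not LightGBM's histogram-based implementation:

```python
def fit_stump(x, residual):
    """Fit the best single-split regression stump on a 1-D feature."""
    best = None
    for split in sorted(set(x)):
        left = [r for xi, r in zip(x, residual) if xi <= split]
        right = [r for xi, r in zip(x, residual) if xi > split]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, split, lmean, rmean)
    _, split, lmean, rmean = best
    return lambda xi: lmean if xi <= split else rmean

def boost(x, y, rounds=20, lr=0.5):
    """Gradient boosting for squared loss: each tree fits the residual y - pred."""
    pred = [0.0] * len(y)
    trees = []
    for _ in range(rounds):
        residual = [yi - pi for yi, pi in zip(y, pred)]  # negative gradient
        tree = fit_stump(x, residual)
        trees.append(tree)
        pred = [pi + lr * tree(xi) for pi, xi in zip(pred, x)]
    return lambda xi: sum(lr * t(xi) for t in trees)

# Step-function toy data: the ensemble converges geometrically to the targets
model = boost([0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 1.0, 1.0])
```

Each round fits a new tree to the current residual and adds a damped version of its output, exactly the accumulation described above.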
Traditional GBDT incurs a large overhead in time and memory because it uses a pre-sorting algorithm to find split points. To address this, LightGBM introduces gradient-based one-side sampling (GOSS) and exclusive feature bundling (EFB), which optimize for large sample sizes and high feature dimensionality, respectively; experiments on several public datasets show that LightGBM trains more than 20 times faster than GBDT while achieving similar accuracy [26]. LightGBM also uses a leaf-wise growth strategy that, at each step, splits the leaf with the largest splitting gain among all current leaves. For the same number of splits, leaf-wise growth reduces error more than level-wise growth and achieves better accuracy; the schematic is shown in Figure 6. In this paper, we built the cloud classification model on the LightGBM algorithm, implemented in Python with the lightgbm package, version 3.3.5.

2.4. Bayesian-Optimization

Bayesian optimization is a powerful strategy for finding the extremes of an objective function [34,35]. When searching for parameters that can maximize the objective function globally, existing prior information is taken into account to better tune the current parameters. Compared with other hyperparameter optimization schemes, like grid search and random search, Bayesian optimization is faster and can effectively improve the efficiency of finding hyperparameters. In this paper, we optimized the parameters of the LightGBM algorithm under the Bayesian framework.
Bayesian optimization uses a Gaussian process to fit the objective function, referring to previous parameter evaluations and continuously updating the prior knowledge. The objective for hyperparameter optimization of the LightGBM model can be expressed as [36]:
$x^{*} = \arg\max_{x \in \chi} f(x)$
where $x$ denotes the hyperparameters of the model to be optimized, $\chi$ is the hyperparameter search space, $f(x)$ is the objective function relating the hyperparameters to model performance, and $x^{*}$ is the combination that maximizes it. Bayesian optimization aims to find this combination efficiently. The general iterative search procedure is as follows:
The loop iterates t times (t = 1, 2, 3, …):
  • Maximize the acquisition function $\alpha(x)$ to find the next evaluation point, $x_{t+1} = \arg\max_{x \in \chi} \alpha(x; D_{1:t})$.
  • Evaluate the model performance $f(x_{t+1})$ at the point $x_{t+1}$.
  • Add the new observation, $D_{1:t+1} = \{D_{1:t}, (x_{t+1}, f(x_{t+1}))\}$, and update the probabilistic surrogate model.
When the loop ends, the best hyperparameter combination found is returned.
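The iterative procedure above can be illustrated end-to-end on a one-dimensional toy objective. The sketch below uses a Gaussian-process surrogate with an RBF kernel and an upper-confidence-bound (UCB) acquisition function maximized over a candidate grid; the kernel, acquisition function, and all parameter values are our illustrative assumptions, not details from the paper:

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.3):
    """Squared-exponential covariance between two 1-D point sets."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale ** 2)

def gp_posterior(X, y, X_new, noise=1e-6):
    """Posterior mean and variance of the GP surrogate at the points X_new."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    K_s = rbf_kernel(X, X_new)
    K_inv = np.linalg.inv(K)
    mu = K_s.T @ K_inv @ y
    var = np.diag(rbf_kernel(X_new, X_new) - K_s.T @ K_inv @ K_s)
    return mu, np.maximum(var, 0.0)

def bayesian_optimize(f, bounds=(0.0, 1.0), n_init=3, n_iter=10, kappa=2.0):
    """Maximize f on an interval: fit the surrogate, maximize the UCB
    acquisition over a candidate grid, evaluate, and update the observations."""
    rng = np.random.default_rng(0)
    X = rng.uniform(bounds[0], bounds[1], n_init)
    y = np.array([f(x) for x in X])
    candidates = np.linspace(bounds[0], bounds[1], 200)
    for _ in range(n_iter):
        mu, var = gp_posterior(X, y, candidates)
        acq = mu + kappa * np.sqrt(var)      # UCB acquisition alpha(x)
        x_next = candidates[np.argmax(acq)]  # argmax of the acquisition
        X = np.append(X, x_next)             # update observations D
        y = np.append(y, f(x_next))
    best = int(np.argmax(y))
    return X[best], y[best]
```

With far fewer evaluations than a grid search, the loop concentrates samples where the surrogate predicts either a high mean or high uncertainty, which is the efficiency gain described above.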

2.5. Evaluation Indicators

Several previous studies have used the same metrics to evaluate the classification model, including Precision (P), Recall (R), F1 Score (F1), Accuracy (Acc), and Kappa Coefficient (Kappa) [37,38,39,40]. The formula for each metric is as follows:
$P = \frac{TP}{TP + FP}$
$R = \frac{TP}{TP + FN}$
$F1 = \frac{2 \times P \times R}{P + R}$
$\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN}$
$\mathrm{Kappa} = \frac{N \sum_{i=1}^{n} x_{ii} - \sum_{i=1}^{n} x_{i+} x_{+i}}{N^2 - \sum_{i=1}^{n} x_{i+} x_{+i}}$
TP is the number of pixels whose true category is class a and whose recognition result is also class a; TN is the number of pixels whose true category and recognition result are both other categories; FP is the number of pixels whose true category is another class but which were incorrectly identified as class a; and FN is the number of pixels whose true category is class a but which were incorrectly identified as another class. N is the total number of samples, $x_{ii}$ is the number of correctly recognized samples of class i, and $x_{i+}$ and $x_{+i}$ are the sums of the i-th row and i-th column of the confusion matrix, respectively.
Precision, recall, and F1 score as defined above apply to binary classification. For multi-class problems, the macro average can be used to assess classification performance across all classes: it is the arithmetic mean of the per-class metric, which clearly shows how well each category is recognized even when the classes are imbalanced.
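All of these metrics, including their macro averages, can be computed directly from a multi-class confusion matrix. A self-contained sketch (function and key names are our own):

```python
def classification_metrics(cm):
    """Macro precision/recall/F1, accuracy, and the Kappa coefficient from an
    n x n confusion matrix, cm[true_class][predicted_class]."""
    n = len(cm)
    total = sum(sum(row) for row in cm)          # N
    diag = sum(cm[i][i] for i in range(n))       # sum of x_ii
    ps, rs, f1s = [], [], []
    for i in range(n):
        col_sum = sum(cm[r][i] for r in range(n))  # predicted as class i (x_+i)
        row_sum = sum(cm[i])                       # truly class i (x_i+)
        p = cm[i][i] / col_sum if col_sum else 0.0
        r = cm[i][i] / row_sum if row_sum else 0.0
        ps.append(p)
        rs.append(r)
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)
    chance = sum(sum(cm[i]) * sum(cm[r][i] for r in range(n)) for i in range(n))
    return {"macro_P": sum(ps) / n, "macro_R": sum(rs) / n,
            "macro_F1": sum(f1s) / n, "Acc": diag / total,
            "Kappa": (total * diag - chance) / (total ** 2 - chance)}
```

For a perfectly diagonal confusion matrix every metric equals 1; the Kappa term subtracts the agreement expected by chance, which is why it is lower than accuracy for imbalanced classes.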

2.6. Experimental Setup

The flowchart in Figure 7 illustrates the main content of this paper. It is mainly divided into two parts: data preprocessing and model training and evaluation. The data preprocessing stage, including the preprocessing of AGRI data, the quality control of CLP products, and the dataset construction, will be described in detail in this section. The model construction and evaluation will be given in Section 3.
To find the optimal channel combination for cloud classification, we designed three plans to verify the effect of different channel combinations on the results; the specific plans are listed in Table 2. The input of Plan 1 was the brightness temperatures of channels 7–14, the input of Plan 2 covered channels 1–14, and Plan 3 added three brightness temperature differences (BTDs) commonly used in cloud analysis [19,39]: BTD (10–12), indicating the height of cloud development; BTD (11–12), differentiating water from ice clouds; and BTD (12–13), the split-window brightness temperature difference used to distinguish thin from thick clouds.
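A minimal sketch of how a Plan 3 style feature vector might be assembled from channel values (the channel numbering follows the description above, but the function and key names are illustrative assumptions, not the authors' code):

```python
def build_plan3_features(bt, vis_albedo):
    """Assemble a Plan-3 style feature vector: channel brightness temperatures
    plus the three BTDs. bt maps AGRI channel number -> brightness temperature
    (K); vis_albedo maps VIS/shortwave channel number -> albedo."""
    features = {f"ch{c}_bt": v for c, v in bt.items()}
    features.update({f"ch{c}_albedo": v for c, v in vis_albedo.items()})
    features["BTD_10_12"] = bt[10] - bt[12]  # height of cloud development
    features["BTD_11_12"] = bt[11] - bt[12]  # water cloud vs. ice cloud
    features["BTD_12_13"] = bt[12] - bt[13]  # split window: thin vs. thick cloud
    return features

# One pixel's (hypothetical) channel values
feats = build_plan3_features(
    {10: 250.0, 11: 248.0, 12: 245.0, 13: 243.5},
    {1: 0.35},
)
```

The BTDs cost nothing extra to compute at prediction time, since they are simple differences of channels already present in Plan 2.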

3. Results

3.1. Model Tuning

In the present study, Bayesian optimization was used to tune the hyperparameters of the model, and the area under the curve (AUC) was employed as the performance metric [41], which was insensitive to class skew. The optimizer used a 5-fold cross-validation scheme, and the optimal hyperparameter solution for the model was finally obtained by performing several iterations of the optimization process.
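AUC can be computed without tracing an ROC curve, via the rank-sum (Mann-Whitney) formulation: it equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. A binary-case sketch (the multi-class AUC used here would average over classes; that detail is our assumption):

```python
def auc(scores, labels):
    """AUC = P(score of a random positive > score of a random negative),
    computed from the rank-sum statistic. labels are 0/1."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Assign the average 1-based rank to each block of tied scores
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    pos = [r for r, l in zip(ranks, labels) if l == 1]
    n_pos, n_neg = len(pos), len(labels) - len(pos)
    return (sum(pos) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

Because it depends only on the ranking of scores, not on any decision threshold, AUC is insensitive to class skew, which is why it suits hyperparameter tuning on imbalanced cloud-type data.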
The hyperparameter optimization process corresponding to the three plans is shown in Figure 8, where the blue points indicate the results of each iteration, the orange points indicate the optimal results during the iteration process, and the labeled information is the corresponding optimal hyperparameter combination. From the figure, it can be observed that during the process of hyperparameter tuning, the AUC of each experimental group was improved to a certain extent. The optimum was reached in the 16th round of parameter tuning for Plan 1, which was 0.975, the highest value of cross-validated AUC for Plan 2 was 0.973, which was obtained in the 39th round of tuning, and, for Plan 3, the corresponding highest value was 0.994, which was reached in the 44th round of tuning.
We retrained the models using the optimal hyperparameter combinations from the three plans, obtaining Model #1, Model #2, and Model #3. To show more intuitively how hyperparameter optimization improved training, the training curves before and after optimization are compared in Figure 9. When the models were trained with the default hyperparameters, the log-loss curves oscillated sharply and the accuracy curves were unstable, with the three models jittering to different degrees after the 54th, 48th, and 43rd iterations, respectively. In contrast, the three models using the optimal hyperparameter combinations fitted well, with training-set Acc reaching 0.921, 0.973, and 0.975, respectively.

3.2. Comparison of Different Channel Combination Models

To quantitatively assess the classification results of different channel combinations, the outputs of the three models were compared pixel-by-pixel with the quality-controlled Himawari-8 CLP product and evaluated with the different metrics. The classification results of the three models are shown in Figure 10. On the test set, the Acc of Model #2 improved to 97.33% from the 92.10% of Model #1, but remained below the 97.54% of Model #3. Model #1 recognized Ac and Cu poorly, with precisions of 19.05% and 68.75% and recalls of only 2.31% and 0.91%. Model #2 greatly improved the recall for these two cloud types, to 31.21% and 61.10%, respectively, and despite some decrease in precision its F1 scores were still higher than those of Model #1, mainly owing to the advantage of the VIS channels in recognizing cloud thickness. For every cloud type, Model #3 had the highest precision, recall, and F1 score. The Kappa coefficients of the three models were 0.843, 0.947, and 0.951, indicating a high degree of agreement between the model classifications and the true values. Overall, Model #3 gave the best classification results and was selected as the final cloud classification model.

3.3. Model Evaluation

Comparing different combinations of input channels, we found that for most cloud types, Model #3 was the best model. However, different feature variables contributed differently to the cloud classification task, and the importance of different channels needed to be analyzed. Figure 11 shows the order of importance of the features of the model, where a larger value indicates that the feature was more important. The vertical coordinates are the different characterizing variables of the model, and the horizontal coordinates are the scores of the characterizing variables.
The feature importance analysis showed that the top three feature variables were the first-channel (0.47 µm) albedo, BTD (12–13), and BTD (11–12), indicating that these features differ most significantly among cloud types. The VIS and shortwave infrared features and the brightness temperature difference features accounted for about 43.79% and 21.84% of the overall importance, respectively. The ISCCP standard classifies clouds by cloud optical thickness and cloud top pressure, so the first-channel albedo, which reflects cloud thickness, and BTD (12–13), which characterizes cloud optical thickness, contributed most to the classification. Cloud phase also varies with height: low clouds are dominated by water clouds, middle clouds are mainly mixed ice-water clouds, and high clouds are ice clouds [32,42]. Therefore BTD (11–12), which characterizes cloud phase, was also important.
Figure 12 visualizes the confusion matrix of the model's recognition results on the testing set. The first column is relatively darker, indicating that the model tended to identify other cloud types, especially thin clouds (Ci, Ac, and Cu), as clear sky. Ci, Ac, and Cu are usually located at the edges of cloud masses and make up a smaller proportion of the training samples than the other types, producing class imbalance. The model could not learn enough from these under-represented categories and preferred clear sky when faced with similar spectral features. In addition, since edge clouds are more susceptible to parallax, label-matching errors may remain even after quality control, affecting classification accuracy.

3.4. Comparison of Model Accuracy

To further evaluate the cloud classification results of the algorithm in this paper, a comparison was made with the traditional thresholding method (TT) [18,19], support vector machine (SVM), and random forest (RF). Figure 13 shows the AGRI channel-13 brightness temperature images and the cloud-type classification results of the four algorithms. As shown in the figure, all four algorithms could detect clouds with high cloud tops, but the TT algorithm had difficulty distinguishing clear sky from low clouds because it lacks information from the VIS channels.
To accurately assess the classification results of the various algorithms, we performed a quantitative comparison on the testing set, shown in Table 3. TT had the lowest recognition accuracy. All machine learning models reached an accuracy above 90% and a Kappa coefficient above 0.8, indicating strong consistency between the classifications and the true values; overall, the three machine learning models performed well on the testing set. The LightGBM model scored highest on all evaluation metrics, reaching 84.10%, 78.24%, 79.74%, 97.54%, and 0.951 for macro precision, macro recall, macro F1, accuracy, and Kappa, respectively. Taking all indicators together, the LightGBM model performed best.

4. Discussion

In this study, a machine learning method was proposed to achieve cloud classification of FY-4A satellite images (following the ISCCP standard) by introducing Bayesian optimization. Performing quality control on the cloud-type data effectively improved data quality and reduced the effect of inter-satellite parallax on model accuracy. In addition, the optimal model for cloud-type classification was identified through a set of sensitivity tests on input-channel combinations. Compared with several established cloud classification methods, the model proposed in this paper achieved the best results. However, the model still has limitations in recognizing certain cloud types. Our work has the following implications for future research on geostationary satellite cloud-type classification:
  • By comparing the performance of the model before and after hyperparameter optimization for the three different input channel combinations, it was demonstrated that Bayesian optimization effectively improved the model performance.
  • In evaluating the performance of the different input-channel combination models on the testing set, we found that the advantage of the VIS and shortwave channels in sensing cloud thickness improved the models' classification to a certain extent, especially the recall of Ac and Cu, two cloud types with thinner clouds and lower cloud-top heights. However, the spectral similarity of clear sky to Ac and Cu caused the model to misclassify clear pixels as these two cloud types, which decreased precision.
  • In evaluating the optimal model, we found that the VIS and shortwave infrared channels and the brightness temperature difference channels accounted for 43.79% and 21.84% of the overall feature importance, respectively, and that the top three feature variables were the channel-1 (0.47 μm) albedo, BTD (12−13), and BTD (11−12). When classifying cloud types, the model tends to misidentify optically thinner clouds (Ci, Ac, and Cu) as clear. One reason is that Ci, Ac, and Cu clouds tend to be located at cloud edges and are represented by fewer training samples than other cloud types, leaving the model with too little training signal for these categories. Additionally, edge clouds are more likely to be affected by parallax, and, even after quality control, residual labeling errors may reduce the accuracy of the model's classifications.
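As a concrete illustration of the Plan-3 input described in Table 2 (the 14 AGRI channels plus the three brightness temperature difference features), the sketch below assembles per-pixel feature vectors ready for a gradient-boosting classifier. The dictionary keys chn01 to chn14 are our own naming convention for illustration, not an official AGRI product interface.

```python
import numpy as np

def build_plan3_features(channels):
    """channels: dict mapping 'chn01'..'chn14' to 2-D arrays on the same grid
    (albedo for channels 1-6, brightness temperature in K for 7-14).
    Returns an (n_pixels, 17) matrix: 14 channels + 3 BTD features."""
    base = [channels[f"chn{i:02d}"].ravel() for i in range(1, 15)]
    # Brightness temperature differences added as extra features (Table 2, Plan 3)
    btd_10_12 = channels["chn10"].ravel() - channels["chn12"].ravel()
    btd_11_12 = channels["chn11"].ravel() - channels["chn12"].ravel()
    btd_12_13 = channels["chn12"].ravel() - channels["chn13"].ravel()
    return np.column_stack(base + [btd_10_12, btd_11_12, btd_12_13])
```

Feeding such a matrix to a trained LightGBM model and reading its per-feature importances is how the 43.79%/21.84% shares and the top-three ranking above would be obtained; the BTDs are linear in the raw channels, yet tree ensembles still benefit from having them precomputed as explicit split variables.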

5. Conclusions

In this paper, we proposed a cloud classification model for FY-4A satellite observations of the southeast region of China. To address the parallax between geostationary satellites, we used a sliding detection method for quality control of the cloud-type label data, reducing the impact of label-matching errors on model accuracy. Because hyperparameter settings strongly affect the classification performance of a machine learning model, we introduced the Bayesian optimization method to obtain the optimal hyperparameter combination, and we found that it tuned the model hyperparameters effectively. Compared with the model using only the infrared-channel brightness temperatures, introducing albedo and brightness temperature differences improved the model's sensitivity to cloud thickness to some extent. Finally, we achieved a quantitative assessment of regional cloud-type classification. For validation, we compared the results with the quality-controlled Himawari-8 CLP product: the accuracy of the optimal model reached 97.54%, exceeding the 51.06% of TT, 96.47% of SVM, and 97.49% of RF, so the model achieved the best results. Combined with the feature importance analysis, this further revealed how the different input channels influence the cloud classification task in the region. The results of this study can provide useful information for meteorological operations in the region.
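The exact sliding detection procedure is not spelled out in this section; one plausible minimal form, assumed here purely for illustration, discards label pixels that disagree with the dominant cloud type in their local window, since parallax-induced AHI-to-AGRI mismatches concentrate at cloud edges. The function name, window size, and agreement threshold are our own assumptions.

```python
import numpy as np

def sliding_qc(labels, window=3, min_agree=0.8, fill=-1):
    """Flag pixels whose cloud-type label disagrees with the dominant label
    in a window x window neighbourhood, or whose neighbourhood is too mixed.
    Flagged pixels (typically cloud edges, where parallax mismatch is
    largest) are set to `fill` and excluded from training."""
    h, w = labels.shape
    r = window // 2
    out = labels.copy()
    for i in range(h):
        for j in range(w):
            block = labels[max(0, i - r):i + r + 1, max(0, j - r):j + r + 1]
            vals, counts = np.unique(block, return_counts=True)
            mixed = counts.max() / block.size < min_agree
            if mixed or vals[counts.argmax()] != labels[i, j]:
                out[i, j] = fill
    return out
```

On a homogeneous cloud mass this filter changes nothing, while an isolated label surrounded by a different class is removed, which is the qualitative behavior needed to suppress edge-pixel label-matching errors.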
However, some cloud types were not well recognized owing to factors such as the choice of model feature variables and the sample imbalance among cloud types. We plan to address these problems by further improving the dataset and optimizing the model in future work.

Author Contributions

Conceptualization, J.L., Y.B. and W.L.; methodology, J.L., Y.B., G.P.P. and F.P.; software, J.L.; validation, J.L.; formal analysis, J.L. and G.P.P.; data curation, J.L.; writing—original draft preparation, J.L.; writing—review and editing, Y.B., G.P.P., A.M., F.P. and W.L.; visualization, J.L. and A.M.; supervision, Y.B.; funding acquisition, Y.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research study was supported by the Natural Science Foundation of China (51827806), the Major Science and Technology Program of the Ministry of Water Resources of China (SKS-2022072), the Water Science and Technology Project of Jiangsu Province (2023022), the Research Funds of Jiangsu Hydraulic Research Institute (2023z034), and the Shanghai Aerospace Science and Technology Innovation Foundation (SAST2021-032).

Data Availability Statement

The FY-4A data presented in this study are available at http://satellite.nsmc.org.cn/ (accessed on 13 March 2023). The Himawari-8 data used in this paper are available at https://www.eorc.jaxa.jp/ptree/index.html (accessed on 23 March 2023).

Acknowledgments

The authors would like to thank the China National Satellite Meteorological Center and the Himawari-8 data website, which are freely accessible to the public. The cloud-type research product used in this paper was supplied by the P-Tree System, Japan Aerospace Exploration Agency (JAXA).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Fernández-Prieto, D.; Van Oevelen, P.; Su, Z.; Wagner, W. Editorial “Advances in Earth Observation for Water Cycle Science”. Hydrol. Earth Syst. Sci. 2012, 16, 543–549. [Google Scholar] [CrossRef]
  2. Baker, M.B.; Peter, T. Small-Scale Cloud Processes and Climate. Nature 2008, 451, 299–300. [Google Scholar] [CrossRef] [PubMed]
  3. Bengtsson, L. The Global Atmospheric Water Cycle. Environ. Res. Lett. 2010, 5, 025202. [Google Scholar] [CrossRef]
  4. Stephens, G.L. Cloud Feedbacks in the Climate System: A Critical Review. J. Clim. 2005, 18, 237–273. [Google Scholar] [CrossRef]
  5. Huang, J. Effects of Humidity, Aerosol, and Cloud on Subambient Radiative Cooling. Int. J. Heat Mass Transf. 2022, 186, 122438. [Google Scholar] [CrossRef]
  6. Li, Y.; Fang, L.; Kou, X. Principle and standard of auto-observation cloud classification for satellite, ground measurements and model. Chin. J. Geophys. 2014, 57, 2433–2441. [Google Scholar]
  7. Forsythe, J.M.; Vonder Haar, T.H.; Reinke, D.L. Cloud-Base Height Estimates Using a Combination of Meteorological Satellite Imagery and Surface Reports. J. Appl. Meteor. 2000, 39, 2336–2347. [Google Scholar] [CrossRef]
  8. Beusch, L.; Foresti, L.; Gabella, M.; Hamann, U. Satellite-Based Rainfall Retrieval: From Generalized Linear Models to Artificial Neural Networks. Remote Sens. 2018, 6, 939. [Google Scholar] [CrossRef]
  9. Ren, J.; Xu, G.; Zhang, W.; Leng, L.; Xiao, Y.; Wan, R.; Wang, J. Evaluation and Improvement of FY-4A AGRI Quantitative Precipitation Estimation for Summer Precipitation over Complex Topography of Western China. Remote Sens. 2021, 13, 4366. [Google Scholar] [CrossRef]
  10. Rucong, Y.; Yongqiang, Y.; Minghua, Z. Comparing Cloud Radiative Properties between the Eastern China and the Indian Monsoon Region. Adv. Atmos. Sci. 2001, 18, 1090–1102. [Google Scholar] [CrossRef]
  11. Duan, A.; Hu, D.; Hu, W.; Zhang, P. Precursor Effect of the Tibetan Plateau Heating Anomaly on the Seasonal March of the East Asian Summer Monsoon Precipitation. JGR Atmos. 2020, 125, e2020JD032948. [Google Scholar] [CrossRef]
  12. Chiang, J.C.H.; Kong, W.; Wu, C.H.; Battisti, D.S. Origins of East Asian Summer Monsoon Seasonality. J. Clim. 2020, 33, 7945–7965. [Google Scholar] [CrossRef]
  13. Yihui, D.; Chan, J.C.L. The East Asian Summer Monsoon: An Overview. Meteorol. Atmos. Phys. 2005, 89, 117–142. [Google Scholar] [CrossRef]
  14. Doswell, C.A. Severe Convective Storms—An Overview. In Meteorological Monographs; Springer: Berlin/Heidelberg, Germany, 2001; Volume 50, pp. 1–26. [Google Scholar] [CrossRef]
  15. Schiffer, R.A.; Rossow, W.B. The International Satellite Cloud Climatology Project (ISCCP): The First Project of the World Climate Research Programme. Bull. Am. Meteorol. Soc. 1983, 64, 779–784. [Google Scholar] [CrossRef]
  16. Tapakis, R.; Charalambides, A.G. Equipment and Methodologies for Cloud Detection and Classification: A Review. Sol. Energy 2013, 95, 392–430. [Google Scholar] [CrossRef]
  17. Inoue, T. A Cloud Type Classification with NOAA 7 Split-Window Measurements. J. Geophys. Res. 1987, 92, 3991. [Google Scholar] [CrossRef]
  18. Lutz, H.-J.; Inoue, T.; Schmetz, J. Comparison of a Split-Window and a Multi-Spectral Cloud Classification for MODIS Observations. J. Meteorol. Soc. Jpn. 2003, 81, 623–631. [Google Scholar] [CrossRef]
  19. Purbantoro, B.; Aminuddin, J.; Manago, N.; Toyoshima, K.; Lagrosas, N.; Sumantyo, J.T.S.; Kuze, H. Comparison of Cloud Type Classification with Split Window Algorithm Based on Different Infrared Band Combinations of Himawari-8 Satellite. ARS 2018, 7, 218–234. [Google Scholar] [CrossRef]
  20. Christodoulou, C.I.; Michaelides, S.C.; Pattichis, C.S. Multifeature Texture Analysis for the Classification of Clouds in Satellite Imagery. IEEE Trans. Geosci. Remote Sens. 2003, 41, 2662–2668. [Google Scholar] [CrossRef]
  21. Zhang, C.; Zhuge, X.; Yu, F. Development of a High Spatiotemporal Resolution Cloud-Type Classification Approach Using Himawari-8 and CloudSat. Int. J. Remote Sens. 2019, 40, 6464–6481. [Google Scholar] [CrossRef]
  22. Azimi-Sadjadi, M.R.; Zekavat, S.A. Cloud Classification Using Support Vector Machines. In Proceedings of the IGARSS 2000. IEEE 2000 International Geoscience and Remote Sensing Symposium. Taking the Pulse of the Planet: The Role of Remote Sensing in Managing the Environment. Proceedings (Cat. No.00CH37120), Honolulu, HI, USA, 24–28 July 2000; Volume 2, pp. 669–671. [Google Scholar]
  23. Liu, C.; Yang, S.; Di, D.; Yang, Y.; Zhou, C.; Hu, X.; Sohn, B.-J. A Machine Learning-Based Cloud Detection Algorithm for the Himawari-8 Spectral Image. Adv. Atmos. Sci. 2022, 39, 1994–2007. [Google Scholar] [CrossRef]
  24. Jiang, Y.; Cheng, W.; Gao, F.; Zhang, S.; Wang, S.; Liu, C.; Liu, J. A Cloud Classification Method Based on a Convolutional Neural Network for FY-4A Satellites. Remote Sens. 2022, 14, 2314. [Google Scholar] [CrossRef]
  25. Li, T.; Wu, D.; Wang, L.; Yu, X. Recognition Algorithm for Deep Convective Clouds Based on FY4A. Neural Comput. Applic 2022, 34, 21067–21088. [Google Scholar] [CrossRef]
  26. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Advances in Neural Information Processing Systems, Proceedings of the NIPS 2017, Long Beach, CA, USA, 4–9 December 2017; pp. 3149–3157. [CrossRef]
  27. Lin, N.; Zhang, D.; Feng, S.; Ding, K.; Tan, L.; Wang, B.; Chen, T.; Li, W.; Dai, X.; Pan, J.; et al. Rapid Landslide Extraction from High-Resolution Remote Sensing Images Using SHAP-OPT-XGBoost. Remote Sens. 2023, 15, 3901. [Google Scholar] [CrossRef]
  28. Dai, J.; Liu, T.; Zhao, Y.; Tian, S.; Ye, C.; Nie, Z. Remote Sensing Inversion of the Zabuye Salt Lake in Tibet, China Using LightGBM Algorithm. Front. Earth Sci. 2023, 10, 1022280. [Google Scholar] [CrossRef]
  29. Zhong, J.; Zhang, X.; Gui, K.; Wang, Y.; Che, H.; Shen, X.; Zhang, L.; Zhang, Y.; Sun, J.; Zhang, W. Robust Prediction of Hourly PM2.5 from Meteorological Data Using LightGBM. Natl. Sci. Rev. 2021, 8, nwaa307. [Google Scholar] [CrossRef]
  30. Zhang, P.; Zhu, L.; Tang, S.; Gao, L.; Chen, L.; Zheng, W.; Han, X.; Chen, J.; Shao, J. General Comparison of FY-4A/AGRI with Other GEO/LEO Instruments and Its Potential and Challenges in Non-Meteorological Applications. Front. Earth Sci. 2019, 6, 224. [Google Scholar] [CrossRef]
  31. Letu, H.; Nagao, T.M.; Nakajima, T.Y.; Riedi, J.; Ishimoto, H.; Baran, A.J.; Shang, H.; Sekiguchi, M.; Kikuchi, M. Ice Cloud Properties from Himawari-8/AHI Next-Generation Geostationary Satellite: Capability of the AHI to Monitor the DC Cloud Generation Process. IEEE Trans. Geosci. Remote Sens. 2019, 57, 3229–3239. [Google Scholar] [CrossRef]
  32. Rossow, W.B.; Schiffer, R.A. Advances in Understanding Clouds from ISCCP. Bull. Am. Meteorol. Soc. 1999, 80, 2261–2287. [Google Scholar] [CrossRef]
  33. Kim, M.; Kim, J.; Lim, H.; Lee, S.; Cho, Y.; Yeo, H.; Kim, S.-W. Exploring Geometrical Stereoscopic Aerosol Top Height Retrieval from Geostationary Satellite Imagery in East Asia. Atmos. Meas. Tech. 2023, 16, 2673–2690. [Google Scholar] [CrossRef]
  34. Brochu, E.; Cora, V.M.; de Freitas, N. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning. 2010. Available online: https://www.cs.ox.ac.uk/publications/publication7472-abstract.html (accessed on 19 September 2023).
  35. Brusca, S.; Famoso, F.; Lanzafame, R.; Messina, M.; Monforte, P. Placement Optimization of Biodiesel Production Plant by Means of Centroid Mathematical Method. Energy Procedia 2017, 126, 353–360. [Google Scholar] [CrossRef]
  36. Hao, X.; Zhang, Z.; Xu, Q.; Huang, G.; Wang, K. Prediction of F-CaO Content in Cement Clinker: A Novel Prediction Method Based on LightGBM and Bayesian Optimization. Chemom. Intell. Lab. Syst. 2022, 220, 104461. [Google Scholar] [CrossRef]
  37. Mohajerani, S.; Saeedi, P. Cloud-Net: An End-To-End Cloud Detection Algorithm for Landsat 8 Imagery. In Proceedings of the IGARSS 2019–2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 1029–1032. [Google Scholar]
  38. Jeppesen, J.H.; Jacobsen, R.H.; Inceoglu, F.; Toftegaard, T.S. A Cloud Detection Algorithm for Satellite Imagery Based on Deep Learning. Remote Sens. Environ. 2019, 229, 247–259. [Google Scholar] [CrossRef]
  39. Yu, Z.; Ma, S.; Han, D.; Li, G.; Gao, D.; Yan, W. A Cloud Classification Method Based on Random Forest for FY-4A. Int. J. Remote Sens. 2021, 42, 3353–3379. [Google Scholar] [CrossRef]
  40. Wang, B.; Zhou, M.; Cheng, W.; Chen, Y.; Sheng, Q.; Li, J.; Wang, L. An Efficient Cloud Classification Method Based on a Densely Connected Hybrid Convolutional Network for FY-4A. Remote Sens. 2023, 15, 2673. [Google Scholar] [CrossRef]
  41. Xie, L.; Zhang, R.; Zhan, J.; Li, S.; Shama, A.; Zhan, R.; Wang, T.; Lv, J.; Bao, X.; Wu, R. Wildfire Risk Assessment in Liangshan Prefecture, China Based on An Integration Machine Learning Algorithm. Remote Sens. 2022, 14, 4592. [Google Scholar] [CrossRef]
  42. Liu, J.; Li, Y. Cloud phase detection algorithm for geostationary satellite data. J. Infrared Millim. Waves 2012, 30, 322–327. [Google Scholar] [CrossRef]
Figure 1. Southeast China region, filled colors indicate altitude, black box shows the study area.
Figure 2. ISCCP cloud type definition [32].
Figure 3. The Himawari-8 satellite cloud-type product at 06:00 UTC on 1 July 2022.
Figure 4. Schematic diagram of joint AGRI and AHI observations [33].
Figure 5. An example of mapping AHI cloud types to AGRI images, 1 July 2022 6:00 UTC AGRI 13-channel bright temperature and ISCCP standard cloud types: (a) Cloud type data before quality control (b) Cloud type data after quality control.
Figure 6. Schematic of Level-wise (a) and Leaf-wise (b) growth strategies.
Figure 7. Flowchart of FY-4A AGRI Cloud Classification Model.
Figure 8. Hyperparameter optimization process (a) Plan 1 (b) Plan 2 (c) Plan 3. The blue points represent the results of each iteration and the orange points represent the best results of the iteration process.
Figure 9. Log loss curves before (a) and after (b) hyperparameter optimization and Acc curves before (c) and after (d) hyperparameter optimization during the model training process.
Figure 10. Classification results for all models (a) Precision (b) Recall (c) F1 score.
Figure 11. Ranking the importance of the input feature variables.
Figure 12. Confusion matrix for recognition results.
Figure 13. An Example of (a) AGRI 13-channel brightness temperature images of tropical cyclone “Chaba” and cloud type classification results from (b) LightGBM, (c) TT, (d) SVM, and (e) RF.
Table 1. FY-4A AGRI channel information.

Band | Spectral Coverage | Central Wavelength | Spectral Bandwidth | Main Applications
1 | VIS/NIR | 0.47 µm | 0.45~0.49 µm | Aerosol, visibility
2 | | 0.65 µm | 0.55~0.75 µm | Fog and cloud
3 | | 0.825 µm | 0.75~0.90 µm | Vegetation
4 | Shortwave IR | 1.375 µm | 1.36~1.39 µm | Cirrus cloud
5 | | 1.61 µm | 1.58~1.64 µm | Cloud and snow
6 | | 2.25 µm | 2.1~2.35 µm | Cirrus clouds and aerosol
7 | Midwave IR | 3.75 µm | 3.5~4.0 µm | Fire point
8 | | 3.75 µm | 3.5~4.0 µm | Earth’s surface
9 | Water vapor | 6.25 µm | 5.8~6.7 µm | Upper-level WV
10 | | 7.1 µm | 6.9~7.3 µm | Mid-level WV
11 | Longwave IR | 8.5 µm | 8.0~9.0 µm | Cloud motion wind and cloud
12 | | 10.7 µm | 10.3~11.3 µm | Sea surface temperature
13 | | 12.0 µm | 11.5~12.5 µm | Sea surface temperature
14 | | 13.5 µm | 13.2~13.8 µm | Cloud top height
Table 2. Different channel combination settings.

No. | Channel Combinations
1 | Chn07~Chn14
2 | Chn01~Chn14
3 | Chn01~Chn14 + BTD (10−12) + BTD (11−12) + BTD (12−13)
Table 3. Comparison of classification results of different algorithms.

Algorithm | Macro P | Macro R | Macro F1 | Acc | Kappa
LightGBM | 84.10% | 78.24% | 79.74% | 97.54% | 0.951
TT | 26.60% | 31.20% | 22.92% | 51.06% | 0.351
SVM | 70.00% | 56.98% | 59.63% | 96.47% | 0.929
RF | 73.11% | 73.03% | 72.97% | 97.49% | 0.950