All-Day Cloud Classification via a Random Forest Algorithm Based on Satellite Data from CloudSat and Himawari-8
Abstract
1. Introduction
2. Data and Methods
2.1. Satellite Data
2.1.1. CloudSat Satellite Data
2.1.2. Himawari-8/9 Satellite Data
2.1.3. FY-4A Satellite Data
2.2. Methods
2.2.1. The Determination of CloudSat Cloud Types
- (1) Samples with a total cloud thickness of less than 500 m (about 2 layers) were considered clear skies.
- (2) Samples with a single type of cloud among all their vertical layers were represented by that type of cloud. Stratus and stratocumulus clouds were regarded as low clouds; altocumulus, altostratus, cumulus and nimbostratus clouds were regarded as middle clouds; and cirrus clouds were classed as thick or thin cirrus clouds according to whether the cloud thickness was greater or less than 3000 m (about 15 layers), respectively.
- (3) Samples with more than one type of cloud among all their vertical layers were assigned a representative type via the flow chart in Figure 2; the multi-layer cloud type was considered on the basis of the single cloud types.
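Rules (1) and (2) above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function name `label_profile`, the type strings, the handling of a thickness of exactly 3000 m, and the mapping of a pure deep convection profile to cumulonimbus are all assumptions. Rule (3) depends on the flow chart in Figure 2, which is not reproduced here, so the sketch returns `None` for that case.

```python
# Hypothetical type names; the CloudSat product may spell them differently.
LOW_TYPES = {"stratus", "stratocumulus"}
MIDDLE_TYPES = {"altocumulus", "altostratus", "cumulus", "nimbostratus"}

def label_profile(layer_types, total_thickness_m):
    """Assign one label to a CloudSat profile from its cloudy layers.

    layer_types: cloud type name of each cloudy layer in the profile.
    total_thickness_m: total cloud thickness of the profile in metres.
    """
    # Rule (1): very thin or absent cloud counts as clear sky.
    if total_thickness_m < 500 or not layer_types:
        return "clear"
    kinds = set(layer_types)
    # Rule (2): a single cloud type represents the whole profile.
    if len(kinds) == 1:
        kind = next(iter(kinds))
        if kind in LOW_TYPES:
            return "low clouds"
        if kind in MIDDLE_TYPES:
            return "middle clouds"
        if kind == "cirrus":
            # Assumption: exactly 3000 m is treated as thin.
            return "thick cirrus" if total_thickness_m > 3000 else "thin cirrus"
        if kind == "deep convection":
            # Assumption: inferred from "cumulonimbus (deep convection)" in the text.
            return "cumulonimbus"
        raise ValueError(f"unknown cloud type: {kind}")
    # Rule (3): needs the flow chart in Figure 2, which is not available here.
    return None
```

For example, `label_profile(["cirrus"] * 4, 960)` gives `"thin cirrus"`, while a profile that mixes cirrus and stratus is left undecided.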
2.2.2. Matching Satellite Data and Selecting Himawari-8/9 Spectral Bands
2.2.3. Unit Feature Space Classification
2.2.4. Random Forest Algorithm
3. Results
3.1. Spectral Features of Various Cloud Types
3.1.1. Daytime Spectral Features
- Low clouds have high brightness temperature values, distributed in a range of normalized values between 70% and 100% in band 14 (Figure 5a); this is clearly different from thick cirrus and deep convection clouds, which have values below 50% and 40%, respectively (Figure 5b,c). This significant difference between cloud types is beneficial for sample training and classification.
- The reflectivity values of thick cirrus clouds have a distribution area that overlaps with the distribution area of deep convection clouds, especially in the range of normalized values between 60% and 80% in band 03 (Figure 5b,c). This confusion is due to the fact that cirrus clouds are high clouds whose cloud-bottom heights are usually above 6 km; in the case of a cirrus cloud thicker than 3 km, the cloud-top height may be close to 10 km. Therefore, satellites easily misjudge these thick cirrus clouds as developing cumulonimbus clouds.
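The separation and overlap described above can be made concrete by binning samples in a two-dimensional spectral space and comparing the resulting relative frequencies. The sketch below is a rough illustration only: the helper names `density_2d` and `overlap`, the 10% bin width, and the sample points are all made up for the example and are not taken from the study.

```python
from collections import Counter

def density_2d(points, bin_width=10):
    """Relative frequency per 2D bin for (x, y) pairs normalized to [0, 100]."""
    n_bins = 100 // bin_width
    counts = Counter()
    for x, y in points:
        # Clamp the upper edge (value 100) into the last bin.
        ix = min(int(x // bin_width), n_bins - 1)
        iy = min(int(y // bin_width), n_bins - 1)
        counts[(ix, iy)] += 1
    total = sum(counts.values())
    return {cell: c / total for cell, c in counts.items()}

def overlap(p, q):
    """Shared probability mass of two binned densities (0 = disjoint, 1 = identical)."""
    return sum(min(mass, q.get(cell, 0.0)) for cell, mass in p.items())

# Invented (band 03, band 14) samples, loosely following the ranges in the text.
low = density_2d([(45, 75), (55, 85), (50, 95), (65, 80)])
thick_cirrus = density_2d([(65, 35), (72, 28), (78, 45), (62, 33)])
deep_convection = density_2d([(68, 32), (75, 25), (85, 15), (90, 12)])

print(overlap(low, thick_cirrus))              # no shared bins
print(overlap(thick_cirrus, deep_convection))  # shared bins near band 03 of 60-80%
```

With these toy points, low clouds share no bins with thick cirrus, whereas thick cirrus and deep convection share half of their probability mass, which mirrors the confusion noted in the second point above.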
3.1.2. Nighttime Spectral Features
- For middle clouds, the probability density distribution in the spectral space built by band 14 and band 14–10 is more concentrated than in the other two kinds of spectral space (Figure 6a). For multi-layer clouds and deep convection clouds, the difference between the probability density distributions in the three kinds of feature space is not obvious (Figure 6b,c).
- Middle clouds have a wide range of normalized values between 20% and 80% in band 14–10 and band 14–11 (Figure 6a). Multi-layer clouds have values above 40% in all three spectral D-value bands (Figure 6b). Deep convection clouds have a range of values between 10% and 90% in band 14–10, wider than those in the other two kinds of spectral space (Figure 6c). Compared to the daytime distributions, the differences between various types of clouds are less significant, indicating that visible light band 03 provides an important contribution to the work of cloud classification.
- In all kinds of space, middle clouds and deep convection clouds have overlapping distribution areas (Figure 6a,c), a phenomenon that is more concentrated in band 14–10. This confusion is due to the fact that the differences in the infrared band D-values between these two cloud types are far less significant than the difference in visible band reflectivity; this is also one of the reasons why it is more difficult to perform cloud classification at night than during the day.
3.2. Results of the Cloud Classification
3.2.1. Daytime Results
- During the daytime, the average accuracy of the classifier over all cloud types is 88.4%, and the accuracy of the classifier for a clear sky is 100%, indicating that the classifier can credibly determine the presence or absence of clouds and that a clear sky is not misclassified as any cloud type. In addition, the accuracies of middle cloud and thin cirrus cloud classifications are above 95%, followed by cumulonimbus clouds at 87.5% and low clouds at 82.1%. The classifications of thick cirrus and multi-layer clouds have lower accuracies of 75% and 77.1%, respectively.
- Among the various types of cloudy weather, about 13.4% of low clouds are identified as middle clouds, and another 3.6% are identified as multi-layer clouds. For multi-layer clouds, 7.1% are distinguished as low clouds, and another 15.8% are identified as middle clouds. Because some multi-layer clouds are composed of low or middle clouds covered with cirrus clouds, if the upper cirrus cloud has a small thickness and a high level of radiation transmittance, geostationary satellites will receive more radiation from the middle and low clouds, resulting in classification confusion.
- This confusion also exists for thick cirrus and cumulonimbus clouds; about 20% of thick cirrus clouds are identified as cumulonimbus clouds, while 12.5% of cumulonimbus clouds are regarded as the former. Although the daytime classifier applies seven bands to distinguish cloud types, as mentioned in Section 3.1.1, thick cirrus clouds of certain thicknesses show similar spectral features to cumulonimbus clouds in both the visible light and infrared bands. This confusion usually causes interference for the training sample, which was input into a random forest algorithm model, leading to erroneous cloud classification results.
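As background for readers unfamiliar with the method, the following pure-Python sketch shows the basic mechanics of a random forest: bootstrap resampling, a random feature subset at each split, and a majority vote over trees. It is a toy illustration and not the authors' model; the hyperparameters (`n_trees`, `n_try`, `depth`), the two-feature layout (normalized band 03, normalized band 14) and the training points are all invented for the example.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best_score is None or score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, n_try, depth, rng):
    """Leaves are label strings; internal nodes are (feature, threshold, left, right)."""
    if depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    # Random feature subset considered at this split.
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_try))
    if split is None:
        return majority(labels)
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left], n_try, depth - 1, rng),
            build_tree([rows[i] for i in right], [labels[i] for i in right], n_try, depth - 1, rng))

def tree_predict(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=25, n_try=1, depth=4, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                 n_try, depth, rng))
    return forest

def forest_predict(forest, row):
    """Majority vote over all trees."""
    return majority([tree_predict(tree, row) for tree in forest])

# Invented, well-separated training points: (band 03 %, band 14 %).
rows, labels = [], []
for i in range(10):
    rows.append((5 + i, 90 + i)); labels.append("clear")
    rows.append((55 + i, 75 + i)); labels.append("low clouds")
    rows.append((85 + i, 10 + i)); labels.append("cumulonimbus")

forest = fit_forest(rows, labels)
print(forest_predict(forest, (55, 75)))
```

In practice a tested library implementation would be used instead; the point here is only that classes with similar spectral features, such as thick cirrus and cumulonimbus, end up in the same leaves and therefore in the same votes.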
3.2.2. Nighttime Results
- During the nighttime, the average accuracy of the classifier over all cloud types is 79.1%, which is 9.3 percentage points lower than the accuracy of the daytime classifier. Because visible light bands are unavailable at night, the nighttime classifier instead applies the D-values of the brightness temperature between infrared bands, although these cannot fully match the effect of the visible band. On the positive side, the accuracy of identifying a clear sky still reaches 100% at night, so the two classifiers can be combined to accurately distinguish between a clear sky and cloudy weather for an entire day.
- Among the various types of cloudy weather, about 14.7% of low clouds and 12.5% of middle clouds are identified as thick cirrus clouds, which differs from the confusion between low, middle and multi-layer clouds during the daytime. This is also because visible light bands are unavailable at night and because these three cloud types do not differ significantly in brightness temperature or in the D-values between infrared bands.
- Similar to the daytime classifier, the nighttime classifier also confuses thick cirrus and cumulonimbus clouds. About 18.3% of thick cirrus clouds are identified as cumulonimbus clouds, while 9.4% of cumulonimbus clouds are regarded as the former. Compared to the daytime situation, the misclassification rates between these two types are reduced. This is because, compared to cumulonimbus clouds, thick cirrus clouds have significantly different D-values between infrared bands, whereas this difference is insignificant in visible light bands. Thus, this particular confusion is reduced when band 03 is excluded.
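The headline figures quoted above can be checked directly against the daytime and nighttime confusion matrices given in the tables of this article. The short sketch below copies those matrices and takes the unweighted mean of the diagonal; that this is how the "average accuracy" was computed is an assumption, although it does reproduce the reported 88.4% and 79.1%. The names `mean_class_accuracy` and `confusion` are illustrative.

```python
CLASSES = ["Clear", "Low clouds", "Middle clouds", "Thin cirrus",
           "Thick cirrus", "Multi-layer clouds", "Cumulonimbus"]

# Rows: reference type; columns: classified type; values in percent.
DAY = [
    [100.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
    [0.00, 82.14, 13.39, 0.89, 0.00, 3.57, 0.00],
    [0.00, 0.21, 99.71, 0.00, 0.00, 0.08, 0.00],
    [0.00, 1.45, 0.00, 97.40, 1.16, 0.00, 0.00],
    [0.00, 0.00, 0.00, 1.25, 75.00, 3.75, 20.00],
    [0.00, 7.09, 15.82, 0.00, 0.00, 77.08, 0.00],
    [0.00, 0.00, 0.00, 0.00, 12.50, 0.00, 87.50],
]
NIGHT = [
    [100.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
    [0.00, 52.94, 0.00, 0.00, 14.71, 0.00, 32.35],
    [0.00, 0.00, 84.38, 3.13, 12.50, 0.00, 0.00],
    [0.00, 0.00, 0.00, 100.00, 0.00, 0.00, 0.00],
    [33.80, 0.00, 0.00, 0.00, 47.89, 0.00, 18.31],
    [0.00, 0.00, 0.00, 1.72, 0.00, 98.28, 0.00],
    [4.27, 8.55, 3.42, 4.27, 9.40, 0.00, 70.09],
]

def mean_class_accuracy(matrix):
    """Unweighted mean of the per-class accuracies (the matrix diagonal)."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)

def confusion(matrix, reference, classified):
    """Percentage of `reference` samples that were assigned to `classified`."""
    return matrix[CLASSES.index(reference)][CLASSES.index(classified)]

day = round(mean_class_accuracy(DAY), 1)
night = round(mean_class_accuracy(NIGHT), 1)
print(day, night, round(day - night, 1))
print(confusion(DAY, "Thick cirrus", "Cumulonimbus"),
      confusion(NIGHT, "Thick cirrus", "Cumulonimbus"))
```

Because the mean is unweighted, a class with few samples counts as much as a class with many; an accuracy weighted by sample count could differ from these values.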
3.3. Typhoon Case Study
- With the joint inversion of the daytime and nighttime cloud classifiers, the cloud system types and features of Typhoon Muifa were analyzed (Figure 7a). The center of the typhoon was a clear sky area, i.e., the eye of the typhoon. The center outward comprised the main body of the typhoon, which mainly consisted of cumulonimbus (deep convection) and multi-layer clouds. Several broad spiral cloud bands surrounded the main body, comprising cirrus and multi-layered clouds. As the intensity of the typhoon increased, convection developed and the ascending motion inside the typhoon intensified; therefore, the cumulonimbus area expanded and presented as a mesoscale convective system (MCS) on the images of the classification results. Because the FY-4A CLT product contains no cumulonimbus type, the method proposed in this paper can more effectively analyze the structure and evolution of a typhoon.
- The results adjusted to the CLT types (Figure 7b) and the CLT product (Figure 7c) were compared to examine the effects of the cloud classifiers proposed in this paper. The results show that the classification of ice-type (thick cirrus) and multi-layer clouds is effective since the locations, shapes and sizes of these two were similar. As for low, middle and thin cirrus clouds, the classifiers provided some misclassifications. This is not only because of errors from the sample sets or random forest model but also observation errors. Himawari-8/9 and FY-4A have different detectors and detection methods, making for different radiative detection values in spectral bands which affect the results obtained via both the classifier and the CLT product.
- A large number of water types (low clouds) were distinguished as clear sky; one reason for this is the satellite detection error described in the preceding point, and the other reason is that the ocean and land were not separated for classification in this paper. In visible light bands, the ocean has a darker hue than the land does; additionally, typhoons are usually generated in the summer, so the brightness temperature of the sea is lower than that of the land in infrared bands. These two differences mean that the radiative features of low clouds over the land and a clear sky over the sea are very similar to each other, which interferes with the training results of the classifiers.
4. Discussion
5. Conclusions
- Various cloud types have different probability density distributions in the majority of geostationary satellite bands. During the daytime, the most obvious difference appears in the two-dimensional spectral space composed of visible light band 03 and infrared band 14, while the most obvious difference at night appears in the spectral space comprising infrared band 14 and the D-value between band 14 and 11.
- The daytime classifier has an average accuracy of 88.4% over all cloud types, and the accuracies for a clear sky, middle clouds and thin cirrus clouds are above 95%; the other types have accuracies between 75% and 87.5%. Regarding low clouds, 17% are identified as middle or multi-layer clouds, and 22.9% of multi-layer clouds are identified as low or middle clouds. Between thick cirrus and cumulonimbus clouds, 20% and 12.5% are mistaken for one another, respectively. The nighttime classifier has an average accuracy of 79.1% over all cloud types, which is 9.3 percentage points lower than the daytime accuracy. The classification accuracies for a clear sky, thin cirrus clouds and multi-layer clouds are above 95%, while other types have accuracies between 47.9% and 84.4%. For low and middle clouds, 14.7% and 12.5% are identified as thick cirrus clouds, respectively. Between thick cirrus and cumulonimbus clouds, 18.3% and 9.4% are mistaken for one another, respectively. Compared to the daytime situation, this mutual misclassification is reduced.
- In the case of Typhoon Muifa, the center of the typhoon comprised a clear sky, the main body of the typhoon comprised cumulonimbus and multi-layer clouds and the spiral cloud bands that surrounded the main body consisted of cirrus and multi-layered clouds. The cumulonimbus area classified by the classifiers corresponded well with a mesoscale convective system (MCS). Compared to the FY-4A CLT product, the classifications of ice-type (thick cirrus) and multi-layer clouds are effective since the locations, shapes and sizes of these two were similar. As for low, middle and thin cirrus clouds, the classifiers provided some misclassifications. Additionally, the ocean and land were not separated for classification, leading to confusion between low clouds over the ocean and clear skies.
- This study on cloud classification using multiple satellite data can distinguish cloud types both during the day and at night and characterize continuous changes in cloud systems in weather processes. It shows good application value and is worth further study. The errors in this study partly come from the observation error caused by the different detection methods of multiple satellites, which affected the data quality. The other portion of errors was caused by the similarity of certain cloud types in spectral bands, which interfered with the random forest model. As a next step, we will attempt to reduce errors stemming from these two aspects while improving the random forest model for further study.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Band | Central Wavelength (μm) | Spatial Resolution (km) |
---|---|---|
01 | 0.47 | 1.0 |
02 | 0.51 | 1.0 |
03 | 0.64 | 0.5 |
04 | 0.86 | 1.0 |
05 | 1.6 | 2.0 |
06 | 2.3 | 2.0 |
07 | 3.9 | 2.0 |
08 | 6.2 | 2.0 |
09 | 6.9 | 2.0 |
10 | 7.3 | 2.0 |
11 | 8.6 | 2.0 |
12 | 9.6 | 2.0 |
13 | 10.4 | 2.0 |
14 | 11.2 | 2.0 |
15 | 12.3 | 2.0 |
16 | 13.3 | 2.0 |
Band | Central Wavelength (μm) | Spatial Resolution (km) | Standardization Scope |
---|---|---|---|
03 | 0.64 | 0.5 | [0, 100] (%) |
05 | 1.6 | 2.0 | [0, 100] (%) |
14 | 11.2 | 2.0 | [200, 300] (K) |
14–07 | 11.2–3.9 | 2.0 | [−100, 20] (K) |
14–10 | 11.2–7.3 | 2.0 | [−4, 46] (K) |
14–11 | 11.2–8.6 | 2.0 | [−16, 4] (K) |
14–15 | 11.2–12.3 | 2.0 | [−4, 16] (K) |
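
The "Standardization Scope" column above suggests that each band or band difference is rescaled to a common 0–100% range before use. A minimal sketch of such a rescaling is given below; the linear mapping, the clipping of out-of-range values and the name `normalize` are assumptions, since the exact procedure is not stated here. Dictionary keys use an ASCII hyphen in place of the en dash used in the table.

```python
# Standardization scopes copied from the table: (lower bound, upper bound).
# Bands 03 and 05 are reflectances in %, the others are in kelvin.
SCOPE = {
    "03": (0.0, 100.0),
    "05": (0.0, 100.0),
    "14": (200.0, 300.0),
    "14-07": (-100.0, 20.0),
    "14-10": (-4.0, 46.0),
    "14-11": (-16.0, 4.0),
    "14-15": (-4.0, 16.0),
}

def normalize(band, value):
    """Map a raw value linearly onto 0-100% of the band's scope.

    Assumption: values outside the scope are clipped to its bounds.
    """
    lower, upper = SCOPE[band]
    clipped = min(max(value, lower), upper)
    return 100.0 * (clipped - lower) / (upper - lower)

print(normalize("14", 250.0))     # brightness temperature of 250 K
print(normalize("14-10", 21.0))   # brightness temperature difference of 21 K
```

Under this reading, a normalized band 14 value of 70% corresponds to a brightness temperature of 270 K.
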
Classifier 1 (Daytime): CloudSat type (rows) vs. classified type (columns), % | Clear | Low Clouds | Middle Clouds | Thin Cirrus | Thick Cirrus | Multi-Layer Clouds | Cumulonimbus |
---|---|---|---|---|---|---|---|
Clear | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Low clouds | 0.00 | 82.14 | 13.39 | 0.89 | 0.00 | 3.57 | 0.00 |
Middle clouds | 0.00 | 0.21 | 99.71 | 0.00 | 0.00 | 0.08 | 0.00 |
Thin cirrus | 0.00 | 1.45 | 0.00 | 97.40 | 1.16 | 0.00 | 0.00 |
Thick cirrus | 0.00 | 0.00 | 0.00 | 1.25 | 75.00 | 3.75 | 20.00 |
Multi-layer clouds | 0.00 | 7.09 | 15.82 | 0.00 | 0.00 | 77.08 | 0.00 |
Cumulonimbus | 0.00 | 0.00 | 0.00 | 0.00 | 12.50 | 0.00 | 87.50 |
Classifier 2 (Nighttime): CloudSat type (rows) vs. classified type (columns), % | Clear | Low Clouds | Middle Clouds | Thin Cirrus | Thick Cirrus | Multi-Layer Clouds | Cumulonimbus |
---|---|---|---|---|---|---|---|
Clear | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Low clouds | 0.00 | 52.94 | 0.00 | 0.00 | 14.71 | 0.00 | 32.35 |
Middle clouds | 0.00 | 0.00 | 84.38 | 3.13 | 12.50 | 0.00 | 0.00 |
Thin cirrus | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 |
Thick cirrus | 33.80 | 0.00 | 0.00 | 0.00 | 47.89 | 0.00 | 18.31 |
Multi-layer clouds | 0.00 | 0.00 | 0.00 | 1.72 | 0.00 | 98.28 | 0.00 |
Cumulonimbus | 4.27 | 8.55 | 3.42 | 4.27 | 9.40 | 0.00 | 70.09 |
CLT Cloud Type | Cloud Classification Result |
---|---|
Clear | Clear |
Water, super cooled | Low clouds |
Mixed | Middle clouds |
Cirrus | Thin cirrus |
Ice | Thick cirrus, cumulonimbus |
Overlap | Multi-layer clouds |
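
The correspondence in the table above can be written as a lookup that decides whether a classifier label is compatible with an FY-4A CLT pixel. The sketch below is illustrative only: reading "Water, super cooled" as two separate CLT names, the exact spelling of the CLT names, and the helper names `agrees` and `agreement_rate` are assumptions.

```python
# CLT type -> classifier labels counted as equivalent (from the table above).
CLT_TO_CLASSES = {
    "Clear": {"clear"},
    "Water": {"low clouds"},
    "Super cooled": {"low clouds"},
    "Mixed": {"middle clouds"},
    "Cirrus": {"thin cirrus"},
    "Ice": {"thick cirrus", "cumulonimbus"},
    "Overlap": {"multi-layer clouds"},
}

def agrees(clt_type, predicted):
    """True if the classifier label falls in the CLT type's equivalent set."""
    return predicted in CLT_TO_CLASSES[clt_type]

def agreement_rate(pairs):
    """Fraction of (clt_type, predicted) pixel pairs that agree."""
    pairs = list(pairs)
    return sum(agrees(clt, pred) for clt, pred in pairs) / len(pairs)

# Invented pixel pairs for illustration.
sample = [("Ice", "cumulonimbus"), ("Overlap", "multi-layer clouds"),
          ("Water", "clear"), ("Cirrus", "thin cirrus")]
print(agreement_rate(sample))
```

Note that both thick cirrus and cumulonimbus collapse into the single CLT type "Ice", so a comparison at this level cannot show whether the two were confused with each other.
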
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, Y.; Hu, C.; Ding, Z.; Wang, Z.; Tang, X. All-Day Cloud Classification via a Random Forest Algorithm Based on Satellite Data from CloudSat and Himawari-8. Atmosphere 2023, 14, 1410. https://doi.org/10.3390/atmos14091410